• 1st group models:
A linear regression model of polyploidization and gene editing iterations
The first step of our experimental design is to induce genome duplication, converting diploid rice to tetraploid. Using our research group's existing data on the difference in protein content between diploid and tetraploid rice, we performed a linear regression analysis to predict the potential of this step.
Figure 1 Linear regression analysis of changes and trends in total protein, component proteins, and lysine content after polyploidization
The prediction results showed that total protein, glutelin and lysine increased significantly, gliadin increased significantly, and globulin did not increase significantly. The gains in total protein, lysine and gliadin are considerable, but the increase in glutelin runs contrary to our expectations. Although this step alone has flaws, the overall result is promising.
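As a minimal sketch of the kind of fit described above, the snippet below regresses a protein content measurement on ploidy level with ordinary least squares. The data values, the variable names, and the choice of ploidy (2 or 4) as the single predictor are illustrative assumptions, not our group's measurements or our exact model.

```python
# Ordinary least squares for y = intercept + slope * x, in pure Python.
# All numbers below are made-up placeholders, not experimental data.

def fit_line(xs, ys):
    """Return (slope, intercept, r_squared) of the least-squares line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared

# Hypothetical total protein content (% of dry weight) for diploid (2)
# and tetraploid (4) lines, three replicates each.
ploidy = [2, 2, 2, 4, 4, 4]
total_protein = [8.0, 8.2, 7.8, 10.1, 9.9, 10.0]

slope, intercept, r2 = fit_line(ploidy, total_protein)
print(f"slope = {slope:.3f} per chromosome set, "
      f"intercept = {intercept:.3f}, R^2 = {r2:.3f}")
```

A positive slope corresponds to an increase after polyploidization; whether that increase is significant would still need a proper test on the real replicates.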
However, polyploidization raises metabolite content non-selectively, so the accompanying increase in glutelin content remains to be solved. The second step of our experiment is to knock out the glutelin-coding gene OsGluB1 in tetraploid rice using CRISPR/Cas9 gene editing technology (see Wet-Lab-Design for details). We also performed a linear regression analysis on our research group's existing data on the protein content of GluB1 mutant rice to predict the potential of this step.
Figure 2 Linear regression analysis of changes and trends in total protein, component proteins, and lysine content after OsGluB1 gene mutation
Unfortunately, the model predicts that this experiment will lead to poor results. The increase in total protein and prolamine content was not significant, while the lysine content unexpectedly decreased. This reminds us to carefully analyze the failures of this experimental design and correct them (defect details can be found in Wet-Lab-Design).
Inspired by further literature review and suggestions from stakeholders (for details, see Human Practices - Integrated Human Practices - Market Research (offline & face to face)), we turned our attention to globulin. The experimental design of this step was modified to use CRISPR/Cas9 gene editing technology to knock out both the glutelin-coding gene OsGluB1 and the globulin-coding gene OsGlb in tetraploid rice (see Wet-Lab-Design for details). We also performed a linear regression analysis on our research group's existing protein content data for rice carrying the OsGluB1 and OsGlb dual mutation, and predicted the potential of the improved design.
Figure 3 Linear regression analysis of changes and trends in total protein, component proteins, and lysine content after dual mutation of OsGluB1 and OsGlb genes
Excitingly, the model prediction results of the improved experimental design are highly consistent with our goals. The content of total protein, prolamine, and lysine significantly increased, while the content of glutelin and globulin significantly decreased.
Therefore, we have determined the experimental design:
STEP1. Induce genome doubling of diploid rice to polyploidize it into tetraploid
STEP2. Knock out OsGluB1 and OsGlb using CRISPR/Cas9 gene editing technology
• 2nd group model:
Linear regression model for predicting and comparing the effects of different sequences of polyploidization and gene editing
We want rice with increased total protein, gliadin and lysine content, essentially unchanged glutelin content, and almost no globulin, so that nutrition, eating quality and safety all improve (defect details can be found in Project-Description). We preliminarily chose the technical route of polyploidization followed by gene editing, which aims to improve the nutritional quality of rice while reducing adverse effects on its agronomic traits.
However, when the project was established, some scholars questioned the difficulty of obtaining homozygous edits in tetraploids. They asked why we did not first perform gene editing in a diploid background, screen and identify the homozygotes, and only then perform polyploidization, which would greatly reduce the difficulty. We consulted our second PI, Dr. Lu Gan, who told us that rice, as a higher eukaryote, has a far more complex genome structure and gene regulation than microorganisms. We should not lightly give up the biomass advantage of the tetraploid chassis while we are still unsure whether performing polyploidization first or gene editing first gives the greater improvement in protein nutrition.
To improve breeding efficiency, we organized our research group's data on rice polyploidization and CRISPR/Cas9 gene editing for protein nutritional quality and performed a linear regression analysis on it. We predicted the effect of polyploidization followed by gene editing and of gene editing followed by polyploidization, compared which one better met our expectations, and evaluated whether it is worth using the sequence of gene editing followed by polyploidization to simplify the experiment.
Figure 4 Changes in total protein content (prediction)
Panels: Polyploidization + gene editing | Gene editing + polyploidization
As shown in the figure, the increase in total protein content of rice obtained by the sequence "polyploidization + gene editing" is greater than that obtained by "gene editing + polyploidization".
Figure 5 Changes in glutelin content (prediction)
Panels: Polyploidization + gene editing | Gene editing + polyploidization
As shown in the figure, the sequence "polyploidization + gene editing" restored the glutelin content to that of wild-type diploid rice (that is, traditional rice), while the sequence "gene editing + polyploidization" increased the glutelin content instead.
Figure 6 Changes in prolamine content (prediction)
Panels: Polyploidization + gene editing | Gene editing + polyploidization
As shown in the figure, the increase in prolamine content obtained by the sequence "polyploidization + gene editing" is greater than that obtained by "gene editing + polyploidization".
Figure 7 Changes in globulin content (prediction)
Panels: Polyploidization + gene editing | Gene editing + polyploidization
As shown in the figure, the sequence "polyploidization + gene editing" successfully removed globulin, while "gene editing + polyploidization" was ultimately ineffective in removing globulin.
Figure 8 Changes in lysine content (prediction)
Panels: Polyploidization + gene editing | Gene editing + polyploidization
As shown in the figure, the lysine content of rice obtained by the sequences "polyploidization + gene editing" and "gene editing + polyploidization" is the same.
To sum up, the sequence "polyploidization + gene editing" is better than "gene editing + polyploidization" in improving the nutritional quality of rice (considering the changes in total protein and lysine content together), its eating quality (considering the gliadin content), and its food safety (considering the changes in glutelin and globulin content together).
Therefore, it is not advisable to perform gene editing before polyploidization in order to simplify the experiment, tempting as the reduction in technical difficulty is.
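The comparison above can be read as a simple scoring exercise. The sketch below counts, for each route, how many components move in the direction we want. The target directions follow the goals stated at the start of this section; the per-route directions are our qualitative reading of Figures 4 to 8 (with lysine assumed to increase in both routes), and the equal weighting and all names are illustrative assumptions rather than part of the regression model.

```python
# Desired direction of change for each component (from the stated goals).
TARGET = {
    "total_protein": "up",
    "glutelin": "unchanged",
    "prolamine": "up",
    "globulin": "down",
    "lysine": "up",
}

# Qualitative readings of the predicted figures (assumed encoding).
ROUTES = {
    "polyploidization_then_editing": {
        "total_protein": "up", "glutelin": "unchanged", "prolamine": "up",
        "globulin": "down", "lysine": "up",
    },
    "editing_then_polyploidization": {
        "total_protein": "up", "glutelin": "up", "prolamine": "up",
        "globulin": "unchanged", "lysine": "up",
    },
}

def score(predicted, target):
    """Number of components whose predicted direction matches the target."""
    return sum(predicted[name] == wanted for name, wanted in target.items())

scores = {route: score(pred, TARGET) for route, pred in ROUTES.items()}
best = max(scores, key=scores.get)
for route, value in scores.items():
    print(f"{route}: {value}/{len(TARGET)} targets met")
print("preferred route:", best)
```

This ignores the size of each change (for example, the larger total protein gain of one route), so it is only a coarse summary of the figures.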
• 3rd group model:
Protein structure (primary structure+tertiary structure) before and after mutation of glutelin and globulin
When we identified the mutations in OsGluB1 and OsGlb, we found that CRISPR/Cas9 gene editing had caused frameshift mutations in both genes (see Wet-Lab-Result for details), which raised our concerns. A frameshift mutation alters the translation reading frame, so every codon after the mutation site is regrouped and the amino acid sequence and structure change significantly. Although globulin is hardly synthesized and the synthesis of glutelin is limited after mutation, we still have two tasks. On one hand, we must predict the structure and function of the mutant proteins to ensure that no toxic protein is produced; on the other hand, we need to predict whether the function of each mutant protein is better than that of the wild type. Therefore, we predicted the primary and tertiary structures of glutelin and globulin before and after mutation, and analyzed their properties and functions.
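To illustrate why a frameshift changes everything downstream of the edit, the sketch below translates a short toy coding sequence before and after a single-base insertion using the standard genetic code. The sequence and the insertion position are invented for illustration and are not taken from OsGluB1 or OsGlb.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def translate(dna):
    """Translate from position 0 until the first stop codon or the end."""
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        residue = CODON_TABLE[dna[pos:pos + 3]]
        if residue == "*":  # stop codon
            break
        protein.append(residue)
    return "".join(protein)

# Toy open reading frame (hypothetical, not a rice gene).
wild_type = "ATGGCTAAGCTTGGTGAATCTTAA"
# Insert one base after the second codon to shift the reading frame.
mutant = wild_type[:6] + "A" + wild_type[6:]

print("wild type:", translate(wild_type))
print("mutant   :", translate(mutant))
```

In this toy case the shifted frame changes the residues after the insertion and reaches a stop codon early; in other sequences a frameshift can instead remove the original stop codon and lengthen the protein.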
Primary structure prediction
The primary structure prediction was completed on the ProtParam platform. The wild-type gene information for glutelin and globulin was taken from the NCBI database.
Property | GlutelinB1 WT | GlutelinB1 Mutant-1 (50%) | GlutelinB1 Mutant-2 (50%) | Globulin WT | Globulin Mutant
---|---|---|---|---|---
Number of amino acids | 499 | 583 | 583 | 186 | 185
Molecular weight (Da) | 56550.58 | 67178.78 | 67136.70 | 21054.74 | 20346.4
Theoretical pI | 9.26 | 9.66 | 9.66 | 7.48 | 10.22
Total number of negatively charged residues | 39 | 38 | 38 | 21 | 20
Total number of positively charged residues | 50 | 70 | 70 | 22 | 29
Instability index | 52.11 | 58.90 | 58.42 | 69.24 | 72.62
Aliphatic index | 76.19 | 85.95 | 85.45 | 52.53 | 75.46
Grand average of hydropathicity | -0.495 | -0.226 | -0.234 | -0.629 | -0.317
Amino acid composition (charts): Glutelin-WT | Glutelin-Mutant1 | Glutelin-Mutant2 | Globulin-WT | Globulin-Mutant
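Several of the indices in the table above follow from simple formulas over the amino acid sequence. As a rough sketch, the snippet below computes the grand average of hydropathicity (mean Kyte-Doolittle hydropathy), the aliphatic index, and the charged-residue counts (Asp + Glu, Arg + Lys). The example peptide is hypothetical; the values in the table came from ProtParam on the full-length sequences, not from this code.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy per residue."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)

def aliphatic_index(seq):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)
    return (mole_percent("A") + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))

def charged_counts(seq):
    """Return (negatively charged, positively charged) residue counts."""
    negative = seq.count("D") + seq.count("E")
    positive = seq.count("R") + seq.count("K")
    return negative, positive

peptide = "MAKLGESRVID"  # hypothetical peptide, for illustration only
print(f"GRAVY = {gravy(peptide):.3f}")
print(f"aliphatic index = {aliphatic_index(peptide):.2f}")
print("(negative, positive) =", charged_counts(peptide))
```

A less negative GRAVY value, as seen for both mutants in the table, indicates a less hydrophilic peptide chain.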
Rice | GlutelinB1 - Mutation | Globulin - Mutation
---|---|---
Biological value (BV) | ↑ | ↓
Taste | Unable to determine | Unable to determine
Digestibility | Unable to determine | Unable to determine
Allergenicity | Unable to determine | Unable to determine
Changes in Nutritional Value
1. The biological value (BV) of rice is a nutritional-science indicator of its nutritional value; it reflects how efficiently the amino acids in rice are absorbed and used by the human body. WHO/FAO has proposed recommended amino acid ratios based on human nutritional needs to make biological value easier to measure: the closer a food is to the recommended ratio, the higher its biological value. When amino acids are more balanced, digestion and absorption rates, metabolic assimilation rates, and nutrient utilization rates are higher. Because globulin makes up only a small part of the total protein of rice endosperm, especially after mutation, the deviation of its amino acid ratio from the WHO recommendation will not significantly affect the biological value of rice.
2. After the mutation of glutelin and globulin, the primary structure (peptide chain) becomes less hydrophilic, as the higher grand average of hydropathicity shows. The interior of the active-center pocket of the corresponding digestive enzymes (trypsin and chymotrypsin) is hydrophobic, so this change is conducive to the binding of the mutant proteins to the enzymes. Trypsin cleaves after Lys and Arg, while chymotrypsin cleaves after Phe, Tyr, and Trp.
3. After mutation, glutelin B1 becomes larger (499 to 583 amino acids), while globulin changes little in length (186 to 185 amino acids). Pepsin, trypsin and chymotrypsin, the three key digestive enzymes for glutelin and globulin, have positive charges in the active-center pocket. The pI of glutelin B1 before and after mutation is greater than the pH of gastric juice and of small intestinal juice, so it carries a positive charge in the stomach and small intestine, especially in the stomach. The wild-type pI of globulin falls within the pH range of small intestinal fluid but is higher than the pH of gastric acid; after mutation, its pI is higher than the pH of both gastric acid and small intestinal fluid. The efficiency of digestive enzymes in breaking down these two proteins therefore decreases, which may be unfriendly to people with digestive system diseases, especially gastric disease. Because the rice protein with the highest lysine content and the most balanced amino acid ratio is neither of these two, the loss of digestive performance and nutritional value caused by their mutation can be compensated for by other proteins.
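The charge argument in point 3 rests on a simple rule: a protein carries a net positive charge when the surrounding pH is below its pI and a net negative charge when it is above. The sketch below applies that rule to the pI values from the ProtParam table. The two pH values (about 2 for gastric juice, about 7 for small intestinal fluid) and the 0.5-unit "near neutral" tolerance are representative assumptions, not measurements.

```python
# Theoretical pI values taken from the ProtParam table above.
PI_VALUES = {
    "GlutelinB1-WT": 9.26,
    "GlutelinB1-Mutant1": 9.66,
    "GlutelinB1-Mutant2": 9.66,
    "Globulin-WT": 7.48,
    "Globulin-Mutant": 10.22,
}

# Representative pH values (assumed for illustration).
ASSUMED_PH = {"stomach": 2.0, "small_intestine": 7.0}

def net_charge_sign(pi, ph, tolerance=0.5):
    """Sign of the net charge at a given pH, from the pI alone."""
    if abs(pi - ph) <= tolerance:
        return "near neutral"
    return "positive" if ph < pi else "negative"

for protein, pi in PI_VALUES.items():
    signs = {site: net_charge_sign(pi, ph) for site, ph in ASSUMED_PH.items()}
    print(f"{protein} (pI {pi}): {signs}")
```

Under these assumptions only wild-type globulin comes out near neutral in the small intestine, which matches the description above of its pI lying within the intestinal pH range.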
Taste changes
1. According to the primary structure prediction, both glutelin and globulin fold their peptide chains so that hydrophobic residues are hidden in the interior of the protein and hydrophilic residues are exposed on its surface. The tertiary structure prediction that follows will verify this. During cooking, they gradually denature from hydrophilic proteins into hydrophobic peptide chains.
2. The solubility and emulsifying properties of rice protein during cooking together shape the taste of rice. In the first few minutes of low-temperature soaking, the protein is strongly hydrophilic and fully absorbs water, going from a dry state to a wet state. In the middle heating stage, when the rice is nearly cooked, the protein structure loosens, hydrophobic residues are gradually exposed, protein emulsification begins, and protein and fat form a stable emulsion. In the late stage, as the remaining water is absorbed, the hydrophobic residues are almost completely exposed and the emulsification of rice protein peaks, forming a stable emulsion system. Adequate water absorption and emulsification by the protein are necessary to make rice taste softer and more delicate. From a structural perspective, the protein therefore needs a high content of hydrophobic residues in its primary structure, with those residues hidden inside the protein as far as possible in the tertiary structure, so that emulsification is sufficient without interfering with early water absorption. The primary structures of glutelin and globulin meet the first condition, but whether the tertiary structures meet the second needs further prediction through modeling.
Changes in food safety
The allergenicity of rice containing glutelin and globulin is related to the primary and tertiary structures of these proteins. Antigen epitopes include linear epitopes and conformation-dependent epitopes, and protein folding and denaturation can affect both. The change in allergenicity of glutelin and globulin can therefore be judged by comparing the epitopes in the primary and tertiary structures of the wild-type and mutant proteins. The expected result is a reduction in the number of antigen epitopes in the mutants, with the proportion of linear epitopes as high as possible and the proportion of conformation-dependent epitopes as low as possible. This would make allergy risks easier to prevent and control and improve food safety from the perspectives of detection and regulation.
Tertiary structure prediction
This round of modeling supplements the analysis of the primary structure predictions and answers the following question.
1. After mutation, is a larger proportion of the hydrophobic residues of glutelin B1 and globulin hidden inside the protein?
We used AlphaFold 3 to predict the protein tertiary structures and to score the confidence of the predictions.
Figure 9 Prediction of tertiary structure of glutelin B1 mutant (high confidence)
Figure 10 Prediction of tertiary structure of globulin mutant (high confidence)
We used the protein tertiary structure visualization software PyMOL to process the structures of the wild type and the two mutant types of glutelin B1. To make them easy to distinguish, we colored all hydrophilic residues pink and the hydrophobic residues Ala, Val, Leu, Ile and Phe in shades of cyan, and displayed the hydrophobic residues as sticks in surface display mode.
Figure 11 hydrophilic / hydrophobic changes of glutelinB1 before and after mutation
Figure 12 Changes in hydrophilicity/hydrophobicity before and after globulin mutation
As shown in the figures, hydrophobic residues in the mutants (Figures 11B, 11C and 12B) are clustered in the interior of the protein significantly more than in wild-type glutelin B1 and globulin (Figures 11A and 12A). This supports the speculation from the primary structure analysis that eating quality improves after the glutelin B1 mutation.
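The visual comparison above could also be expressed as a number: the fraction of hydrophobic residues that are buried. The sketch below assumes that a relative solvent accessibility (RSA) value between 0 and 1 is available for each residue from some structure analysis tool, and treats residues with RSA below 0.2 as buried. The cutoff, the input format, and the example values are all illustrative assumptions; we did not compute these quantities in our modeling.

```python
# Hydrophobic residues highlighted in our PyMOL figures.
HYDROPHOBIC = set("AVLIF")  # Ala, Val, Leu, Ile, Phe

def buried_hydrophobic_fraction(residues, cutoff=0.2):
    """Fraction of hydrophobic residues whose RSA is below the cutoff.

    residues: list of (one_letter_code, relative_solvent_accessibility).
    """
    hydrophobic_rsa = [rsa for aa, rsa in residues if aa in HYDROPHOBIC]
    if not hydrophobic_rsa:
        return 0.0
    buried = sum(1 for rsa in hydrophobic_rsa if rsa < cutoff)
    return buried / len(hydrophobic_rsa)

# Hypothetical per-residue RSA values for a short stretch of each structure.
wild_type = [("A", 0.05), ("K", 0.60), ("L", 0.35),
             ("V", 0.10), ("F", 0.40), ("E", 0.70)]
mutant = [("A", 0.05), ("K", 0.65), ("L", 0.10),
          ("V", 0.08), ("F", 0.15), ("E", 0.75)]

print("wild type:", buried_hydrophobic_fraction(wild_type))
print("mutant   :", buried_hydrophobic_fraction(mutant))
```

A higher fraction for a mutant than for the wild type would correspond to the clustering of hydrophobic residues in the protein interior described above.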
• 4th group of models:
Molecular docking of wild-type and mutant glutelin and globulin with human immunoglobulin E
This round of modeling supplements the analysis of the primary structure predictions and answers the following questions (see the table below).
1. How does the number of antigenic epitopes change after mutations in glutelin B1 and globulin?
2. Does the proportion of linear epitopes increase after mutations in glutelin B1 and globulin?
Figure 13 Molecular docking modeling results of wild-type and mutant glutelin B1 with the Fc region of human immunoglobulin E
Figure 14 Molecular docking modeling results of wild-type and mutant globulin with the Fc region of human immunoglobulin E
Metric | GlutelinB1-WT | GlutelinB1-M1 | GlutelinB1-M2 | Globulin-WT | Globulin-M
---|---|---|---|---|---
Number of ionic bonds | 9 | 7 | 7 | 7 | 0
Number of hydrogen bonds | 10 | 11 | 11 | 20 | 9
Number of recognition sites | 8 | 8 | 8 | 14 | 9
Total number of recognition site residues | 8 | 9 | 9 | 16 | 9
Proportion of recognition sites made up of adjacent residues | 25% | 62.5% | 62.5% | 25% | 0
Analysis | The binding of glutelin B1 to IgE decreased, but not significantly | | | The binding of globulin to IgE decreased significantly |
Speculation | The allergenicity of glutelin B1 decreases, though not markedly, while the proportion of linear epitopes increases, making it easier to prevent and control. | | | The allergenicity of globulin decreases significantly; the proportion of linear epitopes also decreases, but this is relatively insignificant. |
Table 2 Analysis of molecular docking results for glutelin B1 and globulin allergenicity
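The last metric in the table treats sequence adjacency of recognition-site residues as a proxy for linear epitopes. One possible way to compute such a proportion from a list of contact residues is sketched below. The definition used here (a residue counts as adjacent when its immediate sequence neighbor is also in the contact set) and the residue numbers are our illustrative assumptions; the values in Table 2 were read from the docking results and may follow a different counting rule.

```python
def adjacent_fraction(contact_positions):
    """Fraction of contact residues with a sequence neighbor (position +/- 1)
    that is also a contact residue."""
    positions = set(contact_positions)
    if not positions:
        return 0.0
    adjacent = [p for p in positions
                if (p - 1) in positions or (p + 1) in positions]
    return len(adjacent) / len(positions)

# Hypothetical residue numbers of an antigen in contact with IgE.
clustered_contacts = [12, 13, 40, 77, 101, 150, 151, 200]
scattered_contacts = [5, 9, 20]

print("clustered:", adjacent_fraction(clustered_contacts))
print("scattered:", adjacent_fraction(scattered_contacts))
```

A higher value suggests that more of the binding interface lies on contiguous stretches of sequence, which is the situation described as easier to detect and control.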
Because glutelin intolerance or allergy is less common than globulin intolerance or allergy, food safety with respect to glutelin can be improved by strengthening prevention and control. Cases of globulin intolerance and even allergy, however, are very common among children with high-protein food intolerance, so directly removing the allergen to reduce rice protein allergy is more direct and effective than strengthening prevention and control. In conclusion, we predict that mutations in the glutelin B1 and globulin genes can effectively improve the nutritional quality, food safety and eating quality of rice, but may be unfriendly to patients with digestive system diseases. Such patients should consider carefully, under a doctor's advice, whether our products are suitable for them.