Overview
Thaumatin induces sweetness primarily by binding to the human sweet taste receptor, a heterodimer composed of TAS1R2 and TAS1R3, which then triggers a signal that is perceived as sweetness. In essence, all compounds capable of eliciting a sweet taste in humans activate the TAS1R2/TAS1R3 heterodimer. However, unlike small sugar molecules, thaumatin binds to a distinct site on the receptor.
To facilitate research on the mechanism by which thaumatin induces sweetness, we aim to model the interaction between thaumatin and the human sweet taste receptor. Furthermore, we will employ directed evolution techniques to refine and enhance the properties of thaumatin.
PPI model
Protein-protein interactions (PPIs) refer to the process in which two or more proteins bind together, typically to perform their biochemical functions. In cells, numerous protein components form molecular machines that carry out most essential molecular processes, such as DNA replication, through protein interactions[1]. By constructing a PPI model, we can explore the interaction between thaumatin and the sweet taste receptor in silico and identify the residues involved in docking. This will facilitate subsequent mutations of relevant residues to improve the binding efficiency between thaumatin and the sweet taste receptor.
Currently, a wide range of models have been developed for PPI simulations, which can be broadly categorized into two approaches: sequence-based methods and structure-based methods. Therefore, before proceeding with PPI simulations, we must first prepare the necessary sequence and structural data.
Structure simulation
Sequence Data
The protein sequence data were obtained from public databases. The sources and IDs of the data for thaumatin and the sweet taste receptor used in this project are listed in the table below:
Protein Name | ID |
---|---|
Thaumatin II | PDB_ID: 3WOU_A |
TAS1R2 | UniProtKB/Swiss-Prot: Q8TE23.2 |
TAS1R3 | UniProtKB/Swiss-Prot: Q7RTX0 |
The sequences follow in table order (Thaumatin II, TAS1R2, TAS1R3):
ATFEIVNRCSYTVWAAASKGDAALDAGGRQLNSGESWTINVEPGTKGGKIWARTDCYFDDSGRGICRTGDCGGLLQCKRFGRPPTTLAEFSLNQYGKDYIDISNIKGFNVPMDFSPTTRGCRGVRCAADIVGQCPAKLKAPGGGCNDACTVFQTSEYCCTTGKCGPTEYSRFFKRLCPDAFSYVLDKPTTVTCPGSSNYRVTFCPTA
MGPRAKTISSLFFLLWVLAEPAENSDFYLPGDYLLGGLFSLHANMKGIVHLNFLQVPMCKEYEVKVIGYNLMQAMRFAVEEINNDSSLLPGVLLGYEIVDVCYISNNVQPVLYFLAHEDNLLPIQEDYSNYISRVVAVIGPDNSESVMTVANFLSLFLLPQITYSAISDELRDKVRFPALLRTTPSADHHIEAMVQLMLHFRWNWIIVLVSSDTYGRDNGQLLGERVARRDICIAFQETLPTLQPNQNMTSEERQRLVTIVDKLQQSTARVVVVFSPDLTLYHFFNEVLRQNFTGAVWIASESWAIDPVLHNLTELRHLGTFLGITIQSVPIPGFSEFREWGPQAGPPPLSRTSQSYTCNQECDNCLNATLSFNTILRLSGERVVYSVYSAVYAVAHALHSLLGCDKSTCTKRVVYPWQLLEEIWKVNFTLLDHQIFFDPQGDVALHLEIVQWQWDRSQNPFQSVASYYPLQRQLKNIQDISWHTINNTIPMSMCSKRCQSGQKKKPVGIHVCCFECIDCLPGTFLNHTEDEYECQACPNNEWSYQSETSCFKRQLVFLEWHEAPTIAVALLAALGFLSTLAILVIFWRHFQTPIVRSAGGPMCFLMLTLLLVAYMVVPVYVGPPKVSTCLCRQALFPLCFTICISCIAVRSFQIVCAFKMASRFPRAYSYWVRYQGPYVSMAFITVLKMVIVVIGMLATGLSPTTRTDPDDPKITIVSCNPNYRNSLLFNTSLDLLLSVVGFSFAYMGKELPTNYNEAKFITLSMTFYFTSSVSLCTFMSAYSGVLVTIVDLLVTVLNLLAISLGYFGPKCYMILFYPERNTPAYFNSMIQGYTMRRD
MLGPAVLGLSLWALLHPGTGAPLCLSQQLRMKGDYVLGGLFPLGEAEEAGLRSRTRPSSPVCTRFSSNGLLWALAMKMAVEEINNKSDLLPGLRLGYDLFDTCSEPVVAMKPSLMFLAKAGSRDIAAYCNYTQYQPRVLAVIGPHSSELAMVTGKFFSFFLMPQVSYGASMELLSARETFPSFFRTVPSDRVQLTAAAELLQEFGWNWVAALGSDDEYGRQGLSIFSALAAARGICIAHEGLVPLPRADDSRLGKVQDVLHQVNQSSVQVVLLFASVHAAHALFNYSISSRLSPKVWVASEAWLTSDLVMGLPGMAQMGTVLGFLQRGAQLHEFPQYVKTHLALATDPAFCSALGEREQGLEEDVVGQRCPQCDCITLQNVSAGLNHHQTFSVYAAVYSVAQALHNTLQCNASGCPAQDPVKPWQLLENMYNLTFHVGGLPLRFDSSGNVDMEYDLKLWVWQGSVPRLHDVGRFNGSLRTERLKIRWHTSDNQKPVSRCSRQCQEGQVRRVKGFHSCCYDCVDCEAGSYRQNPDDIACTFCGQDEWSPERSTRCFRRRSRFLAWGEPAVLLLLLLLSLALGLVLAALGLFVHHRDSPLVQASGGPLACFGLVCLGLVCLSVLLFPGQPSPARCLAQQPLSHLPLTGCLSTLFLQAAEIFVESELPLSWADRLSGCLRGPWAWLVVLLAMLVEVALCTWYLVAFPPEVVTDWHMLPTEALVHCRTRSWVSFGLAHATNATLAFLCFLGTFLVRSQPGCYNRARGLTFAMLAYFITWVSFVPLLANVQVVLRPAVQMGALLLCVLGILAAFHLPRCYLLMRQPGLNTPEFFLGGGPGDAQGQNDGNTGNQGKHE
Structure Data
The structural data refers to the 3D structure of the protein. The thaumatin structural data we used corresponds to the aforementioned sequence data. The resulting structure is shown in the figure below.
However, not all proteins have experimentally determined 3D structures available in databases. For example, by querying the database, we found that neither T1R2 (TAS1R2) nor T1R3 (TAS1R3) has an accurate, experimentally determined 3D structure. Therefore, to proceed with PPI simulations, we need to employ structure prediction methods. By providing the protein sequence data, we can use software to predict the structures of T1R2 and T1R3.
After multiple comparisons, we ultimately chose AlphaFold[2] for protein structure prediction. AlphaFold, developed by Google DeepMind, is a deep learning tool designed to predict the 3D structure of proteins from their sequences. In benchmark comparisons, AlphaFold has achieved markedly higher accuracy than earlier prediction programs. Notably, AlphaFold3, released in May 2024, offers substantial improvements in precision over AlphaFold2.
To obtain the most accurate PPI results, it is crucial to provide precise protein structure data. Thus, we chose AlphaFold3 for structure prediction, selected the top-ranked model, \( \text{model}_0 \), and visualized the results using PyMOL. The predicted structure is shown in the figure below.
We observed that certain regions in the predicted structure appear as extended linear segments. These segments indicate areas where the model has low confidence (pLDDT values < 50). Structures predicted by AlphaFold typically display this ribbon-like appearance in such regions.
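As a minimal sketch of how such low-confidence regions could be listed, the snippet below assumes the prediction has been saved in PDB format, where AlphaFold stores the per-residue pLDDT in the B-factor column, and reports residues whose C-alpha pLDDT falls below 50. The three example residues, their values, and the function names are made up for illustration and are not taken from our prediction output.

```python
def make_ca_line(serial, res_name, res_seq, plddt):
    """Build a fixed-column PDB ATOM record for a C-alpha atom (illustrative helper)."""
    return (f"ATOM  {serial:5d}  CA  {res_name:3s} A{res_seq:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}")


def low_confidence_residues(pdb_text, cutoff=50.0):
    """Return (residue name, residue number, pLDDT) for C-alpha atoms below the cutoff."""
    hits = []
    for line in pdb_text.splitlines():
        # Atom name occupies columns 13-16 of an ATOM record.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            plddt = float(line[60:66])  # B-factor columns 61-66 hold pLDDT
            if plddt < cutoff:
                hits.append((line[17:20].strip(), int(line[22:26]), plddt))
    return hits


# Hypothetical three-residue example, not real prediction data.
example = "\n".join([
    make_ca_line(1, "MET", 1, 32.5),
    make_ca_line(2, "GLY", 2, 88.1),
    make_ca_line(3, "PRO", 3, 47.9),
])
print(low_confidence_residues(example))
```

Only the C-alpha atom is inspected per residue, which is a simplification; averaging over all atoms of a residue would be an equally reasonable choice.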
PPI Prediction
Now that we have the required sequence and structural data, the next step is to perform the PPI analysis. For this, we used two models: AlphaFold and HDOCK[3], and compared their results.
AlphaFold uses sequence data for PPI prediction, while HDOCK uses structural data. First, let's examine the results predicted by AlphaFold:
thaumatin | T1R2 | Distance (Å) | Interface Area (Å²) | \( \triangle_{i} \text{G} \) (kcal/mol) |
---|---|---|---|---|
ASP-21 | LYS-497 | 3.3 | 584.3 | -0.7 |
ARG-79 | CYS-517 | 3.2 | | |
TYR-99 | ASN-292 | 3.0 | | |
LYS-163 | GLN-237 | 3.7 | | |
GLY-162 | LYS-263 | 2.1 | | |
PRO-188 | ASP-262 | 2.9 | | |

thaumatin | T1R3 | Distance (Å) | Interface Area (Å²) | \( \triangle_{i} \text{G} \) (kcal/mol) |
---|---|---|---|---|
ARG-67 | ASP-544 | 3.2 | 442.1 | 0.1 |
LYS-139 | GLN-531 | 3.0 | | |
PRO-141 | GLN-531 | 3.0 | | |
Interface area in Å², calculated as the difference in total accessible surface areas of the isolated and interfacing structures divided by two.
\( \triangle_{i} \text{G} \) indicates the solvation free energy gain upon formation of the interface, in kcal/mol. The value is calculated as the difference in total solvation energies of the isolated and interfacing structures. A negative \( \triangle_{i} \text{G} \) corresponds to hydrophobic interfaces, or positive protein affinity. This value does not include the effect of satisfied hydrogen bonds and salt bridges across the interface.
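Written out, the two definitions above take roughly the following form. The symbols \( \text{ASA} \) (accessible surface area) and \( G^{\text{solv}} \) (solvation energy), and the labels \( A \), \( B \), and \( AB \) for the two isolated proteins and their complex, are notation introduced here for illustration. The sign of the second expression is chosen so that a favorable, hydrophobic interface gives a negative value, as stated above.

\[ \text{Interface area} = \frac{\left( \text{ASA}_{A} + \text{ASA}_{B} \right) - \text{ASA}_{AB}}{2}, \qquad \triangle_{i} \text{G} = G^{\text{solv}}_{AB} - \left( G^{\text{solv}}_{A} + G^{\text{solv}}_{B} \right) \]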
The HDOCK results are as follows:
thaumatin | T1R2 | Distance (Å) | Interface Area (Å²) | \( \triangle_{i} \text{G} \) (kcal/mol) |
---|---|---|---|---|
ARG-79 | GLU-516 | 3.4 | 731.5 | -0.1 |
TYR-95 | ASP-711 | 3.1 | | |
THR-160 | SER-550 | 2.9 | | |

thaumatin | T1R3 | Distance (Å) | Interface Area (Å²) | \( \triangle_{i} \text{G} \) (kcal/mol) |
---|---|---|---|---|
GLU-42 | GLN-543 | 3.2 | 56.0 | 1.3 |
After predicting with both models, we observed that the binding sites of Thaumatin on the sweet taste receptor are generally similar, located within the central region of the receptor. Comparing the tabulated data, both PPI models show that Thaumatin binds more closely to T1R2 than to T1R3: the interface area with T1R2 is larger and its \( \triangle_{i} \text{G} \) is more negative.
Considering that the coupled receptor T1R1+T1R3 can detect umami flavor, while the coupled receptor T1R2+T1R3 is responsible for detecting sweetness, we hypothesize that T1R3 does not possess a specific function for recognizing sweetness within the sweet taste receptor.
Based on the PPI results, we decided that when using the Enzyme-Linked Immunosorbent Assay (ELISA) to test whether the produced Thaumatin can bind the human sweet taste receptor, we would validate its binding with T1R2 first. This prioritization is intended to save experimental time and resources.
This simulation result provides guidance for optimizing experimental detection of the sweetness of the sweet protein.
MD Simulation
Molecular dynamics (MD) simulations rely on computational methods to model the movement of molecular and atomic systems. In this case, as a common analysis technique following PPI predictions, we employed molecular dynamics simulations to predict the motion trajectory of Thaumatin during its docking with the sweet taste receptor, as well as to assess the conformational changes before and after docking. GROMACS was used to perform the dynamics simulations.
We primarily used the CHARMM27 all-atom force field, which includes CHARMM22 with CMAP corrections for proteins. The simulation box was solvated using the SPC216 water configuration, and a salt concentration of 0.15 M was applied to approximate physiological conditions.
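To give a sense of scale for the 0.15 M salt concentration, the sketch below estimates how many ion pairs it corresponds to in a simulation box, using N = c × N_A × V. The 10 nm cubic box is a hypothetical size chosen for illustration and is not the box used in our runs; the function name is likewise ours.

```python
AVOGADRO = 6.02214076e23   # particles per mole
LITERS_PER_NM3 = 1e-24     # 1 nm^3 expressed in liters


def ion_pairs_for_concentration(conc_molar, box_volume_nm3):
    """Estimate the number of salt ion pairs for a target molar concentration."""
    volume_liters = box_volume_nm3 * LITERS_PER_NM3
    return round(conc_molar * AVOGADRO * volume_liters)


box_edge_nm = 10.0  # hypothetical cubic box edge, for illustration only
print(ion_pairs_for_concentration(0.15, box_edge_nm ** 3))
```

This estimate uses the full box volume and ignores the volume occupied by the protein and any extra ions needed to neutralize the system, so it should be read as an upper-bound approximation.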
The results of the molecular dynamics simulations, shown in the figure, include changes in protein conformation and hydrogen bonds. Hydrogen bonds were present throughout the entire simulated docking process of Thaumatin with the sweet taste receptor. This persistence supports the predicted binding as a stable interaction and justifies further analysis of the docking results and the amino acids involved. It also provides guidance for identifying potential mutation sites in future directed evolution efforts.
Directed Evolution
To validate the concept of obtaining a sweeter thaumatin through directed evolution, we performed a proof of concept during the modeling phase to ensure the feasibility of this approach.
Due to the ability of thaumatin to also bind with bitter taste receptors, its use as a sweetener can result in the perception of bitterness alongside sweetness. To enhance the sweetness of the product while minimizing bitterness, we have decided to apply directed evolution to thaumatin. Upon gastric digestion, thaumatin can generate three bitter peptides with the following sequences: DAGGRQLNSGES, FNVPMDF, and WTINVEPGTKGGKIW. These peptides can bind to the human bitter taste receptor T2R16, leading to an increase in the release of HGT-1 signaling proteins, which in turn reduces the release of the pro-inflammatory cytokine IL-17A induced by Helicobacter pylori[3]. Therefore, during the mutagenesis process, we aim to avoid mutations at the sites of these bitter peptides.
We have annotated the positions of these three bitter peptides on the 3D structure of thaumatin for clarity:
Subsequently, we will identify the sites on thaumatin that are relevant to binding with the human bitter taste receptor to determine the sites requiring mutation during our directed evolution process. It is known that thaumatin can interact with the bitter receptor T2R16, so we need to conduct PPI simulations between thaumatin and T2R16. The sequence data is as follows:
Protein Name | ID |
---|---|
TAS2R16 | UniProtKB/Swiss-Prot: Q9NYV7 |
The TAS2R16 sequence is:
MIPIQLTVFFMIIYVLESLTIIVQSSLIVAVLGREWLQVRRLMPVDMILISLGISRFCLQWASMLNNFCSYFNLNYVLCNLTITWEFFNILTFWLNSLLTVFYCIKVSSFTHHIFLWLRWRILRLFPWILLGSLMITCVTIIPSAIGNYIQIQLLTMEHLPRNSTVTDKLENFHQYQFQAHTVALVIPFILFLASTIFLMASLTKQIQHHSTGHCNPSMKARFTALRSLAVLFIVFTSYFLTILITIIGTLFDKRCWLWVWEAFVYAFILMHSTSLMLSSPTLKRILKGKC
However, no experimentally determined structure for T2R16 is available in databases. Therefore, as described above, we used structure prediction to obtain the structural data for T2R16. The predicted structure of T2R16 obtained using AlphaFold is shown in the figure below:
Here, we used AlphaFold to generate an image of the model showing the interaction between Thaumatin and T2R16. The image includes all hydrogen bonds involved in protein-protein interaction.
We organized the information from the figure into the table below, which also includes parameters for evaluating the docking performance.
Interaction Type | thaumatin | T2R16 | Distance (Å) | Interface Area (Å²) | \( \triangle_{i} \text{G} \) (kcal/mol) |
---|---|---|---|---|---|
Classical | GLU-89 | THR-164 | 3.4 | 683.6 | -7.0 |
Classical | GLU-89 | THR-165 | 2.8 | | |
Classical | CYS-158 | LYS-169 | 2.1 | | |
Classical | THR-190 | PRO-161 | 3.5 | | |
Non Classical | LYS-49 | SER-164 | 2.6 | | |
Non Classical | PHE-181 | ASN-163 | 3.4 | | |
Salt Bridge | LYS-49 | ASP-168 | 2.7 | | |
The protein-protein docking score between Thaumatin and T2R16 is -287.
The docking score was calculated using the knowledge-based iterative scoring functions ITScorePP or ITScorePR[4,5]. A more negative docking score indicates a more likely binding model, with typical docking scores for protein-protein/DNA/RNA complexes around -200.
The residues on Thaumatin involved in hydrogen bond formation are LYS-49, GLU-89, CYS-158, and THR-190. Among them, the hydrogen bonds formed by LYS-49, GLU-89, and CYS-158 are relatively strong. However, because LYS-49 is located on one of the bitter peptides, we decided to exclude it from mutation.
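The overlap check described above can be sketched as follows: each bitter peptide is located in the thaumatin sequence listed earlier, and each candidate residue is tested against the resulting ranges. Residue numbering is assumed to start at 1 at the first residue of that sequence, and the helper names are illustrative.

```python
# Thaumatin II sequence as listed in the sequence data section (207 residues).
THAUMATIN = (
    "ATFEIVNRCSYTVWAAASKGDAALDAGGRQLNSGESWTINVEPGTKGGKI"
    "WARTDCYFDDSGRGICRTGDCGGLLQCKRFGRPPTTLAEFSLNQYGKDYI"
    "DISNIKGFNVPMDFSPTTRGCRGVRCAADIVGQCPAKLKAPGGGCNDACT"
    "VFQTSEYCCTTGKCGPTEYSRFFKRLCPDAFSYVLDKPTTVTCPGSSNYR"
    "VTFCPTA"
)
BITTER_PEPTIDES = ["DAGGRQLNSGES", "FNVPMDF", "WTINVEPGTKGGKIW"]
CANDIDATES = [("LYS", 49), ("GLU", 89), ("CYS", 158), ("THR", 190)]


def peptide_ranges(sequence, peptides):
    """Map each peptide to its 1-based inclusive (start, end) range in the sequence."""
    ranges = {}
    for peptide in peptides:
        index = sequence.find(peptide)
        if index == -1:
            raise ValueError(f"{peptide} not found in sequence")
        ranges[peptide] = (index + 1, index + len(peptide))
    return ranges


def on_bitter_peptide(position, ranges):
    """True if the 1-based residue position falls inside any peptide range."""
    return any(start <= position <= end for start, end in ranges.values())


ranges = peptide_ranges(THAUMATIN, BITTER_PEPTIDES)
for name, position in CANDIDATES:
    status = "exclude" if on_bitter_peptide(position, ranges) else "keep"
    print(f"{name}-{position}: {status}")
```

Under this numbering, LYS-49 falls inside WTINVEPGTKGGKIW (residues 37 to 51), while GLU-89, CYS-158, and THR-190 lie outside all three peptides.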
After excluding the site on the bitter peptide, LYS-49, we identified three mutation sites: GLU-89, CYS-158, and THR-190. We mutated these sites to alanine (Ala), a small, non-polar amino acid with a minimal side chain (-CH₃). Alanine is commonly used in protein mutagenesis because its simple structure allows for easier evaluation and analysis when designing mutants. In certain cases, mutating to alanine does not significantly affect the protein's three-dimensional structure.
GLU-89
First, let's take a look at the results after mutating GLU-89:
Interaction Type | Mut89 thaumatin | T2R16 | Distance (Å) | Interface Area (Å²) | \( \triangle_{i} \text{G} \) (kcal/mol) |
---|---|---|---|---|---|
Classical | CYS-158 | LYS-169 | 2.1 | 697.7 | -7.0 |
Classical | CYS-159 | LYS-169 | 3.2 | | |
Classical | THR-190 | PRO-161 | 3.2 | | |
Non Classical | ASP-101 | ASN-163 | 3.8 | | |
Non Classical | PHE-181 | ASN-163 | 3.0 | | |
Salt Bridge | LYS-49 | ASP-168 | 2.3 | | |
The protein-protein docking score between the Mut89 thaumatin and T2R16 remains -287.
The Mut89 mutation had little impact on the binding between Thaumatin and the bitter receptor: the docking score and \( \triangle_{i} \text{G} \) were unchanged, and the interface area changed only slightly. Therefore, we decided to discard the GLU-89 mutation.
CYS-158
Next, let's examine the results after mutating CYS-158:
Interaction Type | Mut158 thaumatin | T2R16 | Distance (Å) | Interface Area (Å²) | \( \triangle_{i} \text{G} \) (kcal/mol) |
---|---|---|---|---|---|
Salt Bridge | LYS-46 | GLU-158 | 3.0 | 644.7 | -4.9 |
Classical | LYS-49 | ASP-168 | 2.6 | | |
Classical | GLU-89 | THR-165 | 3.5 | | |
Classical | GLU-89 | THR-165 | 2.7 | | |
Classical | ALA-158 | LYS-169 | 3.3 | | |
Non Classical | PHE-181 | ASN-163 | 3.4 | | |
After calculations, the protein-protein docking score between Mut158 thaumatin and T2R16 is -284.9.
After the mutation, the docking surface area between Thaumatin-Mut158 and T2R16 decreased by 38.9 Å², and the absolute value of \( \triangle_{i} \text{G} \) decreased by 2.1 kcal/mol. Considering that we only mutated a single site, this change is substantial, suggesting that mutating CYS-158 can reduce the binding affinity between Thaumatin and the bitter receptor and thereby reduce the bitterness of Thaumatin.
THR-190
Interaction Type | Mut190 thaumatin | T2R16 | Distance (Å) | Interface Area (Å²) | \( \triangle_{i} \text{G} \) (kcal/mol) |
---|---|---|---|---|---|
Classical | LYS-19 | ARG-40 | 3.0 | 1208.5 | -10.8 |
Classical | ASP-21 | TRP-36 | 3.1 | | |
Classical | LYS-43 | ARG-41 | 3.3 | | |
Classical | LYS-49 | ARG-124 | 2.3 | | |
Classical | TYR-183 | ARG-121 | 2.9 | | |
Classical | ASP-186 | ARG-121 | 2.8 | | |
Salt Bridge | ASP-101 | ARG-124 | 3.0 | | |
The protein-protein docking score between Mut190 Thaumatin and T2R16 is -304.41.
After the mutation, the docking surface area of Thaumatin-Mut190 with T2R16 increased markedly (by 524.9 Å²), and the docking sites also changed notably. Additionally, the absolute value of \( \triangle_{i} \text{G} \) increased by 3.8 kcal/mol, indicating a substantial alteration. We hypothesize that the mutation at this site significantly affected the structure of Thaumatin, enhancing docking at other sites on T2R16 and shifting the docking interaction, which would result in a stronger perception of bitterness. This suggests that mutating THR-190 would intensify the bitterness of Thaumatin. Consequently, we decided to discard the mutation at THR-190.
Result
Mutation | Docking Score | Docking Surface Area Change (Å²) | \( \triangle_{i} \text{G} \) Change (kcal/mol) | Impact on Binding Affinity | Conclusion |
---|---|---|---|---|---|
GLU-89 | -287 | No significant change | No significant change | Minimal impact | Mutation discarded |
CYS-158 | -284.9 | Decreased by 38.9 | Absolute value decreased by 2.1 | Significant reduction | Reduces binding with T2R16, reduces bitterness; mutation selected |
THR-190 | -304.41 | Increased by 524.9 | Absolute value increased by 3.8 | Significant increase | Strengthens binding with T2R16; mutation discarded |
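The changes summarized above can be recomputed from the wild-type and mutant values reported in the preceding sections. The sketch below performs that arithmetic; the dictionary layout and function name are illustrative, and the \( \triangle_{i} \text{G} \) change is taken on absolute values, so a negative number means the value moved toward zero.

```python
# Values copied from the docking tables and scores reported above.
WILD_TYPE = {"score": -287.0, "area": 683.6, "dig": -7.0}
MUTANTS = {
    "GLU-89": {"score": -287.0, "area": 697.7, "dig": -7.0},
    "CYS-158": {"score": -284.9, "area": 644.7, "dig": -4.9},
    "THR-190": {"score": -304.41, "area": 1208.5, "dig": -10.8},
}


def changes_vs_wild_type(mutant, wild_type=WILD_TYPE):
    """Return mutant-minus-wild-type differences, rounded to the reported precision."""
    return {
        "area_change": round(mutant["area"] - wild_type["area"], 1),
        "abs_dig_change": round(abs(mutant["dig"]) - abs(wild_type["dig"]), 1),
        "score_change": round(mutant["score"] - wild_type["score"], 2),
    }


for site, values in MUTANTS.items():
    print(site, changes_vs_wild_type(values))
```

A mutation is a candidate for reducing bitterness when the interface area shrinks and the absolute \( \triangle_{i} \text{G} \) drops, which in this data set is the case only for CYS-158.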
Ultimately, we chose to mutate the CYS-158 site in Thaumatin to reduce its bitterness and optimize its flavor.
We assessed the binding of the selected CYS-158 mutated Thaumatin with the sweet taste receptor T1R2+T1R3 to determine whether the mutation would affect Thaumatin's sweetness.
The results show that the binding area of the mutated Thaumatin with the sweet taste receptor increased, and the binding energy changed from -0.6 kcal/mol to -6.3 kcal/mol. We hypothesize that this site mutation affected the spatial structure of Thaumatin, causing it to bind at different sites on the sweet taste receptor and significantly enhancing its binding affinity.
This result suggests that the mutation at CYS-158 not only reduces the bitterness of Thaumatin but may also enhance its sweetness and prolong the retention time of sweetness in the oral cavity. This could ultimately allow for the production of a sweeter sweet protein.
However, due to regulatory compliance issues, the mutated protein cannot be used in production without proper testing. Therefore, to expedite the introduction of Sweetein into the market, the current work on protein-directed evolution is limited to proof of concept and has not yet been implemented in our project. We look forward to the future, when, following thorough evaluation, the optimized Thaumatin can be launched to gain public acceptance.
[2] Abramson, J., Adler, J., Dunger, J., et al. (2024). Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature, 630, 493-500.
[3] Richter, P., Sebald, K., Fischer, K., Schnieke, A., Jlilati, M., Mittermeier-Klessinger, V., & Somoza, V. (2024). Gastric digestion of the sweet-tasting plant protein thaumatin releases bitter peptides that reduce H. pylori induced pro-inflammatory IL-17A release via the TAS2R16 bitter taste receptor. Food Chemistry, 448, 139157.
[4] Huang, S. Y., & Zou, X. (2014). A knowledge-based scoring function for protein-RNA interactions derived from a statistical mechanics-based iterative method. Nucleic Acids Research, 42(7), e55.
[5] Huang, S. Y., & Zou, X. (2008). An iterative knowledge-based scoring function for protein-protein recognition. Proteins: Structure, Function, and Bioinformatics, 72(2), 557-579.