D-Tagatose is a functional sugar with multiple health benefits, including lowering blood sugar levels, preventing dental caries, improving gut microbiota, promoting blood circulation, and exhibiting anti-aging effects[1-3]. However, large-scale production of D-tagatose remains limited by its high production costs. Since D-fructose is an inexpensive and widely available isomer of D-tagatose, using it as a substrate for producing D-tagatose via D-tagatose-4-epimerase catalysis is an attractive alternative[4]. Existing D-tagatose-4-epimerases, however, show generally low activity and conversion rates in the catalysis of D-fructose to D-tagatose[5]. Therefore, the discovery of D-tagatose-4-epimerases with high activity and conversion efficiency is of significant importance for both scientific research and industrial applications.
In this study, we employed bioinformatics approaches to identify four newly discovered proteins with potential D-tagatose-4-epimerase activity.
Using the amino acid sequence of UxaE from Thermotoga petrophila RKU-1 as the query[11], we ran a BLAST search against the non-redundant protein sequence database at the National Center for Biotechnology Information (NCBI). Of the 100 sequences retrieved, the 80 whose similarity to the query was below 70% were selected for further analysis. Multiple sequence alignment was performed using MEGA software, and a phylogenetic tree was constructed using the neighbor-joining method (see Figure 1).
Figure 1 The phylogenetic tree constructed from protein sequences obtained through BLAST search
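The filtering step described above can be sketched as follows. This is a minimal illustration, not the procedure actually used: it assumes the BLAST hits are available in tabular form (`-outfmt 6`, where the third column is percent identity), it uses percent identity as a stand-in for the "similarity" mentioned in the text, and the hit names and values are hypothetical.

```python
# Sketch: keep BLAST hits whose identity to the query is below a cutoff.
# Assumption: tabular BLAST output (-outfmt 6); column 2 is the subject ID
# and column 3 is the percent identity of the alignment.

def filter_hits(lines, max_identity=70.0):
    """Return (subject_id, percent_identity) pairs below the cutoff.

    Only the first-listed alignment of each subject is considered, so a
    subject with several alignments is judged by its top-ranked one.
    """
    seen = set()
    kept = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split("\t")
        subject_id, pident = fields[1], float(fields[2])
        if subject_id in seen:
            continue
        seen.add(subject_id)
        if pident < max_identity:
            kept.append((subject_id, pident))
    return kept


# Hypothetical hits, for illustration only (not data from this study).
example_hits = [
    "UxaE\thit_A\t82.5\t480\t84\t0\t1\t480\t1\t480\t0.0\t850",
    "UxaE\thit_B\t65.1\t478\t166\t2\t1\t478\t3\t480\t1e-180\t610",
    "UxaE\thit_C\t48.3\t470\t240\t5\t4\t473\t2\t471\t1e-120\t420",
]

for subject_id, pident in filter_hits(example_hits):
    print(f"{subject_id}\t{pident:.1f}")
```

With a 70% cutoff, `hit_A` is discarded and the other two hypothetical hits are kept.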
In this study, we used AlphaFold2 to perform structural modeling on the 80 selected sequences. AlphaFold2 is a state-of-the-art deep learning-based tool that can efficiently and accurately predict protein structures based on their amino acid sequences[6].
Figure 2 AlphaFold2 Structural Modeling (Using AJC7 as an Example)
Following the protein structure modeling using AlphaFold2, the D-fructose structure file was downloaded from the PubChem database. In Discovery Studio 2019, we first opened the three-dimensional structure of the AJC7 protein. Then, in the tool browser, we selected "Receptor-Ligand Interactions" and clicked on "Define and Edit Binding Site" to define AJC7 as the receptor molecule. Next, we clicked "File > Open" to import the D-fructose structure file downloaded from PubChem, and in the tool browser, we expanded "Prepare or Filter Ligands > Prepare Ligands" under "Small Molecules" to open the corresponding process parameter settings panel. After setting the "Input Ligands" parameters, we clicked "Run" to process the small molecule. Subsequently, we used the C-DOCKER module (a docking tool based on the CHARMM force field) to dock D-fructose into the active site of AJC7. Candidate conformations were generated through random rigid-body rotation and simulated annealing, and the structure of the protein-ligand complex underwent energy minimization using the CHARMM force field. Finally, we retrieved the lowest-energy docking conformations from the C-DOCKER module and selected the substrate orientation with the lowest interaction energy with the receptor for further analysis. The conformations were ranked by CHARMM energy, and the highest-scoring conformations were retained for further study. The docking results revealed 13 residues with potential interactions with D-fructose: Tyr441, Tyr404, His162, Gly344, Thr366, Asp161, Gln123, Ser125, Glu128, Arg127, Lys309, His342, and Arg100. Based on previous studies, we hypothesize that Asp161 and Glu128 serve as catalytic residues, with Asp161 acting as a nucleophile and Glu128 functioning as a proton donor.
Figure 3 Docking model of D-fructose in AJC7
We calculated the binding free energy of the receptor-ligand complexes using the CHARMm-based energy function and an implicit solvent model. The binding energy between the receptor and ligand is defined as ΔEBinding = EComplex − ELigand − EReceptor. To estimate these free energies, we minimized the ligand energy in the presence of the receptor using the steepest descent and conjugate gradient methods. The effective Born radii were computed using the Generalized Born Simple Switching (GBSW) implicit solvent model, which replaces the costly molecular surface approximation with a smooth dielectric boundary combined with a van der Waals surface.
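As a small worked illustration of the binding energy definition, the sketch below computes ΔEBinding = EComplex − ELigand − EReceptor and ranks candidates from most to least favorable (most negative first). The function names and the energy values are hypothetical and are not taken from Discovery Studio or from the results of this study.

```python
# Sketch: binding energy as complex energy minus the energies of the
# separated ligand and receptor. All numbers are illustrative (kcal/mol).

def binding_energy(e_complex, e_ligand, e_receptor):
    """Return dE_binding = E_complex - E_ligand - E_receptor."""
    return e_complex - e_ligand - e_receptor


def rank_by_binding_energy(energies):
    """Sort candidates by binding energy, most negative (tightest) first.

    `energies` maps a candidate name to (E_complex, E_ligand, E_receptor).
    """
    scored = {name: binding_energy(*terms) for name, terms in energies.items()}
    return sorted(scored.items(), key=lambda item: item[1])


# Hypothetical energies for two made-up candidates.
example = {
    "candidate_1": (-1250.0, -30.0, -1200.0),
    "candidate_2": (-1410.0, -30.0, -1345.0),
}

for name, value in rank_by_binding_energy(example):
    print(f"{name}: {value:.1f} kcal/mol")
```

A more negative value indicates a more favorable predicted interaction, which is why the ranking is in ascending order.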
Using this method, we calculated the binding free energy between the selected sequences and fructose.
Figure 4 Binding free energy between the selected sequence and fructose
Considering both the binding free energy between the proteins and fructose and the branches and origins of the sequences in the phylogenetic tree, we selected four uncharacterized proteins with potential D-tagatose-4-epimerase activity.
These proteins have been named MBC, AJC7, TET, and HDM. They exhibit promising binding activity and solubility, making them suitable candidates for further experimental validation and potential industrial applications. After experimental validation, we chose AJC7, which showed the best performance, for protein engineering modification.
AJC7 showed high enzyme activity in wet-lab experiments. To improve its solubility, we used the LSTM model from the study by Chao Wang et al. to predict the solubility of AJC7 after adding each of three solubility-enhancing tags: MBP, NusA, and TrxA[10]. This model integrates both physicochemical properties and distributed representation information from protein sequences to predict protein solubility.
Figure 5. Protein Solubility Prediction Results After Adding Solubility-Enhancing Tags
Rational design is a strategic approach involving targeted modifications of protein structure to enhance or introduce new functions. In this study, we applied rational design principles to optimize the selected protein AJC7, enhancing its D-tagatose-4-epimerase activity and stability.
Amino acid mutations were introduced using PyMOL, and the C-DOCKER module, together with the binding energy calculation tools in Discovery Studio, was used to evaluate the binding energy between the mutated proteins and fructose. By combining docking analysis with known catalytic mechanisms of D-tagatose-4-epimerase and successful mutation cases from other studies, we identified five potential mutation sites that could enhance enzymatic activity. The selection of these mutation sites provides a theoretical and practical basis for subsequent enzyme optimization.
Figure 6. Binding Free Energy Between Mutants and Fructose (Using AJC7 as an Example)[9]
In this phase of rational design, we focused on single-point mutations to improve the activity and stability of the selected proteins. Single-point mutations involve altering a single amino acid residue within the protein sequence, which can significantly impact protein function.
Table 1. Reasons for Mutation Site Selection
Mutation Site | Mechanism Analysis |
S125D | The first step in the conversion of D-fructose to D-tagatose produces a glyceraldehyde intermediate. When serine is mutated to aspartic acid, the carboxyl group of aspartic acid can interact with the terminal aldehyde group of the glyceraldehyde intermediate, promoting the protonation of the aldehyde and thus facilitating the catalysis and production of D-tagatose. Furthermore, the negatively charged aspartic acid may enhance interactions with the positively charged substrate and alter the charge distribution in the binding pocket, thereby improving the catalytic activity of the enzyme. |
T181A | After the mutation of threonine to alanine, the side chain volume of the amino acid decreases, which may reduce steric hindrance outside the active site of the enzyme, potentially enhancing the binding or interaction between the enzyme and the substrate. Additionally, mutating a polar amino acid to a nonpolar one enhances the hydrophobicity within the protein, thereby increasing the stability of the enzyme. |
H342L | After the mutation of histidine to leucine, the charge characteristics of the amino acid residue were altered, reconstructing the charge distribution in the substrate-binding pocket, which enhanced the interaction with the substrate. Furthermore, this mutation not only increased the volume of the active pocket but also improved its hydrophobicity, significantly enhancing the catalytic activity and stability of the enzyme. |
I129T | After replacing the nonpolar amino acid isoleucine with the polar amino acid threonine, the hydroxyl group in the threonine side chain can form a new hydrogen bond interaction with serine at position 125. This enhanced interaction may increase the enzyme's stability and alter its conformation, thereby creating a more favorable environment in the active pocket for substrate binding through a transmission effect, ultimately improving catalytic efficiency. |
L140P | Proline features a cyclic structure, and its unique secondary amine configuration limits the rotational freedom of this residue, resulting in changes to the enzyme's three-dimensional structure. This alteration may influence the interactions between the active pocket and the substrate through a transmission effect, thereby enhancing the enzyme's catalytic activity and stability. |
Figure 7. (A) Docking model of D-fructose in the H342L mutant, (B) Docking model of D-fructose in the I129T mutant, (C) Docking model of D-fructose in the L140P mutant, (D) Docking model of D-fructose in the S125D mutant, (E) Docking model of D-fructose in the T181A mutant.
Based on the analysis of mutation sites, we constructed a total of 14 mutants: the single mutants S125D, T181A, H342L, I129T, and L140P, and the combinations S125D/T181A, S125D/H342L, S125D/I129T, S125D/L140P, S125D/T181A/I129T, S125D/T181A/L140P, S125D/T181A/H342L, S125D/T181A/I129T/L140P, and S125D/T181A/I129T/L140P/H342L. The experimental results for the multi-point mutants are interpreted below.
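The mutant labels above follow the usual wild-type residue, position, new residue notation. As an illustrative sketch (not the tooling used in this study, where mutations were introduced in PyMOL), the helper below parses such a label and applies it to a sequence, checking that the stated wild-type residue is actually present at each position. The function names and the five-residue toy sequence are hypothetical.

```python
import re

# One point mutation: wild-type residue, 1-based position, new residue.
MUTATION_PATTERN = re.compile(r"^([A-Z])(\d+)([A-Z])$")


def parse_mutant(label):
    """Parse 'S125D/T181A' into [('S', 125, 'D'), ('T', 181, 'A')]."""
    mutations = []
    for token in label.replace(" ", "").split("/"):
        match = MUTATION_PATTERN.match(token)
        if match is None:
            raise ValueError(f"Unrecognized mutation: {token!r}")
        wild, position, new = match.groups()
        mutations.append((wild, int(position), new))
    return mutations


def apply_mutant(sequence, label):
    """Return the mutated sequence, verifying each wild-type residue first."""
    residues = list(sequence)
    for wild, position, new in parse_mutant(label):
        found = residues[position - 1]  # positions are 1-based
        if found != wild:
            raise ValueError(
                f"Expected {wild} at position {position}, found {found}"
            )
        residues[position - 1] = new
    return "".join(residues)


# Toy sequence for illustration only; this is not the AJC7 sequence.
toy_sequence = "MSTIL"
print(apply_mutant(toy_sequence, "S2D/T3A"))
```

The wild-type check is the useful part of this design: a label whose first letter does not match the residue found at that position is rejected rather than silently applied.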
In the combination of multiple mutations, the S125D/T181A double mutation showed a significant enhancement in enzyme activity, with catalytic efficiency doubling compared to the wild-type enzyme. We believe this is due to the carboxyl group introduced by S125D, which strengthens the interaction with the terminal aldehyde group of the glyceraldehyde intermediate. At the same time, T181A reduces steric hindrance in the enzyme's active pocket and increases the hydrophobicity of the active site, leading to a structural rearrangement of the active pocket, thereby facilitating better substrate entry. The synergistic effect of these two mutations effectively enhances the overall catalytic capacity of the enzyme.
Figure 8. Docking model of D-fructose in the S125D/T181A mutant
In the S125D/H342L double mutation, the aspartic acid introduced by S125D can interact with the terminal aldehyde group of the glyceraldehyde intermediate, while H342L replaces the positively charged histidine with a neutral and more hydrophobic leucine, altering the charge characteristics of the amino acid residues and increasing the hydrophobicity of the protein, thereby facilitating substrate binding. Although this combination improves the substrate binding ability of the active pocket to some extent, experimental results from single-point mutations suggest that the effect of H342L is relatively weak. Therefore, the overall enhancement is not as significant as that of the S125D/T181A mutation.
Figure 9. Docking model of D-fructose in the S125D/H342L mutant
In the S125D/I129T mutation, S125D enhances the interaction with the glyceraldehyde intermediate through the introduction of a negatively charged carboxyl group. In the single I129T mutation, the hydroxyl group in the threonine side chain can form a new hydrogen bond interaction with serine at position 125. This enhanced interaction improves enzyme stability and alters its conformation, creating a more favorable environment in the active pocket for substrate binding through a transmission effect, thus increasing catalytic efficiency. However, due to the potential interaction between I129T and the polar aspartic acid at position 125, which may create an environment in the active pocket that is less favorable for substrate binding, the overall improvement from this combination is moderate.
Figure 10. Docking model of D-fructose in the S125D/I129T mutant
In the S125D/L140P double mutation, the cyclic structure introduced by the L140P mutation increases the rigidity of the protein's tertiary structure, which may lead to overall structural instability. This, in turn, weakens the interaction between the aspartic acid introduced by S125D and the glyceraldehyde intermediate, resulting in a smaller overall enhancement of enzyme activity than that of the S125D single-point mutation.
Figure 11. Docking model of D-fructose in the S125D/L140P mutant
The S125D/T181A/I129T combination further enhances the catalytic efficiency of the enzyme. The negative charge introduced by S125D and its role in intermediate rotation, the increased hydrophobicity and reduced steric hindrance caused by T181A in the active site, and the polarity introduced by I129T synergistically improve the enzyme's binding affinity for fructose and its catalytic efficiency. Of the combinations tested, this triple mutation was the most effective at improving enzyme performance.
Figure 12. Docking model of D-fructose in the S125D/T181A/I129T mutant
In the S125D/T181A/I129T/L140P quadruple mutation, the introduction of L140P retains some of the positive effects observed in the S125D/T181A/I129T triple mutation. However, the proline introduced by L140P increases the rigidity of the protein while potentially disrupting the α-helix structure, reducing the overall structural stability of the protein. This instability may hinder the stable binding of the enzyme to the substrate, resulting in a less significant improvement in catalytic activity than expected.
Figure 13. Docking model of D-fructose in the S125D/T181A/I129T/L140P mutant
The S125D/T181A/I129T/L140P/H342L quintuple mutation incorporates all the key mutations previously discussed. However, the introduction of H342L may have excessively altered the internal structure of the active pocket, making it difficult for the substrate to bind stably. Additionally, while the proline introduced by L140P increases the rigidity of the protein, it may also disrupt the α-helix, reducing the overall structural stability of the protein. Therefore, this combination is less effective in enhancing enzyme activity than the S125D/T181A/I129T triple mutation.
Figure 14. Spatial Relationship Between S125D/T181A/I129T/L140P/H342L Mutation Sites and the Substrate
These findings suggest that while combined mutations can enhance enzymatic activity, structural changes—such as those introduced by the L140P mutation—may adversely affect catalytic outcomes. Among the mutations tested experimentally, the triple mutation S125D/T181A/I129T proved to be the most effective, clearly demonstrating a synergistic effect between substrate binding and catalytic efficiency.
Using the protein structure prediction tool AlphaFold2, we modeled the structure of tagatose-4-epimerase, obtaining three-dimensional models for the wild type and the mutants S125D, S125D/T181A, and S125D/T181A/I129T. Molecular dynamics simulations were performed on each model using the GROMACS 2023.2 package and the GROMOS96 53a6 force field. Each protein was solvated in a cubic water box using the SPC water model, and Na+ or Cl− ions replaced randomly chosen water molecules to neutralize the system. Energy minimization was then carried out, followed by 1 ns (500,000 steps) of NVT equilibration and 1 ns of NPT equilibration (1 bar). Finally, production MD simulations were run for 100 ns (50,000,000 steps at the same 2 fs time step) at 300 K.
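The relationship between simulated time and step count can be checked with a few lines of arithmetic. The sketch below assumes a 2 fs integration time step, which is inferred from the stated 1 ns = 500,000 steps for the equilibration phases rather than given explicitly in the text; the function and variable names are illustrative.

```python
# Sketch: convert a simulation length in nanoseconds to integration steps.
# Assumption: 2 fs time step, inferred from "1 ns (500,000 steps)".

FS_PER_NS = 1_000_000  # 1 ns = 1e6 fs


def steps_for(duration_ns, timestep_fs=2.0):
    """Number of integration steps needed to cover duration_ns."""
    return round(duration_ns * FS_PER_NS / timestep_fs)


# Phase lengths in nanoseconds, as described in the protocol.
schedule_ns = {"NVT equilibration": 1, "NPT equilibration": 1, "production": 100}

for phase, duration in schedule_ns.items():
    print(f"{phase}: {duration} ns -> {steps_for(duration):,} steps")
```

Under this assumption each 1 ns equilibration phase corresponds to 500,000 steps and the 100 ns production run to 50,000,000 steps.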
Molecular dynamics simulation trajectories captured the conformational changes of the protein throughout the simulation, providing a way to assess the structural stability of both the wild type and the mutants (S125D, S125D/T181A, and S125D/T181A/I129T). As shown in Figure 15, the RMSD values of the mutants S125D, S125D/T181A, and S125D/T181A/I129T were consistently lower throughout the simulation, whereas the wild type exhibited relatively higher RMSD values, indicating that protein stability progressively improved with successive rounds of mutation.
Figure 15. Root Mean Square Deviation (RMSD) in Molecular Dynamics Simulations
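For readers unfamiliar with the metric, RMSD is the square root of the mean squared displacement of the atoms in a frame relative to a reference structure. The sketch below is a minimal pure-Python version that assumes the frame has already been superimposed on the reference; trajectory analysis tools normally perform that least-squares fit first, and the text does not state which tool produced Figure 15. The coordinates are made up.

```python
import math


def rmsd(reference, frame):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    Assumes the two structures are already superimposed; no fitting is done.
    """
    if len(reference) != len(frame):
        raise ValueError("Structures must contain the same number of atoms")
    squared = sum(
        (a - b) ** 2
        for ref_atom, frame_atom in zip(reference, frame)
        for a, b in zip(ref_atom, frame_atom)
    )
    return math.sqrt(squared / len(reference))


# Two-atom toy example: every atom is shifted by 1 unit along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, frame))
```

A lower and flatter RMSD curve over the trajectory is commonly read as a structure that stays closer to its starting conformation.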
To investigate the mechanism behind the enhanced enzymatic activity of tagatose-4-epimerase and its mutants, we focused on analyzing eight key amino acid residues that interact with D-fructose: His162, Asp161, Gln123, Ser125, Arg127, Lys309, Glu128, and Arg100. Based on previous studies, it is hypothesized that during the conversion of D-fructose to D-tagatose, Asp161 and Glu128 act as catalytic residues, with Asp161 serving as a nucleophile and Glu128 functioning as a proton donor.
Figure 16. RMSF analysis of amino acid residues related to wild-type and mutant forms
The RMSF analysis of the amino acid residues that directly or indirectly interact with D-fructose (Figure 16) showed that, compared to the wild type, the fluctuations of most amino acid residues in the mutants S125D, S125D/T181A, and S125D/T181A/I129T progressively decreased as the mutation sites accumulated. Notably, the fluctuations of the two catalytic residues were significantly reduced, indicating an improvement in the overall stability of the protein. These results are consistent with our experimental data, supporting the conclusion that the AJC7 protein exhibits enhanced stability and activity after the combined mutations.
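RMSF measures how much each atom (or residue) fluctuates around its own time-averaged position. The following sketch shows the calculation on a toy trajectory; it assumes the frames are already aligned to a common reference and that one representative atom per residue is used, neither of which is specified in the text. The coordinates are made up.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for a trajectory given as frames of (x, y, z) tuples.

    trajectory[f][i] is the position of atom i in frame f. Frames are
    assumed to be pre-aligned; no fitting is done here.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for atom in range(n_atoms):
        # Time-averaged position of this atom.
        mean = [
            sum(frame[atom][axis] for frame in trajectory) / n_frames
            for axis in range(3)
        ]
        # Mean squared deviation from that average position.
        msf = sum(
            (frame[atom][axis] - mean[axis]) ** 2
            for frame in trajectory
            for axis in range(3)
        ) / n_frames
        values.append(math.sqrt(msf))
    return values


# Toy trajectory: atom 0 is fixed, atom 1 oscillates between x = -1 and x = +1.
toy_trajectory = [
    [(0.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
]
print(rmsf(toy_trajectory))
```

In this toy case the fixed atom has an RMSF of 0 and the oscillating atom an RMSF of 1, mirroring how rigid and flexible residues are distinguished in the analysis.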
[1] Paterna, J.; Boess, F.; Stäubli, A.; Boelsterli, U. Antioxidant and Cytoprotective Properties of D-Tagatose in Cultured Murine Hepatocytes. Toxicol. Appl. Pharmacol. 1998, 148, 117−125.
[2] Lu, Y.; Levin, G. V.; Donner, T. W. D-tagatose, a New Antidiabetic and Obesity Control Drug. Diabetes, Obes. Metab. 2007, 10, 109−134.
[3] Espinosa, I.; Fogelfeld, L. D-tagatose: From a Sweetener to a New Diabetic Medication? Expert Opin. Invest. Drugs 2010, 19, 285−294.
[4] QI X, TESTER R F. Fructose, galactose and glucose – In health and disease[J/OL]. Clinical Nutrition ESPEN, 2019, 33: 18-28. http://dx.doi.org/10.1016/j.clnesp.2019.07.004. DOI:10.1016/j.clnesp.2019.07.004.
[5] RODIONOVA I A, SCOTT D A, GRISHIN N V, et al. Tagaturonate-fructuronate epimerase UxaE, a novel enzyme in the hexuronate catabolic network in Thermotoga maritima[J/OL]. Environmental Microbiology, 2012, 14(11): 2920-2934. http://dx.doi.org/10.1111/j.1462-2920.2012.02856.x. DOI:10.1111/j.1462-2920.2012.02856.x.
[6] Jumper, J., Evans, R., Pritzel, A. et al. Highly accurate protein structure prediction with AlphaFold. Nature 596, 583–589 (2021). https://doi.org/10.1038/s41586-021-03819-2
[7] VANOMMESLAEGHE K, HATCHER E, ACHARYA C, et al. CHARMM general force field: A force field for drug-like molecules compatible with the CHARMM all-atom additive biological force fields[J/OL]. Journal of Computational Chemistry, 2010: 671-690. http://dx.doi.org/10.1002/jcc.21367. DOI:10.1002/jcc.21367.
[8] RODIONOVA I A, SCOTT D A, GRISHIN N V, et al. Tagaturonate-fructuronate epimerase UxaE, a novel enzyme in the hexuronate catabolic network in Thermotoga maritima[J/OL]. Environmental Microbiology, 2012: 2920-2934. http://dx.doi.org/10.1111/j.1462-2920.2012.02856.x. DOI:10.1111/j.1462-2920.2012.02856.x.
[9] The CNSknowall platform (https://cnsknowall.com), a comprehensive web service for data analysis and visualization, was utilized.
[10] Chao Wang et al. DeepSoluE: A LSTM model for protein solubility prediction using sequence physicochemical patterns and distributed representation information. BMC Biology, 2023, 21(12). https://doi.org/10.1186/s12915-023-01510-8
[11] SHIN K C, LEE T E, SEO M J, et al. Development of Tagaturonate 3-Epimerase into Tagatose 4-Epimerase with a Biocatalytic Route from Fructose to Tagatose[J/OL]. ACS Catalysis, 2020: 12212-12222. http://dx.doi.org/10.1021/acscatal.0c02922. DOI:10.1021/acscatal.0c02922.