Metabolic simulation

Overview


Cancer remains a leading cause of death worldwide. According to the World Health Organization, millions of people die from cancer each year. Sulforaphane is an isothiocyanate found primarily in broccoli and other cruciferous vegetables, especially broccoli sprouts, where its concentration can reach 10022.3 μg/g DW.[1] As one of the most bioactive natural compounds in cruciferous plants, sulforaphane has anticancer, antioxidant, and antiviral properties. Its chemoprotective effects were discovered in 1992, and it has since become a research hotspot both domestically and internationally. One of the challenges our team is tackling is how to use synthetic biology to produce sulforaphane on a large scale, and the first step is to provide feasible synthetic pathways. In synthetic biology, synthetic pathways are derived mainly from modifying natural pathways and designing new ones. Building on our experimental team's groundwork, we are exploring the possibility of new pathways. We use a biological retrosynthesis approach to model synthetic pathways, which consists of predicting and screening retrosynthetic pathways.

In the pathway prediction step, we used various biological databases to collect all potential synthetic pathways. In the pathway screening step, we applied multi-criteria screening to the collected pathways to obtain new pathways that are experimentally feasible.

Introduction


Biological retrosynthesis analysis includes the following main steps: pathway prediction and pathway screening.

  • Pathway prediction:
    1. How to achieve pathway simulation.
  • Pathway screening:
    1. Molecular thermodynamic feasibility.
    2. Enzyme kinetic feasibility.
    3. Consideration of pathway length.

Pathway Prediction


Pathway prediction is based on reaction pathway extension from a reaction database. First, the compound node corresponding to the target product in the reaction network is located in the reaction database, and then a graph search is performed in the reverse direction of the directed edges, thereby achieving pathway prediction.
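The reverse graph search described above can be sketched as follows. This is a minimal illustration, not the procedure Rxnfinder actually uses: the reaction list, the placeholder compound `intermediate_X`, and the function names are all hypothetical. Each reaction is a directed edge from substrate to product, and the search walks those edges backwards from the target until it reaches the starting compound.

```python
from collections import defaultdict

# Hypothetical toy reaction network given as (substrate, product) edges.
# The edges are illustrative only and are NOT taken from Rxnfinder.
REACTIONS = [
    ("methionine", "homomethionine"),
    ("homomethionine", "dihomomethionine"),
    ("methionine", "intermediate_X"),
    ("intermediate_X", "dihomomethionine"),
    ("dihomomethionine", "sulforaphane"),
]


def build_reverse_index(reactions):
    """Map each product to the list of substrates that can form it."""
    producers = defaultdict(list)
    for substrate, product in reactions:
        producers[product].append(substrate)
    return producers


def retro_paths(target, start, reactions, max_steps=10):
    """Enumerate cycle-free pathways start -> target by searching backwards."""
    producers = build_reverse_index(reactions)
    found = []

    def walk(node, path):
        # `path` runs from the target back to the current node.
        if node == start:
            found.append(list(reversed(path)))
            return
        if len(path) - 1 >= max_steps:  # already used max_steps reactions
            return
        for precursor in producers.get(node, []):
            if precursor not in path:  # skip cycles
                walk(precursor, path + [precursor])

    walk(target, [target])
    return found


paths = retro_paths("sulforaphane", "methionine", REACTIONS)
for p in paths:
    print(" -> ".join(p))
```

On a real database the same idea applies, but the number of candidate pathways grows quickly, which is why a step limit such as `max_steps` (an assumed parameter here) is usually needed.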

Steps to achieve pathway simulation

Based on the idea of retrosynthesis, we chose sulforaphane as the final synthetic product. Among the three retrosynthetic pathway design methods, we chose the search based on existing pathways.[3] This method extracts possible biosynthetic pathways from existing reaction databases (such as MetaCyc and KEGG) and ranks the pathways based on experience. For complex compounds with no synthetic pathway available in the databases, this method is usually not applicable. Fortunately, many reactions related to sulforaphane are recorded in the databases.

Fig 1. Design Methods of Retrosynthesis Pathways.
  1. For the search of existing pathways, all potential pathways from methionine to sulforaphane can be elucidated using the Rxnfinder software (www.rxnfinder.org). Rxnfinder is an integrated website developed by Wuhan LifeSynther Co., a world-leading provider of big data for biosynthesis, AI-assisted biosynthetic design, and one-stop biosynthesis services. Through this website, we collected all potential synthetic pathways from thousands of literature sources.
  2. To simulate the synthetic pathway accurately, we further identified several suitable intermediate products among the pathways from methionine to sulforaphane.

In the integrated database website Rxnfinder, a big data search identified the intermediate products shown in the figure: 3'-phospho-5'-adenylyl sulfate, homomethionine, and dihomomethionine. Building on this step, we further explored possible secondary intermediates such as L-aspartic acid, alpha-amino acid ester, L-cysteine, pyruvic acid, succinyl-CoA, 4-methylthio-2-oxobutanoic acid, and oxaloacetic acid.

After comprehensive calculations, 73 possible synthetic pathways from methionine to sulforaphane were identified in the known databases. With this, we completed the pathway prediction.

Fig 2. The pathway from methionine to sulforaphane from the Rxnfinder database.

Pathway Screening


In the first step, the generated metabolic network will include all possible reactants and enzymes required to produce the target product. However, due to the random breaking of bonds based on molecular structures, the output may include candidate precursors or chemical reactions that do not exist in nature or are not part of biological processes. Therefore, it is necessary to refine the existing data first. After removing such simulated pathways, 65 potential synthesis routes remain. However, these remaining pathways are still complex and disorganized, presenting the following two problems:

  1. The generated pathways do not conform to the underlying logic of synthetic biology.
  2. The enzyme types are numerous and incomplete, which makes them difficult to analyze statistically.

Relying solely on experimental synthesis of each pathway would greatly increase the workload and reduce efficiency. Therefore, to identify the most promising synthesis routes among the numerous possibilities, it is necessary to score and prioritize the pathways, which involves pathway screening.

Considering the time cost of evaluating all pathways, it is more practical to test the feasibility of two example pathways first. This approach helps establish a pathway evaluation system and makes the overall task easier to complete. Additionally, we simplify the calculation steps by fixing the required intermediate products and limiting their number. Moreover, to be logically reasonable, the synthesis pathway of glucosinolates, the precursors of sulforaphane, must follow these three stages:[3]

  1. Chain Elongation
  2. Formation of the Core Structure
  3. Secondary Modification

Compared with the chain elongation stage, the latter two synthesis stages (core structure formation and secondary modification) have fewer alternative steps.[4] Therefore, in the synthesis pathway simulation, we mainly focus on simulating the chain elongation pathways. As examples, we select two different synthesis routes that have 2-Oxo-6-methylthiohexanoic acid and 5-methylthiopentanaldoxime, respectively, as required intermediates, with the aim of developing a pathway evaluation system.

Generally, there are many criteria for pathway scoring. In pathway screening, it is essential to first exclude theoretically infeasible pathways using quantitative metrics such as substrate similarity, thermodynamic feasibility, enzyme sequences, and pathway length. After comprehensive consideration, we have decided to use the following scoring criteria to estimate the quality of synthesis pathways:

  1. Thermodynamic Feasibility: Evaluates whether the reactions within the pathway are thermodynamically favorable.
  2. Enzyme Kinetic Feasibility: Assesses the feasibility of the enzyme kinetics involved in the pathway.
  3. Pathway Length: Considers the number of steps or reactions in the pathway, with shorter pathways often being preferred for practical synthesis.

This approach allows for an initial evaluation of the pathways based on theoretical criteria, pending further validation through experimental data. At this stage, the work aims to provide preliminary predictions.

We have designated the two pathways as Pathway 1 and Pathway 2, and the key enzymes to be considered are listed in the table below:

| Number | Intermediate products | Enzymes (EC numbers) |
|---|---|---|
| 1 | Dihomomethionine, 2-Oxo-6-methylthiohexanoic acid, 4-methylthio-2-oxobutanoic acid | EC 2.6.1. & EC 2.6.1.88 |
| 2 | Dihomomethionine, 5-methylthiopentanaldoxime, S-(4-Methylthiobutylthiohydroximoyl)-L-cysteine | EC 1.14.14.42 & EC 1.14.14.43 |

Thermodynamic Feasibility

The Gibbs free energy change (ΔG) represents the change in the thermodynamic potential of a reaction and determines the direction and efficiency of enzyme-catalyzed reactions. It is an important measure for assessing the thermodynamic feasibility of predicted pathways and evaluating the thermodynamic driving force of biosynthetic routes. Some biocatalysis prediction tools use Gibbs free energy data from databases or thermodynamic calculation tools to evaluate and filter pathways based on their thermodynamic feasibility.[5]

According to the Gibbs free energy formula:

$dG = dH-TdS$

and its variant forms:

$\Delta G=\Delta G^\circ+RT\ln Q=RT\ln\cfrac{Q}{K},\quad \Delta G^\circ=-RT\ln K$
  • $\Delta G$ = Gibbs free energy change under current conditions
  • $\Delta G^\circ$ = Standard Gibbs free energy change
  • $R$ = Universal gas constant (8.314 J/(mol·K))
  • $T$ = Temperature in Kelvin (K)
  • $K$ = Equilibrium constant of the reaction
  • $Q$ = Reaction quotient (ratio of the concentrations of products to reactants)

Assuming all reactions occur under standard conditions (1 M concentration, 25°C), the reaction quotient is $Q=1$, so the Gibbs free energy change of the reaction equals the standard Gibbs free energy change: $\Delta G=\Delta G^\circ$.
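To show how the reaction quotient shifts the free energy away from the standard value, here is a small sketch of the relation $\Delta G=\Delta G^\circ+RT\ln Q$. The function name and the example value of -10 kcal/mol are hypothetical and are not results from this project. The gas constant is converted to kcal/(mol·K) so that the units match the values reported in this section.

```python
import math

# Gas constant converted from 8.314 J/(mol*K) to kcal/(mol*K) (1 kcal = 4184 J).
R_KCAL = 8.314 / 4184.0


def delta_g(delta_g_standard, q, temperature=298.15):
    """Gibbs free energy change (kcal/mol) at reaction quotient q.

    delta_g_standard: standard Gibbs free energy change in kcal/mol
    q:                reaction quotient (dimensionless, must be > 0)
    temperature:      temperature in Kelvin (default 25 degrees C)
    """
    if q <= 0:
        raise ValueError("reaction quotient must be positive")
    return delta_g_standard + R_KCAL * temperature * math.log(q)


# Hypothetical reaction with a standard free energy change of -10 kcal/mol.
at_standard = delta_g(-10.0, q=1.0)      # Q = 1: equals the standard value
product_poor = delta_g(-10.0, q=0.01)    # few products: more favorable
product_rich = delta_g(-10.0, q=100.0)   # many products: less favorable
print(at_standard, product_poor, product_rich)
```

With $Q=1$ the logarithm vanishes, which is the standard-condition assumption used in this section.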

The calculated Gibbs free energies for the two pathways under study are as follows:

| Number | Standard Gibbs Free Energy (ΔrG'°) |
|---|---|
| 1 | -41.084457 kcal/mol |
| 2 | -184.48651 kcal/mol |
|   | -89.83178 kcal/mol |

From the above, it is clear that both pathways meet the thermodynamic feasibility criteria.

Enzyme Kinetic Feasibility

To assess enzyme kinetics and determine feasibility, the Michaelis-Menten model is commonly used, which describes the rate of enzyme-catalyzed reactions. The steps are outlined as follows:

  • To determine enzyme kinetics, record the rate of product formation (or substrate consumption) $V$ at each substrate concentration $[S]$.
  • Substitute these values into the Michaelis-Menten equation:

$V=\cfrac{V_{max}[S]}{K_{m}+[S]}$
  • $V$ is the reaction rate.
  • $V_{max}$ is the maximum reaction rate.
  • $[S]$ is the substrate concentration.
  • $K_{m}$ is the Michaelis constant.

Enzyme kinetic feasibility depends on experimental data, which means we need to further test and analyze the simulated pathways.
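Once rate measurements are available, $V_{max}$ and $K_{m}$ can be estimated from them. The sketch below uses the Lineweaver-Burk linearization, $1/V=(K_m/V_{max})(1/[S])+1/V_{max}$, with a plain least-squares line fit. The data are synthetic and noise-free, generated from made-up parameters, and the function names are hypothetical. No measurements from this project are used.

```python
def michaelis_menten(s, vmax, km):
    """Reaction rate V at substrate concentration s."""
    return vmax * s / (km + s)


def lineweaver_burk_fit(substrate, rate):
    """Estimate (Vmax, Km) from a straight-line fit of 1/V against 1/[S]."""
    xs = [1.0 / s for s in substrate]
    ys = [1.0 / v for v in rate]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals Km / Vmax
    intercept = mean_y - slope * mean_x  # equals 1 / Vmax
    vmax = 1.0 / intercept
    km = slope * vmax
    return vmax, km


# Synthetic data from made-up parameters (arbitrary units).
TRUE_VMAX, TRUE_KM = 2.0, 0.5
conc = [0.1, 0.25, 0.5, 1.0, 2.0, 5.0]
rates = [michaelis_menten(s, TRUE_VMAX, TRUE_KM) for s in conc]

vmax_est, km_est = lineweaver_burk_fit(conc, rates)
print(f"Vmax ~ {vmax_est:.3f}, Km ~ {km_est:.3f}")
```

The linearized fit is chosen here only because it needs no external libraries. With noisy experimental data, a direct nonlinear fit of the Michaelis-Menten equation is generally more reliable, because taking reciprocals amplifies errors at low substrate concentrations.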

Path Length

Path length represents the number of individual reactions in each pathway needed to complete the specified synthesis transformation. It is the most direct screening criterion. Since each reaction step requires enzyme catalysis, a longer pathway implies more enzymes, increasing both cost and the metabolic burden on the host. Additionally, a longer pathway means that the starting substrate needs to be modified multiple times, raising the likelihood of by-product formation and resulting in lower selectivity and theoretical yield.[5] Therefore, a shorter path length is preferable.

Based on this, the calculation method for the path length score len is:

$len=\cfrac{1}{n}$

Here, $n$ represents the length of the independent pathway. The longer the path, the lower its path length score. During evaluation, a higher path length score is preferred.
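As a quick worked example of the score defined above, the sketch below computes $len=1/n$ for a few pathway lengths. The function name and the example lengths are illustrative assumptions.

```python
def path_length_score(n_reactions):
    """Path length score len = 1/n for a pathway with n reactions."""
    if n_reactions < 1:
        raise ValueError("a pathway needs at least one reaction")
    return 1.0 / n_reactions


# Illustrative lengths only; these are not pathway lengths from this project.
for n in (2, 4, 8):
    print(n, path_length_score(n))
```

Doubling the number of reactions halves the score, so the score penalizes early steps more strongly than later ones.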

For pathway analysis, after satisfying the thermodynamic and enzyme kinetic feasibility, the pathway can be evaluated using enzyme kinetic coefficients and path length coefficients as criteria.

Result


By analyzing the feasibility of two selected pathways and comparing their advantages and disadvantages, we have developed a simple pathway evaluation scheme. This scheme uses thermodynamic Gibbs free energy, enzyme kinetic coefficients, and path length to build a pathway assessment system. The steps are as follows:

  1. Analyze the total Gibbs free energy to determine if the reaction can occur spontaneously.
  2. Quantitatively calculate the enzyme kinetic coefficients for key reactions, assessing the catalytic capability of the enzymes based on these coefficients.
  3. Evaluate the overall impact of path length on the entire simulated pathway.
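The three steps above could be combined roughly as follows. This is a sketch under stated assumptions, not the team's implementation: the `Pathway` class, the `kinetic_score` field (a placeholder for an enzyme kinetic coefficient that is not yet measured), the ranking order, and all numbers are hypothetical. The text does not specify how the criteria are weighted, so this sketch simply filters on $\Delta G<0$ and then ranks by kinetic score, using the path length score as the tiebreaker.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Pathway:
    name: str
    total_delta_g: float                   # kcal/mol, total over the pathway
    n_reactions: int                       # path length n
    kinetic_score: Optional[float] = None  # placeholder until data exist


def evaluate(pathways):
    """Keep pathways with negative total delta G, then rank them."""
    # Step 1: thermodynamic feasibility.
    feasible = [p for p in pathways if p.total_delta_g < 0]

    # Steps 2 and 3: rank by kinetic score (0.0 when unknown), then by 1/n.
    def ranking_key(p):
        kinetic = p.kinetic_score if p.kinetic_score is not None else 0.0
        return (kinetic, 1.0 / p.n_reactions)

    return sorted(feasible, key=ranking_key, reverse=True)


# Made-up candidates for illustration only.
candidates = [
    Pathway("A", total_delta_g=-30.0, n_reactions=6),
    Pathway("B", total_delta_g=5.0, n_reactions=3),
    Pathway("C", total_delta_g=-12.0, n_reactions=4),
]
ranked = evaluate(candidates)
print([p.name for p in ranked])
```

Pathway B is dropped because its total free energy change is positive. Without kinetic data, the remaining pathways are ordered by length alone.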

This simplified scheme provides a feasible and convenient evaluation approach for subsequent work. Our team can use it to complete the feasibility analysis and comparison of the remaining pathways.

Optimization of the Simple Pathway Evaluation Scheme


Although hundreds or thousands of pathways can be derived from major databases, making the scheme more practical is a significant challenge. The following three points outline the optimization policies for the scheme:

1. Adopt More Rigorous Pathway Simulation Methods:

Initially, we used a simple method of collecting data from websites. Although this yielded over 70 potential synthesis pathways, the data collection was manual and resulted in cluttered and imprecise information. To enhance accuracy and reduce complexity in future improvements, we should develop an algorithm for pathway collection. This will significantly improve the convenience and efficiency of pathway simulations.

2. Calculate Gibbs Free Energy for Entire Pathways:

Large and complex biochemical reactions involve multiple substrates, products, and intermediates. Calculating ΔG° for these reactions requires precise measurement of free energy changes at each step. However, detailed information about the standard Gibbs free energy for each reaction step is often unavailable. To address this challenge, the current work estimates Gibbs free energy for key steps only, providing an approximate result. Overcoming this challenge typically involves combining experimental measurements, computational modeling, and developing more accurate predictive tools to better estimate the standard Gibbs free energy for the entire pathway.

3. Collect Enzyme Kinetic Data for Pathway Comparison:

Currently, there is insufficient data to support enzyme kinetic calculations, affecting the accuracy of key parameters such as Km and Vmax, and hindering enzyme kinetic coefficient comparisons. This necessitates further extensive work to validate pathway efficacy. To reduce this workload, we should also explore using computational techniques to simulate the entire biochemical reaction process, combining results to identify practical and efficient synthesis pathways.

In summary, key issues in pathway simulation require ongoing improvement efforts. Firstly, the accuracy of biosynthetic pathways is often limited by our understanding and control of complex biochemical networks. Therefore, new data processing methods should be integrated with existing metabolic network algorithms to build a more comprehensive simulation.[6] Secondly, accurately calculating the standard Gibbs free energy for entire pathways is complex and uncertain, impacting pathway prediction and optimization. To address this, our team should shift from quantitative precision to qualitative assessment, ensuring that the total Gibbs free energy for simulated pathways meets the condition ΔrG < 0. Overcoming these challenges will require advances in computational methods, refined experimental techniques, and a deeper understanding of biochemical kinetics to enhance the reliability and effectiveness of biosynthesis in various applications.

References

  1. Shuqiong Xie, Jun He, Jun Hed. Study on the Content of Raphanus Sativus (Radish) Sulforaphane in Nine Brassicaceae Vegetables [J]. Journal of Anhui Agricultural University, 2013: 122-125.
  2. Yixin Wei, Yilei Han, Diannan Lu, Tong Qiu. A method for assessing biosynthetic pathways based on theoretical feasibility [J]. Journal of Tsinghua University (Natural Science Edition), 2022: 7.
  3. Fufeng Liu, Xuzhi Liu, Jinbi Li, Fuping Lu. Optimize metabolic pathways using bioretrosynthesis tools [J]. Advances in Chemistry, 2024, 36(4): 501-510.
  4. Han Yang, Feixia Liu, Yin Li, Bo Yu. Reconstructing Biosynthetic Pathway of the Plant-Derived Cancer Chemopreventive-Precursor Glucoraphanin in Escherichia coli [J]. ACS Synth. Biol., 2018, 7: 121-131.
  5. Zhen Zhang, Xuecheng Zeng, Lei Qin, Li Chun. Advances in intelligent design of microbial cell factories [J]. CIESC Journal, 2021, 72(12): 6093-6108.
  6. Rizhao Zhang. Metabolic network models simplify algorithm development and guide pathway design [D]. Tianjin: Tianjin University of Science and Technology, 2020.