CHINA-HUBU-WUHAN

Overview

The "One for All——Synbiotic Therapy" project aims to leverage synthetic biology technology for the design and modification of the biosafety strain Zymomonas mobilis, with the objective of developing a healthcare solution that integrates preventive care and personalized treatment. By employing techniques such as metabolic engineering, gene editing, and rational protein design, we have successfully engineered highly efficient 3-HB production strains specifically tailored for treating diseases like diabetes and colitis. Moreover, these strains are capable of producing functional products such as fructooligosaccharides and oligosaccharides that can aid in blood pressure reduction or serve as prebiotics. Additionally, we have introduced expression plasmids containing GLP-1 and ACEi into Zymomonas mobilis to enable the expression of therapeutic peptides with glucose-lowering and blood pressure-reducing effects.

3-Hydroxybutyrate (3-HB), a principal component of the ketone bodies produced by the liver, functions not only as an energy carrier but also as a signaling molecule in cellular processes. In type 2 diabetic (T2D) mice, 3-HB significantly improves the metabolic state by engaging a series of signaling pathways through its interaction with hydroxycarboxylic acid receptor 2 (HCAR2) [1].

TesB is a thioesterase derived from Escherichia coli (strain K12), with broad substrate specificity that primarily catalyzes the hydrolysis of medium- and long-chain acyl-CoA substrates into free fatty acids and CoA [2]. This enzyme plays a crucial role in the thioesterase-dependent β-oxidation pathway of oleic acid and conjugated linoleic acid ((9Z,11E)-octadecadienoic acid or CLA), which supplies the energy and carbon precursors necessary for the growth of E. coli [3]. The gene encoding TesB enables the production of (R)-3-hydroxybutyrate and (S)-3-hydroxybutyrate (3-HB) with high enantiomeric purity.

GLP-1, an endogenous glucose-regulating peptide, has been shown to efficiently control blood glucose levels and enhance pancreatic β-cell function[4]. In 2023, the journal Science recognized the GLP-1 class of drugs as one of the top 10 scientific breakthroughs, highlighting its significant role in the treatment of diabetes and obesity. GLP-1 binds to the extracellular ligand-binding domain of the GLP-1 receptor (GLP-1R). This binding process relies on high complementarity between GLP-1 and GLP-1R, involving various intermolecular forces, including electrostatic interactions, hydrogen bonds, hydrophobic interactions, and van der Waals forces. Upon binding, the extracellular domain of GLP-1R undergoes a conformational change, inducing structural adjustments in the transmembrane region. This leads to the coupling of GLP-1R to G proteins (typically Gαs proteins), which activates adenylate cyclase (AC) and elevates intracellular cyclic adenosine monophosphate (cAMP) levels. Subsequently, cAMP serves as a second messenger to activate protein kinase A (PKA) and exchange protein directly activated by cAMP 2 (Epac-2). PKA and Epac-2 depolarize the cell membrane by closing ATP-sensitive potassium channels and activating voltage-dependent calcium channels, resulting in calcium influx and generation of an action potential. This signaling cascade stimulates insulin secretion, suppresses glucagon release, and regulates energy homeostasis [5].

ACEi is an angiotensin-converting enzyme inhibitor derived from goat's milk casein, consisting of 11 amino acid residues with the sequence LVYPFTGPIPN [6]. ACE is a key enzyme in the renin-angiotensin-aldosterone system (RAAS) that converts inactive angiotensin I into the potent vasoconstrictor angiotensin II, which elevates blood pressure. By inhibiting ACE activity, ACEi reduces angiotensin II production, leading to vasodilation and lowered blood pressure [6-7].

Purpose

In our modeling work, we explored the interactions and functional properties of mGLP-1, ACEi, 3-HB, and their related molecules using the computational tools AlphaFold3, Avogadro, and HDOCK. We identified mutation sites for engineering TesB through stability design based on a hybrid approach that combines sequence and structural analysis. We also investigated the growth dynamics of Zymomonas mobilis in the intestinal environment, and their impact on GLP-1 expression, by constructing a growth model that simulates the intestinal pH environment.

AlphaFold3 was employed for the structural prediction of GLP-1 and ACEi, while Avogadro was used to build the structure of 3-HB. Molecular docking studies were conducted using HDOCK to examine the interaction of mGLP-1 with DPP-4 and with GLP-1R, evaluating the potential to prolong the half-life of mGLP-1 without compromising its binding affinity for GLP-1R. HDOCK was also used to dock ACEi with ACE and 3-HB with HCAR2, focusing on ligand-receptor interactions.
We aimed to develop an enzyme modification strategy that rapidly identifies key amino acid mutations to enhance the catalytic efficiency of TesB and thereby increase 3-HB yield. To do so, we analyzed and compared existing mainstream site selection methods, supported by molecular docking and molecular dynamics simulation software.
We also sought to construct a growth model for the gut pH environment tailored to Zymomonas mobilis, in order to visualize bacterial growth and predict GLP-1 production in the simulated environment. Additionally, we explored the use of Particle Swarm Optimization (PSO) and evolutionary computation algorithms to optimize the amount of plasmid transfection and the initial dosage of probiotics, aiming to maximize the therapeutic efficacy of GLP-1 expression.

Protein structure prediction

The three-dimensional architecture of a protein serves as the foundation for its functionality, dictating its molecular interactions, subcellular localization, and specific biological activities. In this part of the study, we used AlphaFold3 to predict the structures of GLP-1 and ACEi.

The human-derived GLP-1 (7-36) was modified by substituting alanine at position 8 with serine, and replacing the two lysines at positions 26 and 34 with glutamate and aspartate, respectively. These modifications were made to enhance resistance against degradation by DPP-4 and trypsin enzymes. This modified version is referred to as mGLP-1[8].
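
The substitutions described above can be written down as a short script. This is a minimal sketch, not the team's own code: the wild-type string is the commonly cited human GLP-1 (7-36) sequence and is an assumption here, since Table 1 is not reproduced in this text, and the helper name `apply_substitutions` is hypothetical.

```python
# Commonly cited human GLP-1 (7-36) sequence (assumption: Table 1 is not shown here).
WT_GLP1_7_36 = "HAEGTFTSDVSSYLEGQAAKEFIAWLVKGR"
FIRST_RESIDUE_NUMBER = 7  # by convention the first residue of GLP-1 (7-36) is His7


def apply_substitutions(sequence, substitutions, first_number=FIRST_RESIDUE_NUMBER):
    """Apply point substitutions written like 'A8S' to a peptide sequence."""
    residues = list(sequence)
    for sub in substitutions:
        old, position, new = sub[0], int(sub[1:-1]), sub[-1]
        index = position - first_number
        # Guard against numbering mistakes: the expected residue must be present.
        if residues[index] != old:
            raise ValueError(
                f"{sub}: expected {old} at position {position}, found {residues[index]}"
            )
        residues[index] = new
    return "".join(residues)


# A8S, K26E, and K34D are the three substitutions described in the text.
mglp1 = apply_substitutions(WT_GLP1_7_36, ["A8S", "K26E", "K34D"])
print(mglp1)
```

The check inside the loop raises an error if the residue numbering and the sequence disagree, which is an easy mistake to make when a peptide is numbered from 7 rather than from 1.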

Table 1 Peptide sequences

Structure prediction was performed using AlphaFold3, developed by DeepMind. The predictions were then inspected, and the highest-confidence model was downloaded for further analysis. Finally, the structure was visualized in PyMOL (Figure 1).

Figure 1 (A) AlphaFold3-based structure prediction of mGLP-1; (B) AlphaFold3-based structure prediction of ACEi

Molecular docking

This part of the project investigates the interactions between different molecules using molecular docking. Specifically, HDOCK is used for docking, which involves the following steps:

1. Data sources

1. The ligand wGLP-1 (wild-type GLP-1) was obtained from the Protein Data Bank (PDB: 1D0R), while mGLP-1 was predicted using AlphaFold3. The receptor GLP-1R was derived from UniProtKB (P43220).
2. ACEi was predicted using AlphaFold3; ACE was sourced from UniProtKB (P12821).
3. 3-HB was built in Avogadro, while HCAR2 was predicted using AlphaFold3.

2. Docking of GLP-1 with DPP-4

We docked both mGLP-1 and wGLP-1 with DPP-4 using HDOCK to evaluate whether the modifications in mGLP-1 help it resist DPP-4 degradation. In the docking results (Figure 2), wGLP-1 interacted with DPP-4 at residues H1, S8, and I23, whereas mGLP-1 interacted with DPP-4 at residues Y13, A19, and F22. DPP-4 recognizes Ala8 as the cleavage site within GLP-1 (7-36), so the shift of the predicted contacts away from the N-terminal region suggests that the modifications in mGLP-1 alter the interactions near position 8.

Figure 2 (A) HDOCK-based docking of wGLP-1 with DPP-4; (B) HDOCK-based docking of mGLP-1 with DPP-4

3. Docking of mGLP-1 with GLP-1R

Similarly, we conducted molecular docking simulations of mGLP-1 and wGLP-1 with GLP-1R using the HDOCK algorithm to evaluate the binding efficacy of the modified mGLP-1.

The results revealed that wGLP-1 interacted through residues H1 and Q17 (Figure 3), while mGLP-1 interacted through residues S8 and Q9. Despite the altered sequence, the overall structure of mGLP-1 remained similar to that of wGLP-1, and the complex received a favorable docking score. The predicted binding between mGLP-1 and GLP-1R remained strong and stable. We therefore conclude that the mutant GLP-1 is predicted to retain, to a certain extent, its ability to bind and interact with GLP-1R.

Figure 3 HDOCK-based docking of wGLP-1 with GLP-1R(A); HDOCK-based docking of mGLP-1 with GLP-1R(B)

4. Docking of ACEi with ACE

Similarly, we docked ACEi with ACE using HDOCK to investigate the interaction between the inhibitor and the enzyme.
The results indicated strong predicted binding between ACEi and ACE (Figure 4), with a docking score of -296.14 (in HDOCK, a more negative score indicates a more favorable predicted binding mode). The docked pose also showed multiple interactions between ACEi and ACE, supporting a stable complex.

Figure 4 HDOCK-based docking of ACEi with ACE

5. Docking of 3-HB with HCAR2

Similarly, we used HDOCK to dock 3-HB with HCAR2 in order to investigate the interaction between 3-HB and the HCAR2 receptor. According to our docking results (Figure 5), the hydroxyl groups at positions 1 and 5 of 3-HB interact with the HCAR2 receptor.

Figure 5 HDOCK-based docking of 3-HB with HCAR2

6. Conclusions and outlook

Based on the protein structure predictions and molecular docking results described above, we can preliminarily conclude that the modification of mGLP-1 is promising: the docking results suggest that it reduces recognition by DPP-4 while maintaining efficient binding to GLP-1R. In addition, the docking of ACEi with ACE and of 3-HB with HCAR2 provides insight into how these molecules interact with their targets.

Moving forward, we will conduct further experimental validation to confirm these predictions and optimize the production processes for mGLP-1 and ACEi. Our aim is to achieve effective application of microbial therapeutics in treating diabetic hypertension. Additionally, we will continue exploring the potential applications of advanced tools such as AlphaFold3 and HDOCK in biopharmaceutical drug discovery and disease mechanism research.

Enzymatic modification of TesB

TesB is a medium-chain acyl-CoA thioesterase with broad substrate specificity that primarily catalyzes the hydrolysis of medium- and long-chain acyl-CoA substrates into free fatty acids and coenzyme A [6]. It also plays a crucial role in converting short-chain 3-hydroxybutyryl-CoA to the corresponding free acid (3-HB) [7], although the current yield is suboptimal. This project aims to increase the 3-HB yield through protein engineering, thereby improving production efficiency.

1. Data sources

The UniProt database provides basic information about TesB, whose crystal structure is deposited under PDB ID 1C8U. We downloaded the structural data and the amino acid sequence (NP_414986.1) from there.

2. Protein structure prediction

First, we obtained the crystal structure of TesB from the PDB. We then obtained a de novo predicted structure of TesB from AlphaFold. In the subsequent experiments, we used the structure produced by AlphaFold2 modeling (Figure 6).

Figure 6 1C8U-AF from AlphaFold database

3. Molecular docking

Discovery Studio (DS) is a molecular modeling platform for tasks such as molecular docking, molecular dynamics simulation, pharmacophore analysis, and QSAR modeling. Its CDOCKER method combines molecular dynamics with simulated annealing to refine docking poses.

To identify the binding pocket, we submitted the amino acid sequence of the target protein to NCBI's CDD tool and retrieved the key binding pocket residues.

The substrate 3-HB-CoA (CHEBI:57287) was obtained from the ChEBI database (https://www.ebi.ac.uk/chebi/init.do).

We used DS 2021 for flexible molecular docking and selected the residues within 5 Å of the substrate (Figure 7).
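
The 5 Å selection itself is a simple distance filter. The sketch below illustrates the idea outside of DS. The function name and the coordinates are hypothetical and made up for illustration; they are not taken from the TesB docking result.

```python
import math


def residues_within_cutoff(protein_atoms, ligand_atoms, cutoff=5.0):
    """Return sorted residue IDs that have at least one atom within `cutoff` Å
    of any ligand atom.

    protein_atoms: iterable of (residue_id, (x, y, z))
    ligand_atoms:  iterable of (x, y, z)
    """
    ligand_atoms = list(ligand_atoms)
    selected = set()
    for residue_id, coord in protein_atoms:
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            selected.add(residue_id)
    return sorted(selected)


# Hypothetical coordinates in Å, for illustration only.
protein = [
    ("RES1", (1.0, 0.0, 0.0)),
    ("RES2", (0.0, 4.0, 0.0)),
    ("RES3", (3.0, 3.0, 3.0)),
    ("RES4", (20.0, 0.0, 0.0)),
]
ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]

near = residues_within_cutoff(protein, ligand)
print(near)
```

A residue is kept as soon as any one of its atoms is close to any ligand atom, which is the usual meaning of "within 5 Å of the substrate".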

Figure 7 Docking results (left); yellow for residues in the 5Å range (right)

4. Molecular dynamics simulation

Molecular dynamics simulation is a crucial method for comprehending the physical properties of multiparticle systems, such as protein systems, by calculating the temporal evolution of these systems in accordance with Newton's laws of motion [9].

YASARA (http://www.yasara.org/) is a software package for scientific research [12] that provides tools for molecular visualization, modeling, and molecular dynamics simulation. It supports GPU acceleration and multiple force fields such as AMBER and CHARMM, and it can be customized through the YASARA macro language.

In this project, we used YASARA to run molecular dynamics simulations and to collect information on protein stability and on hydrogen bonding with the substrate over the course of the simulation, in order to guide the experiments (Figure 8).

Figure 8 Simulation of TesB protein dynamics

5. Mutation site selection

We adopted a hybrid stability design method that combines sequence and structure information, using PROSS for optimization. This approach uses Rosetta to design the entire protein while preserving the original state of the active or binding sites [10].

PROSS involves a two-step screening process. It first filters out the single-point mutations that are predicted to destabilize the protein, which narrows the range of potential mutations to those predicted to be stabilizing. Rosetta then selects the best combination of mutations from this refined library.

PROSS plays the central role in the stability design, while the Rosetta design step ensures the precision and effectiveness of the design. The flowchart in Figure 9 illustrates this process.

Figure 9 Rosetta design flow

5.1 PSSM Matrix Construction

The PSSM matrix (Position-Specific Scoring Matrix) is a crucial tool for analyzing and describing patterns of conservation within sequences. It is constructed by calculating the frequency of occurrence of different bases (or amino acids) at each position in the sequence, thereby revealing key conserved regions within the sequence.

The steps to construct a PSSM matrix on a Linux system are as follows:

Build NCBI BLAST: First, build the NCBI BLAST tool in the local environment, which serves as the foundation for sequence alignment.
Download the uniref90 database: Next, download the uniref90 database from a public database. The uniref90 is a widely used non-redundant protein sequence database that provides a rich source of data for subsequent sequence alignments.
Use the psiblast tool: Install and use the psiblast tool, which is based on the BLAST algorithm and generates a PSSM matrix through multiple rounds of searching and position-specific scoring.

The working principle of the psiblast tool: psiblast improves search accuracy through an iterative process. It first performs a round of BLAST search, then constructs a PSSM matrix based on the results, and uses this matrix for the next round of searching, continuing until a predetermined number of iterations is reached or a specific stopping condition is met.

Extracting PSSM scores: In the PSSM matrix generated by psiblast, only data with scores greater than or equal to 0 are extracted. These data represent positions with significant conservation in sequence alignments and are key information for constructing the initial mutation library.
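
The extraction step can be sketched as a small parser. This assumes the matrix was written by psiblast in its ASCII format (the `-out_ascii_pssm` option), with whitespace-separated columns in the order A R N D C Q E G H I L K M F P S T W Y V. The function name and the two sample rows are hypothetical, and skipping the wild-type residue is a design choice of this sketch rather than something stated in the text.

```python
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"  # column order of a psiblast ASCII PSSM


def tolerated_substitutions(pssm_lines, min_score=0):
    """Yield (position, wild_type, substitution, score) for scores >= min_score."""
    for line in pssm_lines:
        fields = line.split()
        # Data rows start with a position number and a one-letter residue,
        # followed by at least 20 log-odds scores. Everything else is skipped.
        if len(fields) < 22 or not fields[0].isdigit() or len(fields[1]) != 1:
            continue
        position, wild_type = int(fields[0]), fields[1]
        scores = [int(value) for value in fields[2:22]]
        for amino_acid, score in zip(AMINO_ACIDS, scores):
            # The wild-type residue is not a mutation, so it is left out.
            if amino_acid != wild_type and score >= min_score:
                yield position, wild_type, amino_acid, score


# Hypothetical rows in the psiblast layout (not taken from the TesB matrix).
sample = [
    "           A  R  N  D  C  Q  E  G  H  I  L  K  M  F  P  S  T  W  Y  V",
    "    1 M   -1 -2 -3 -4 -2 -1 -3 -3 -2  1  2 -2  6  0 -3 -2 -1 -2 -1  1",
    "    2 S    1 -1  0 -1 -1  0 -1  0 -1 -3 -3 -1 -2 -3 -1  4  1 -3 -2 -2",
]

library = list(tolerated_substitutions(sample))
for position, wild_type, substitution, score in library:
    print(f"{wild_type}{position}{substitution}\t{score}")
```

Each retained entry is a candidate substitution such as M1L, so the output can serve directly as the initial mutation library described above.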

Through this process, we can identify conserved regions in the sequence that are crucial for function, providing a foundation for subsequent mutation studies and protein engineering.

5.2 Fireprot Calculations

The PDB ID of the target protein crystal structure (1C8U) was submitted to FireProt, and global stability energy calculations were run with default parameters and without any restriction on site selection. The results are shown in Figure 11:

Figure 10 FireProt calculation workflow for the TesB protein

Figure 11 Results of the FireProt calculation for the TesB protein

5.3 DS Validation

To check the credibility of the FireProt data, we ran stability energy calculations on the FireProt results using the mutation energy function of the Macromolecules module in DS 2021, aiming to relate the changes in energy to structural stability (Table 2).

Table 2 Stability validation calculated with DS 2021

5.4 PROSS calculation

The protein structure and sequence information were prepared and submitted to PROSS. The results were then tabulated and split into single-point mutations to form candidate mutation libraries. Some of these results are presented in Table 3.

Table 3 Results of TesB protein PROSS calculations

5.5 Amino acid mutation site selection

Research conducted by Prof. Deng Yu's group on the catalytic mechanism and rational design of β-ketothiolase has unveiled a local cation domain design rule (LCDMR) [11]. By mutating non-conserved residues at the junction of the loop region and the α5 helix in the sandwich topology to histidine, it was observed that the Claisen condensation reaction is accelerated through three mechanisms: (1) increasing the volume of the cationic structural domain to provide a wider reaction space for the substrate; (2) enhancing hydrogen bonding interactions between surrounding residues and the substrate; and (3) functioning similarly to H348, which serves as an active center, thereby anchoring the substrate to stabilize tetrahedral intermediates. This principle can be generally applied to enhance enzymatic activity in other β-ketothiolases.

Following the modification scheme of Prof. Deng Yu's team, we targeted amino acids around D204, T228, and Q278, in the vicinity of the active site, using DS for the docking experiments. Specifically, candidate sites 31, 33, 34, 35, 37, 178, 205, 208, 219, 220, 226, and 227 were preferentially mutated to histidine or aspartic acid in order to increase hydrogen bonding with the substrate and expand the cavity size. Cavity size was calculated using Missense3D, with the results shown in Figure 12.

Figure 12 Missense3D cavity calculation results

5.6 Molecular docking validation

We used DS 2021 to dock the substrate against the candidate mutants and examined the changes in hydrogen bonding with the substrate after docking. The outcomes are shown in Figure 13, with F35D, R178H, and N205D as illustrative examples. Our primary focus was on the changes in interaction forces at the bond cleavage site of the substrate.

Figure 13 Molecular docking results for wild-type (WT) TesB and the F35D, R178H, and N205D mutants

5.7 YASARA Dynamics Simulation

We ran molecular dynamics simulations in YASARA to gather statistics on protein stability and substrate hydrogen bonding throughout the simulation, as well as parameters including RMSD, electrostatic interactions, and van der Waals interactions. The results are illustrated in Figure 14.

Figure 14 Screenshot of TesB protein mutant YASARA simulation

6. Concluding comments

We first obtained a suitable protein structure and carried out an initial sequence analysis of the target protein by constructing the PSSM. From the PSSM, we selected substitutions with a score greater than or equal to 0 as our initial mutation library. We then used FireProt and PROSS to identify sites that can enhance structural stability: in the FireProt results, reference sites were chosen based on a decrease in the FoldX energy, while in the PROSS results, reference sites were those that occurred more than twice. Structural analysis with PROSS also provided information on the catalytic sites and the residues within a 5 Å radius around them, which helped to screen and shrink the initial mutation library. Missense3D analyses were then performed on the remaining library sites to assess adverse effects on structural integrity and to eliminate mutations that cause drastic structural changes. Finally, molecular docking was performed on the remaining sites, followed by interaction analysis based on the catalytic reaction mechanism, to select suitable mutation sites for the final molecular dynamics simulations. During structural analysis, close attention to the details of the protein's catalytic mechanism is required; the selection criteria can be adjusted to the experimental requirements in order to choose a mutation scheme compatible with the targeted functional modification.

Modeling of growth and GLP-1, ACEi, and 3HB fusion protein expression of Zymomonas mobilis in an intestinal pH environment

Zymomonas mobilis is a Gram-negative bacterium that was initially discovered in 1924 in Mexico, specifically in tequila production. This bacterium exhibits the following distinctive characteristics:

Wide range of optimal growing conditions: It thrives well within a temperature range of 24~25 ℃ and pH value of 4.0~8.0.
High metabolic efficiency: It displays remarkable ethanol tolerance, rapid glycolysis rate, and minimal by-product formation during metabolism.
Excellent biosafety profile: Zymomonas mobilis is widely recognized as being safe for consumption (GRAS status).

In this study, Zymomonas mobilis was employed to express three drugs within the human body:

GLP-1 (Glucagon-like peptide-1): A hormone known for its blood glucose-lowering effects.
ACEi (Angiotensin-converting enzyme inhibitors): Medications used to treat cardiovascular diseases like hypertension.
3-Hydroxybutyrate (3-HB): A small ketone molecule with potential medicinal value in managing type II diabetes.

To achieve this, we designed a fusion protein named 4GLP-1-5ACEi by combining four copies of the GLP-1 coding sequence with five copies of the ACEi coding sequence, and introduced it into Zymomonas mobilis for expression.

The objective of this study was to establish a mathematical model of the growth pattern and fusion protein expression dynamics of Zymomonas mobilis, and to investigate the impact of fusion protein expression on bacterial growth by comparing two strains (ZMNP and ZM4) carrying the empty vector with the same strains expressing the fusion protein.

1. Experimental data

1.1 Description of data

ZMNP strain: endogenous plasmid was knocked out.

ZM4 strain: the endogenous plasmid was retained.

OD₆₀₀_empty: growth of the bacterium carrying the empty vector.

OD₆₀₀_4GLP-1-5ACEi: growth of the bacterium when expressing the fusion protein.

pH: constant 5.6.

1.2 List of data

(1) ZMNP strain data

Table 4 ZMNP strain data

(2) ZM4 strain data

Table 5 ZM4 strain data

2. Mathematical modeling

2.1 Experimental data

Experimental data will be obtained through quantification of bacterial count, GLP-1 expression and secretion levels, as well as precise pH measurements. The following data is required:

Time (hours): the time point for each experimental measurement;
Bacterial Count (CFU/mL): Zymomonas mobilis count measured by quantitative PCR or viable bacteria counting methods;
Protein Expression (Fluorescence Intensity): amount of expressed GLP-1 determined by quantitative PCR or flow cytometry;
Protein Secretion (Fluorescence Intensity): quantified GLP-1 secretion using anti-GLP-1 antibody combined with a fluorescent protein marker;
pH: growing environment's pH value at corresponding time points measured by a sensor.

2.2 Description of the model

2.2.1 Logistic Growth Model

The growth of Zymomonas mobilis is described by the logistic equation [9]:

$$N(t) = \frac{K}{1 + \frac{K - N_0}{N_0}\, e^{-rt}}$$

where:

N(t) represents the colony count at time t (CFU/mL).
K denotes the environmental carrying capacity, which serves as the upper limit of the bacterial population.
N₀ stands for the initial colony count.
r is the growth rate constant.

2.2.2 pH-dependent ordinary differential equations for suicide response

The response of the suicide switch depends on the ambient pH and can be described by the following ordinary differential equation:

$$\frac{dS(t)}{dt} = f(\mathrm{pH}(t)) - \lambda S(t)$$

where:

S(t) represents the strength of the suicide switch response.
f(pH(t)) denotes the pH modulation function of the suicide response.
λ is the decay rate constant of the suicide response.
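
A numerical sketch can make the behavior of such a response concrete. It assumes the ODE has the form dS/dt = f(pH(t)) - λ S(t), which is one reading of the symbol definitions above. The sigmoidal shape of `f_ph`, its threshold and steepness, and the decay rate used below are hypothetical placeholders, not fitted values.

```python
import math


def f_ph(ph, threshold=6.5, steepness=4.0):
    """Hypothetical pH modulation function: close to 0 below the threshold
    and close to 1 above it. The real switch may respond differently."""
    return 1.0 / (1.0 + math.exp(-steepness * (ph - threshold)))


def simulate_suicide_response(ph_of_t, decay_rate, t_end, dt=0.01, s0=0.0):
    """Forward-Euler integration of dS/dt = f(pH(t)) - decay_rate * S(t)."""
    steps = int(round(t_end / dt))
    s = s0
    trajectory = [(0.0, s)]
    for i in range(steps):
        t = i * dt
        s += dt * (f_ph(ph_of_t(t)) - decay_rate * s)
        trajectory.append(((i + 1) * dt, s))
    return trajectory


# Constant pH of 5.6, as in the experimental data described in this section.
traj = simulate_suicide_response(lambda t: 5.6, decay_rate=0.5, t_end=40.0)
steady_state = f_ph(5.6) / 0.5  # analytic limit f / lambda at constant pH
print(f"S(40 h) = {traj[-1][1]:.4f}, steady state = {steady_state:.4f}")
```

At constant pH the response relaxes toward f(pH)/λ, so the decay rate controls both how quickly the switch responds and how strong the response becomes.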

Figure 15 (A) Growth curves of Zymomonas mobilis over time for the ZMNP (A1) and ZM4 (A2) strains; (B) expression curves of Zymomonas mobilis over time for the ZMNP (B1) and ZM4 (B2) strains; (C) comparison of growth rates; (D) comparison of carrying capacities; (E) comparison of growth rates with standard errors

2.2.3 GLP-1 expression and secretion modeling

The expression and secretion of GLP-1 are modeled with Michaelis-Menten kinetics [10].

where:

v(t) represents the quantity of GLP-1 expressed or secreted at time t.
V_max denotes the maximum rate of expression.
K_m is the Michaelis constant, the bacterial concentration at which the rate reaches half its maximum value.
delta t is an attenuation term that accounts for the impact of the environment on GLP-1 degradation.
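
As an illustration of the saturation behavior these symbols describe, the sketch below evaluates a Michaelis-Menten-type rate as a function of bacterial concentration. Treating the attenuation as a subtracted term is an assumption of this sketch, since the exact form of the equation is not reproduced in this text, and all parameter values are hypothetical.

```python
def expression_rate(n, v_max, k_m, attenuation=0.0):
    """Michaelis-Menten-type rate in the bacterial concentration n.

    The attenuation is subtracted here, which is an assumption: the text
    defines an attenuation term but does not show how it enters the equation.
    """
    return v_max * n / (k_m + n) - attenuation


# Hypothetical parameters, for illustration only.
V_MAX, K_M = 10.0, 2.0
for n in (0.5, 2.0, 8.0, 32.0):
    print(f"N = {n:5.1f}  ->  v = {expression_rate(n, V_MAX, K_M):.2f}")
```

The rate equals half of V_max when the bacterial concentration equals K_m and levels off toward V_max at high concentrations, which is why expression is expected to plateau as the culture approaches its carrying capacity.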

The growth of the bacteria can be characterized by a logistic growth model expressed as follows:

$$N(t) = \frac{K}{1 + \frac{K - N_0}{N_0}\, e^{-rt}}$$

where:

N(t) represents the bacterial concentration at time t (OD₆₀₀),
K denotes the maximum carrying capacity of the environment (the upper limit of bacterial concentration),
N₀ represents the initial bacterial concentration,
r is the growth rate constant.

3. Effect of fusion protein expression on growth

Because expression of the fusion protein may affect both the growth rate and the maximum carrying capacity of the organism, we modified the model. For strains expressing the fusion protein, the growth model becomes:

$$N'(t) = \frac{K'}{1 + \frac{K' - N_0'}{N_0'}\, e^{-r't}}$$

where:

N'(t) represents the bacterial concentration at time t when the fusion protein is expressed.
K', N₀', and r' denote the corresponding parameters under conditions of fusion protein expression.

Model assumptions:

The growth of Zymomonas mobilis is modeled by the Logistic equation, which accounts for both environmental carrying capacity and initial colony size.
The activation of the suicide switch is governed by an ordinary differential equation that is influenced by pH levels.
GLP-1 expression and secretion follow Michaelis-Menten kinetics, where bacterial concentration and pH environment are contributing factors.
Prior to the activation of the suicide switch, there is a period of accelerated bacterial growth and increased protein secretion.

4. Data fitting and analysis

4.1 Data pre-processing

All time points were converted to hours, with the initial time point serving as the reference (t = 0).

4.2 Fitting using Python
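
The original fitting script is not reproduced in this text. The following is a minimal sketch of how such a fit could be done, assuming NumPy and SciPy are available and that a least-squares fit of the logistic curve with `scipy.optimize.curve_fit` is acceptable. The helper name `fit_logistic`, the starting guess, and the synthetic data are hypothetical; the data are placeholders and not the values from Tables 4 and 5.

```python
import numpy as np
from scipy.optimize import curve_fit


def logistic(t, K, r, N0):
    """Logistic growth curve with carrying capacity K, rate r, initial value N0."""
    return K / (1.0 + (K - N0) / N0 * np.exp(-r * t))


def fit_logistic(t, od):
    """Least-squares fit of the logistic curve.

    Returns the estimates (K, r, N0) and their standard errors, taken from
    the diagonal of the covariance matrix reported by curve_fit.
    """
    t = np.asarray(t, dtype=float)
    od = np.asarray(od, dtype=float)
    p0 = [od.max(), 0.2, max(od.min(), 1e-3)]            # rough starting guess
    bounds = ([0.0, 0.0, 1e-9], [np.inf, np.inf, np.inf])  # keep parameters positive
    popt, pcov = curve_fit(logistic, t, od, p0=p0, bounds=bounds)
    return popt, np.sqrt(np.diag(pcov))


# Synthetic, noise-free placeholder data (NOT the measured OD600 values).
t_demo = np.linspace(0.0, 48.0, 25)
od_demo = logistic(t_demo, 5.0, 0.25, 0.2)

params, std_errors = fit_logistic(t_demo, od_demo)
print("K, r, N0 =", params)
```

In practice, `t_demo` and `od_demo` would be replaced by the measured time points and OD₆₀₀ readings for each strain and condition, and the standard errors would feed into the comparisons of growth rate and carrying capacity.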

4.3 Fitting results

ZMNP Empty: K=5.5387033760640865, r=0.24982006276322158, N0=0.17302696136220702
ZMNP Fusion: K=5.058451214165505, r=0.19900791483294195, N0=0.67804958083725
ZM4 Empty: K=4.959759613565381, r=0.13668609447696523, N0=0.18975023544164973
ZM4 Fusion: K=4.850001037723017, r=0.8743368331567228, N0=3.12724295300515e-08
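
To make these parameters easier to interpret, the sketch below computes the time at which each fitted curve reaches half of its carrying capacity, t = ln((K - N0)/N0) / r. It assumes the logistic form N(t) = K / (1 + ((K - N0)/N0) e^{-rt}) and time measured in hours. The dictionary and function names are hypothetical, and the numbers are copied from the list above.

```python
import math

# Fitted parameters as reported above.
FITS = {
    "ZMNP Empty": {"K": 5.5387033760640865, "r": 0.24982006276322158, "N0": 0.17302696136220702},
    "ZMNP Fusion": {"K": 5.058451214165505, "r": 0.19900791483294195, "N0": 0.67804958083725},
    "ZM4 Empty": {"K": 4.959759613565381, "r": 0.13668609447696523, "N0": 0.18975023544164973},
    "ZM4 Fusion": {"K": 4.850001037723017, "r": 0.8743368331567228, "N0": 3.12724295300515e-08},
}


def logistic(t, K, r, N0):
    """Logistic growth curve N(t)."""
    return K / (1.0 + (K - N0) / N0 * math.exp(-r * t))


def time_to_half_capacity(K, r, N0):
    """Time at which N(t) = K / 2, the inflection point of the logistic curve."""
    return math.log((K - N0) / N0) / r


for name, p in FITS.items():
    t_half = time_to_half_capacity(**p)
    print(f"{name}: t_half = {t_half:.1f} h, N(t_half) = {logistic(t_half, **p):.3f}")
```

The fitted N0 for ZM4 Fusion is many orders of magnitude below the other initial values, which suggests that r and N0 are poorly constrained for that data set and that its growth rate should be interpreted with caution.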

(1) Fitting parameters for ZMNP strain

Table 6

(2) Fitting parameters for ZM4 strain

Table 7

4.4 Data visualization

(1) Growth curve of ZMNP strain (Figure 15 A1: Growth curve of ZMNP strain)
(2) Growth curve of ZM4 strain (Figure 15 A2: Growth curve of ZM4 strain)
(3) Comparison of growth rates (Figure 15 C: Comparison of growth rates and E: Comparison of growth rates with standard errors)

The growth rates of the ZMNP strain carrying the empty vector and expressing the fusion protein were compared: z = 0.45, p = 0.6514.
The growth rates of the ZM4 strain carrying the empty vector and expressing the fusion protein were also compared: z = -0.00, p = 0.9999.

(4) Comparison of maximum carrying capacity (Figure 15 D: Comparison of carrying capacities)

The maximum carrying capacities of ZMNP Empty and ZMNP Fusion differed significantly (z = 3.40, p = 0.0007).
On the other hand, there was no significant difference in maximum carrying capacity between ZM4 Empty and ZM4 Fusion (z = 0.78, p = 0.4377).
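
The reported z and p pairs are consistent with a two-sided test under a standard normal distribution. The sketch below shows one way such a comparison can be computed from two parameter estimates and their standard errors. Whether the original analysis used exactly this test is an assumption, and the function names and the standard errors in the example call are hypothetical.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a z statistic: 2 * (1 - Phi(|z|))."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def compare_estimates(a, se_a, b, se_b):
    """z-test for the difference between two independent estimates."""
    z = (a - b) / math.sqrt(se_a ** 2 + se_b ** 2)
    return z, two_sided_p(z)


# Reported z statistic for the ZMNP carrying capacities.
print(f"z = 3.40 -> p = {two_sided_p(3.40):.4f}")

# Hypothetical estimates and standard errors, for illustration only.
z, p = compare_estimates(2.0, 0.3, 1.0, 0.4)
print(f"z = {z:.2f}, p = {p:.4f}")
```

With this test, a difference is called significant at the 0.05 level only when |z| exceeds about 1.96, so a large change in a fitted parameter can still be non-significant if its standard error is large.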

5. Analysis of results

(1) ZMNP strain analysis:
The growth rate (r) of the strain expressing the fusion protein decreased from 0.250 to 0.199, but this decrease was not statistically significant (z = 0.45, p = 0.6514).
In contrast, the maximum carrying capacity (K) decreased from 5.539 to 5.058, and this difference was significant (z = 3.40, p = 0.0007).
Conclusion: Expression of the fusion protein significantly reduced the maximum carrying capacity of the ZMNP strain but had no significant effect on the growth rate.

(2) ZM4 strain analysis:
The strain expressing the fusion protein showed an apparent increase in growth rate, with r rising from 0.137 to 0.874; however, this difference was not statistically significant (z = -0.00, p = 0.9999) and may be attributed to parameter estimation error.
Additionally, there was a slight reduction in maximum carrying capacity, with K decreasing from 4.960 to 4.850, which was also not statistically significant (z = 0.78, p = 0.4377).
Conclusion: Expression of the fusion protein showed a tendency to increase the growth rate of the ZM4 strain; however, this effect was not statistically significant, and the maximum carrying capacity was not significantly affected.

(3) Comparison of ZMNP and ZM4 strains:
Growth rate differences:
The growth rate of the ZMNP strain was slightly reduced after expression of the fusion protein, but this difference was not statistically significant.
In contrast, the growth rate of the ZM4 strain appeared to increase upon expression of the fusion protein; however, the difference was not statistically significant, possibly because of parameter estimation error.
Difference in maximum carrying capacity:
The maximum carrying capacity of the ZMNP strain was significantly reduced when expressing the fusion protein, indicating an inhibitory effect on bacterial growth.
On the other hand, there was no significant change in maximum carrying capacity observed in the ZM4 strain upon fusion protein expression.
Plasmid effect:
In the ZMNP strain, in which the endogenous plasmids have been knocked out, expression of the fusion protein significantly reduced the maximum carrying capacity.
Conversely, in the ZM4 strain, which retains its endogenous plasmids, expression of the fusion protein had minimal impact on growth. This suggests that endogenous plasmids may help mitigate negative effects of fusion protein expression on bacterial growth.
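
The z statistics and p-values reported above can be related with a short calculation. The sketch below assumes a two-sided test against the standard normal distribution, with z formed as the difference of two independent parameter estimates divided by their combined standard error. The standard errors are not given in this section, so `wald_z` and its arguments are hypothetical illustrations rather than the analysis actually performed.

```python
import math


def two_sided_p(z: float) -> float:
    """Two-sided p-value for a standard-normal test statistic z."""
    # erfc(|z| / sqrt(2)) equals 2 * (1 - Phi(|z|)) for the standard normal CDF Phi.
    return math.erfc(abs(z) / math.sqrt(2.0))


def wald_z(est_a: float, est_b: float, se_a: float, se_b: float) -> float:
    """z statistic for the difference of two independent estimates.

    Hypothetical helper: se_a and se_b are the standard errors of the
    fitted parameters, which this section does not report.
    """
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)


# z values as quoted in the strain analyses (rounded to two decimals there).
for label, z in [("ZMNP r", 0.45), ("ZMNP K", 3.40), ("ZM4 r", 0.00), ("ZM4 K", 0.78)]:
    print(f"{label}: z = {z:.2f}, p = {two_sided_p(z):.4f}")
```

For z = 3.40 this gives p of about 0.0007, in line with the reported value. Small differences for the other comparisons are to be expected, because the z values are quoted to only two decimal places.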

6. Conclusion

In this study, we utilized a mathematical model to simulate the growth and dynamics of GLP-1 expression in Zymomonas mobilis under varying intestinal pH conditions. We incorporated a pH-dependent suicide switch response and validated the model's accuracy using experimental data.

1. The expression of the fusion protein significantly impacted the maximum carrying capacity of the ZMNP strain: In ZMNP strains with knocked-out endogenous plasmids, expressing the fusion protein led to a significant reduction in maximum carrying capacity (p = 0.0007), while not affecting the growth rate.

2. The effect of fusion protein expression on the growth of ZM4 strains was not statistically significant: In ZM4 strains that retained their endogenous plasmids, there was no significant impact on either growth rate or maximum carrying capacity due to fusion protein expression (p > 0.05).

3. The presence of endogenous plasmids may mitigate negative effects caused by fusion protein expression: It is possible that the endogenous plasmids regulate fusion protein expression, thereby preventing significant disruption to growth in ZM4 strains, which retain them.

7. Looking Ahead

In this study, we modeled the growth of Zymomonas mobilis to investigate the impact of fusion protein expression. Our findings revealed that the expression of fusion proteins had varying effects on different strains, which could be attributed to the presence of endogenous plasmids. Future research should further explore the relationship between endogenous plasmids and exogenous gene expression to optimize strain performance.

Subsequent studies will incorporate more experimental data and modeling enhancements for a more precise simulation of Zymomonas mobilis growth and GLP-1 expression and secretion. We intend to introduce diverse regulatory mechanisms to optimize Zymomonas mobilis growth conditions, particularly in complex environments where bacteria interact with hosts. With this model, future probiotic therapies can be developed with greater precision tailored to individual patients' physiological conditions.

Modeling of probiotic therapy based on particle swarm optimization and evolutionary algorithm

This module is designed to simulate the growth and GLP-1 expression of Zymomonas mobilis in patients based on particle swarm optimization (PSO) and evolutionary computational algorithms. Through this model, the amount of plasmid transfection and the initial dose of probiotics can be optimized to ensure the optimal GLP-1 expression in patients with different body weights, ages, genders and other individual differences. At the same time, the suicide switch mechanism simulates the colonization and decay of the bacteria in vivo, providing interpretable modeling results on how the bacteria can be reduced to a safe dose or even completely eliminated after the suicide switch is triggered.

1. Data sources

Personalized inputs based on the patient's actual physiological data, including weight, age, and gender, are utilized to ensure the model's individualization and clinical feasibility.
The experimental group provides data on bacterial population (Zymomonas mobilis), GLP-1 expression and secretion measured through quantitative PCR, flow cytometry, and fluorimetry.
Sensors measure pH values in different environments in conjunction with suicide switch experiments to regulate the timing of suicide switch activation.

2. Mission statement

1. Inputs: patient's weight, age, sex, plasmid transfection volume, initial Zymomonas mobilis concentration, and relevant physiological parameters.
2. Optimization objective: To determine the optimal quantity of plasmid transfection using particle swarm optimization and evolutionary computation techniques in order to achieve the desired therapeutic effect of GLP-1 expression prior to initiating the suicide switch; subsequently calculating colony dynamics through a bacterial reproduction model.
3. Outputs: optimal amount of plasmid transfection, probiotic dosage, decay time of the colony following activation of the suicide switch.
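
The inputs and outputs listed above could be carried through the optimization as two small records. This is only an illustrative sketch: the class names, field names, and units (kilograms, years, hours) are assumptions introduced here and are not specified in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PatientInputs:
    """Hypothetical container for the model inputs (units are assumed)."""
    weight_kg: float
    age_years: float
    sex: str
    plasmid_amount: float          # starting guess for the plasmid transfection amount
    initial_concentration: float   # initial Zymomonas mobilis concentration

    def __post_init__(self) -> None:
        # Reject physically meaningless values before they reach the optimizer.
        if self.weight_kg <= 0 or self.age_years < 0:
            raise ValueError("weight must be positive and age non-negative")
        if self.plasmid_amount < 0 or self.initial_concentration < 0:
            raise ValueError("amounts and concentrations must be non-negative")


@dataclass(frozen=True)
class TherapyPlan:
    """Hypothetical container for the model outputs."""
    plasmid_amount: float   # optimized plasmid transfection amount
    probiotic_dose: float   # recommended probiotic dosage
    decay_time_h: float     # colony decay time after the suicide switch fires
```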

3. Particle Swarm Optimization (PSO) and Evolutionary Computational Modeling

3.1. Particle Swarm Optimization (PSO) Algorithm

The Particle Swarm Optimization (PSO) algorithm is employed to optimize the plasmid transfection efficiency, and the particle's position and velocity are updated using the following formula.
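
The update formula itself is not reproduced in this text. For reference, the canonical PSO update is given below; whether this exact form and these symbols were used in the model is an assumption.

$$
v_i^{t+1} = w\,v_i^{t} + c_1 r_1 \left(p_i - x_i^{t}\right) + c_2 r_2 \left(g - x_i^{t}\right), \qquad x_i^{t+1} = x_i^{t} + v_i^{t+1}
$$

Here $x_i^{t}$ and $v_i^{t}$ are the position and velocity of particle $i$ at iteration $t$ (the position being a candidate plasmid transfection amount), $p_i$ is the best position found so far by particle $i$, $g$ is the best position found by the whole swarm, $w$ is the inertia weight, $c_1$ and $c_2$ are acceleration coefficients, and $r_1, r_2$ are random numbers drawn uniformly from $[0, 1]$.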

By minimizing the fitness function, the particle population gradually converges towards the optimal level of plasmid transfection.
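
The convergence described above can be illustrated with a minimal one-dimensional PSO. The sketch uses the canonical velocity and position update with inertia weight `w` and acceleration coefficients `c1` and `c2`. The saturating `expression` response, its parameters, the target level, and the search bounds are all hypothetical stand-ins; the actual fitness function of the model is not given in this text.

```python
import random


def expression(dose, e_max=1.0, k_half=2.0):
    """Toy saturating GLP-1 response to plasmid amount (assumed form)."""
    return e_max * dose / (k_half + dose)


def fitness(dose, target=0.6):
    """Squared distance between the toy response and an assumed target level."""
    return (expression(dose) - target) ** 2


def pso(fitness_fn, lower, upper, n_particles=20, n_iter=200,
        w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize fitness_fn on [lower, upper] with a basic global-best PSO."""
    rng = random.Random(seed)
    x = [rng.uniform(lower, upper) for _ in range(n_particles)]
    v = [0.0] * n_particles
    pbest = x[:]                                  # personal best positions
    pbest_val = [fitness_fn(xi) for xi in x]
    g_idx = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g_idx], pbest_val[g_idx]
    for _ in range(n_iter):
        for i in range(n_particles):
            r1, r2 = rng.random(), rng.random()
            # Velocity update: inertia + pull to personal best + pull to global best.
            v[i] = w * v[i] + c1 * r1 * (pbest[i] - x[i]) + c2 * r2 * (gbest - x[i])
            # Position update, clamped to the search bounds.
            x[i] = min(max(x[i] + v[i], lower), upper)
            val = fitness_fn(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i], val
                if val < gbest_val:
                    gbest, gbest_val = x[i], val
    return gbest, gbest_val


best_dose, best_val = pso(fitness, 0.0, 10.0)
print(f"best dose = {best_dose:.3f}, fitness = {best_val:.2e}")
```

With these toy parameters the target level is reached at a dose of 3.0, so the swarm should settle close to that value. In a real application the fitness function would compare simulated GLP-1 expression for a given patient against the desired therapeutic level.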

3.2. Reproduction operators in evolutionary computation

The evolutionary calculations employ crossover and mutation operations to simulate the organism's reproduction and genetic variation in diverse environments.
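
As an illustration of these two operators, the sketch below applies arithmetic (blend) crossover and Gaussian mutation to real-valued candidates, for example a pair of plasmid amount and initial probiotic dose. The specific operator variants, the mutation rate, and the mutation scale are assumptions; the text does not state which forms were used.

```python
import random


def crossover(parent_a, parent_b, rng):
    """Arithmetic crossover: each child gene is a weighted mean of the parents' genes."""
    alpha = rng.random()
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(parent_a, parent_b)]


def mutate(individual, bounds, rng, rate=0.2, sigma=0.1):
    """Gaussian mutation: perturb each gene with probability `rate`, then clamp to bounds."""
    mutated = []
    for gene, (low, high) in zip(individual, bounds):
        if rng.random() < rate:
            # Noise scale is a fraction of the allowed range for that gene.
            gene += rng.gauss(0.0, sigma * (high - low))
        mutated.append(min(max(gene, low), high))
    return mutated


rng = random.Random(1)
bounds = [(0.0, 10.0), (1e6, 1e10)]        # hypothetical ranges: plasmid amount, dose
parent_a, parent_b = [2.0, 1e8], [6.0, 5e9]
child = mutate(crossover(parent_a, parent_b, rng), bounds, rng)
print(child)
```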

The equation for the decay of the colony population after the activation of the suicide switch is combined with the response model.
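
The decay equation is not reproduced in this text. A minimal sketch, assuming simple first-order killing after the switch is triggered, N(t) = N0 · exp(−k_d · t), shows how the time needed to reach a safe bacterial level could be computed. The rate constant `k_death`, the starting population, and the safe threshold below are hypothetical values chosen only for illustration.

```python
import math


def population_after_switch(n0, k_death, t):
    """Population at time t after the suicide switch fires (assumed first-order decay)."""
    return n0 * math.exp(-k_death * t)


def time_to_threshold(n0, n_safe, k_death):
    """Time for the population to fall from n0 to n_safe under first-order decay."""
    if k_death <= 0 or n0 <= 0 or n_safe <= 0:
        raise ValueError("n0, n_safe and k_death must be positive")
    if n0 <= n_safe:
        return 0.0  # already at or below the safe level
    return math.log(n0 / n_safe) / k_death


# Hypothetical example: 1e9 cells, safe level 1e3 cells, k_death = 0.5 per hour.
t_safe = time_to_threshold(1e9, 1e3, 0.5)
print(f"time to safe level: {t_safe:.2f} h")
```

Under this assumption the decay time depends only on the ratio of the starting population to the safe level and on the killing rate, so a more sensitive switch (larger `k_death`) shortens the clearance time proportionally.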

4. Conclusion

The integration of PSO and evolutionary computation facilitates the modeling of optimal plasmid transfection and probiotic dosage, taking into account the patient's physiological parameters such as weight, age, and gender. Additionally, this model effectively simulates the decay of bacteria following activation of the suicide switch, aiding in determining the duration required for reducing bacterial levels to a safe dose or complete elimination.

5. Outlook

In the future, additional optimization algorithms such as genetic algorithms or simulated annealing algorithms can be incorporated to enhance both the accuracy and computational efficiency of the model. Furthermore, integrating more biological experimental data and clinical trial data with the model will further augment its reliability and interpretability. Moreover, extending the applicability of this model to other scenarios involving protein expression and metabolite regulation holds promise for decoupling complex systems and designing engineered bacteria in synthetic biology.

Software

1. Introduction to AlphaFold

The introduction of AlphaFold by DeepMind in 2018 marked a pivotal moment in the field, while the release of AlphaFold 3 in May 2024 represented a significant advancement. This progress can be attributed to substantial enhancements in network architecture and training procedures, enabling the representation of more diverse chemical structures and enhancing learning efficiency from available data. Additionally, AlphaFold 3 incorporates a diffusion model for predicting atom coordinates, replacing the amino acid-level structure module utilized in AlphaFold 2 [13].

Building upon AlphaFold 1, the DeepMind team further optimized the algorithm and introduced AlphaFold 2, which not only achieved remarkable improvements in prediction accuracy but also demonstrated exceptional speed, predicting protein structures within hours. In the CASP14 competition held in 2020, AlphaFold 2 outperformed all other participants, achieving a median GDT score above 90 (on a scale of 0 to 100).

Continuing its predecessor's swift prediction capabilities, AlphaFold 3 significantly expands its scope to encompass proteins, DNA, RNA, small molecule ligands, and more, thereby providing a powerful tool for life science research and drug design. Moreover, this tool is publicly accessible through the AlphaFold Server, lowering barriers to scientific research and fostering global advancements in biological studies [14].

2. Introduction to HDOCK

In the field of drug design and molecular simulation, computer-aided molecular docking technology plays a pivotal role in predicting the binding modes of drugs and target proteins through simulating intermolecular interactions, evaluating the binding ability of candidate compounds, and providing guidance for further experiments. HDOCK, as an advanced molecular docking software, has emerged as one of the indispensable tools in life science disciplines due to its efficient and accurate docking algorithm [15].

HDOCK is a molecular docking method that combines template-based modeling with ab initio free docking to predict binding modes in protein-protein and protein-DNA/RNA complexes [16]. The software samples candidate binding modes through a global search and assesses the stability and affinity of the resulting complex structures by scoring each of them.

In HDOCK, intermolecular interactions are evaluated with a docking score. Its scoring function accounts for factors such as hydrogen bonding, van der Waals forces, electrostatic interactions, and hydrophobic effects, and a more negative score indicates a more favorable predicted binding mode [16].

3. Introduction to Avogadro

Avogadro is a versatile open-source molecular editor with applications in chemistry, molecular modeling, bioinformatics, materials science, and related fields. It offers robust capabilities for molecular visualization, modeling, analysis, and data processing on Linux, macOS, and Windows. It supports diverse chemical data formats and application packages and provides an intuitive interface suitable for beginners. The program follows modern programming practices, with well-partitioned source code and a responsive rendering engine for fast and stable performance. With its C++-based, plugin-oriented design, Avogadro 2 is highly extensible [17].

Our team used Avogadro to draw the three-dimensional chemical structure of 3-HB and to visualize its properties. This enabled us to explore the structural characteristics of 3-HB and to gain insight into its physical and chemical nature.

4. Introduction to PyMOL

PyMOL is a versatile molecular visualization tool widely used in biochemistry, structural biology, and computational biology. It supports Linux, macOS, and Windows and displays the three-dimensional structures of biomolecules such as proteins, nucleic acids, and small molecules. With its range of visualization options, including colors, labels, and shading techniques, it offers a comprehensive display and analysis platform. Additionally, PyMOL has a built-in command language and supports Python scripting. This allows users to automate complex tasks such as molecular structure comparison and analysis of molecular dynamics simulation results, enhancing the software's flexibility and extensibility [18].

The team used PyMOL to visualize the produced proteins, which facilitated the study of structural domains and of protein docking.

5. Introduction to YASARA

YASARA is widely used in structural biology for homology modeling of protein structures and complexes. It employs sequence alignment and modeling algorithms, using known protein structures as templates to predict the structure of target proteins. This versatile tool finds applications in drug design, protein engineering, and education [19].

6. Introduction to Discovery Studio

Discovery Studio (DS) is a comprehensive software suite for molecular modeling and simulation, extensively used in life science domains such as drug design, protein engineering, structural biology, and bioinformatics. Its key functionalities encompass protein structure prediction, molecular docking, molecular dynamics simulation, pharmacophore modeling, quantitative structure-activity relationship (QSAR) analysis, and other advanced computational tasks [20].

References

[1]. Zhang, Y., Li, Z., Liu, X. et al. 3-Hydroxybutyrate ameliorates insulin resistance by inhibiting PPARγ Ser273 phosphorylation in type 2 diabetic mice. Sig Transduct Target Ther 8, 190 (2023). https://doi.org/10.1038/s41392-023-01415-6

[2]. Naggert J, Narasimhan ML, DeVeaux L, et al. Cloning, sequencing, and characterization of Escherichia coli thioesterase II [J]. J Biol Chem, 1991, 266(17):11044-50.

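[3]. Mu T. Study on the mechanism of action of goat milk antihypertensive peptides and preparation of their microcapsules [D] (in Chinese). Shaanxi Normal University, 2022. DOI:10.27292/d.cnki.gsxfu.2022.000369. Maruyama, S., & Suzuki, H. (1982).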

[4]. Du X. Preparation, identification, and hypoglycemic mechanism of goat milk protein peptides [D] (in Chinese). Jiangnan University, 2023. DOI:10.27169/d.cnki.gwqgu.2023.002568

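[5]. Pan Q, Guo L. Development history and clinical application progress of glucagon-like peptide-1 receptor agonists [J] (in Chinese). Chinese Journal of Diabetes Mellitus, 2022, 14(12): 1355-1363. DOI: 10.3760/cma.j.cn115791-20220802-00375.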

[6]. Liu Q, Ouyang SP, Chung A, et al. Microbial production of R-3-hydroxybutyric acid by recombinant E. coli harboring genes of phbA, phbB, and tesB [J]. Appl Microbiol Biot, 2007, 76(4):811-8.

[7]. Baggio LL, Drucker DJ. Biology of incretins: GLP-1 and GIP. Gastroenterology. 2007 May;132(6):2131-57. doi: 10.1053/j.gastro.2007.03.054. PMID: 17498508.

[8]. Abramson J, Adler J, Dunger J, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3[J]. Nature, 2024: 1-3.

[9]. Hollingsworth SA, Dror RO. Molecular dynamics simulation for all [J]. Neuron, 2018, 99(6):1129-1143.

[10]. Goldenzweig A, Goldsmith M, Hill SE, et al. Automated structure and sequence-based design of proteins for high bacterial expression and stability [J]. Mol Cell, 2016, 63(2):337-46.

[11]. Liu L, Zhou S, Deng Y. Rational design of the substrate tunnel of β-ketothiolase reveals a local cationic domain modulated rule that improves the efficiency of Claisen condensation [J]. ACS Catal, 2023, 13(12): 8183-8194.

[12]. Ozvoldik K, Stockner T, Krieger E. YASARA model-Interactive molecular modeling from two dimensions to virtual realities [J]. J Chem Inf Model, 2023, 63(20):6177-6182.

[13]. Jumper J, Evans R, Pritzel A, et al. Highly accurate protein structure prediction with AlphaFold [J]. Nature, 2021, 596(7873): 583-589.

[14]. Abramson J, Adler J, Dunger J, et al. Accurate structure prediction of biomolecular interactions with AlphaFold 3[J]. Nature, 2024: 1-3.

[15]. Yan Y, Tao H, He J, et al. The HDOCK server for integrated protein-protein docking [J]. Nature Protocols, 2020, 15(Suppl 25):1-24. DOI: 10.1038/s41596-020-0312-x.

[16]. Yan Y, Tao H, He J, et al. The HDOCK server for integrated protein-protein docking [J]. Nature Protocols, 2020, 15(Suppl 25):1-24. DOI: 10.1038/s41596-020-0312-x.

[17]. Hanwell, M.D., Curtis, D.E., Lonie, D.C. et al. Avogadro: an advanced semantic chemical editor, visualization, and analysis platform. J Cheminform 4, 17 (2012). https://doi.org/10.1186/1758-2946-4-17

[18]. Lane, C., et al. (2017). The PyMOL Molecular Graphics System, version 2.0. Schrödinger, LLC

[19]. Waterhouse A, Bertoni M, Bienert S, Studer G, Tauriello G, Gumienny R, Heer FT, de Beer TAP, Rempfer C, Bordoli L, Lepore R, Schwede T. SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res. 2018 Jul 2;46(W1):W296-W303. doi: 10.1093/nar/gky427. PMID: 29788355. PMCID: PMC6030848.

[20]. Pawar SS, Rohane SH. Review on Discovery Studio: An important Tool for Molecular Docking [J]. Asian Journal of Research in Chemistry, 2021, 14(1): 86-88. DOI:10.5958/0974-4150.2021.00014.6