
Overview

Mathematical modeling and computer simulation play a pivotal role in the design of synthetic biology. They enable the simulation of real experimental conditions through mathematical formulas to guide experimental design, validate results, and simulate experiments that cannot be performed in wet labs. In our project, we modeled three components of the plasmid separately.

In the ROS (Reactive Oxygen Species) model, we simulated the changes in ROS concentration during an IBD (Inflammatory Bowel Disease) flare-up within the human body and the activation levels of different promoter sequences. This helped the wet lab group select the appropriate promoter sequences. To explore the micro-characteristics of promoter activation, we first predicted specific binding sites based on motifs, and then used molecular docking to verify the binding affinity.

In the adhesion protein model, we utilized cellular automata to simulate the diffusion and colonization of our engineered bacteria in the intestinal tract, as well as their influence on the colonization of probiotics and pathogenic bacteria. Such simulations can aid in experimental design and mimic the dynamic changes of microbial communities within the real intestine.

For the anti-inflammatory peptide model, we employed an ABM (Agent-Based Model) to elucidate a multifactorial and multiprocess immune response. Additionally, we used protein function evolution techniques to optimize the anti-inflammatory efficacy of our peptides and reduce cytotoxicity, providing insights for subsequent designs.

[Image 2]

The Modeling of Reactive Oxygen Species (ROS) Promoters


1. Description


In our project, we conducted an in-depth analysis through modeling to study the concentration of reactive oxygen species (ROS) during enteritis and its impact on the expression levels of specific promoters. In this model, we simulated the dynamic changes in ROS concentration during enteritis based on both exogenous and endogenous factors. Building upon this model, we then considered three key influencing factors—ROS concentration, temperature, and pH—to simulate the expression levels of the three promoters we selected during the course of enteritis. Finally, we plotted the promoter activity throughout the entire enteritis process.

Specifically, we modeled the process of immune cell activation and the suppression of antioxidant enzyme expression by pro-inflammatory cytokines, using modified exponential and logistic functions to simulate the changes in ROS concentration. Additionally, we modeled the fluctuations in temperature and pH to better capture the influence of these external conditions on promoter expression. Ultimately, the model predicted response curves for the three promoters in relation to changes in ROS, temperature, and pH during the course of enteritis, providing a strong basis for further experimental validation.


2. Hypotheses


(1) During inflammatory bowel disease (IBD), ROS concentration is influenced by both endogenous and exogenous factors. We considered two endogenous factors: the activation of intestinal immune cells, which produce large amounts of ROS, and the suppression of antioxidant enzyme expression by pro-inflammatory cytokines such as tumor necrosis factor-α (TNF-α), interleukin-6 (IL-6), and interleukin-1β (IL-1β), which raises ROS concentration. As exogenous factors, we considered the effects of temperature and pH on ROS concentration.

(2) The activation of immune cells, such as neutrophils and macrophages, directly increases ROS production. This process is represented by a saturating function \( P_{\mathrm{immune}}(t) \).

(3) Pro-inflammatory cytokines such as TNF-α, IL-6, and IL-1β increase ROS concentration by inhibiting the expression of antioxidant enzymes, thereby reducing ROS clearance. This process is represented by the function \( P_{\mathrm{cytokine}}(t) \), which is linear in the cytokine concentration.


3. Parameters of Model 1


| Parameter | Value | Description | Reference |
| --- | --- | --- | --- |
| \( \alpha_{\mathrm{immune}} \) | 5 | The rate of ROS production induced by immune cell activation | [1] |
| \( \beta_{\mathrm{immune}} \) | 1 | The rate of increase in ROS production during immune cell activation | assumption |
| \( P_{\mathrm{max}} \) | 20 | The maximum rate of ROS production during immune cell activation | assumption |
| \( \alpha_{\mathrm{cytokine}} \) | 2 | The proportional constant of the rate of ROS production induced by pro-inflammatory cytokines | [2] |
| \( E_{a} \) | 5000 | The activation energy for the effect of temperature on ROS production | [3] |
| \( T_{0} \) | 310 | Initial temperature (K) | assumption |
| \( R_{0} \) | 0 | Initial ROS concentration | assumption |
| \( \Delta T_{\max} \) | 2 | The maximum amplitude of temperature variation | assumption |
| \( \gamma \) | 0.1 | The regulatory constant for the effect of pH on ROS production | [4] |
| \( \mathrm{pH}_0 \) | 7.4 | Initial pH value | assumption |
| \( \Delta\mathrm{pH}_{\max} \) | 0.5 | The maximum amplitude of pH variation | assumption |
| \( \omega_{\mathrm{pH}} \) | 0.2 | The frequency of pH fluctuations | assumption |
| \( \kappa_{\mathrm{pH}} \) | 0.1 | The decay rate of pH variation | assumption |
| \( k_{2} \) | 0.1 | The rate constant of natural ROS clearance | assumption |
| \( \kappa_{T} \) | 0.3 | The rate constant of temperature variation | assumption |

4. Notation for Model 1


\( R(t) \): The ROS concentration at time t.
\( P_{\mathrm{immune}}(t) \): The rate of ROS production induced by immune cell activation.
\( P_{\mathrm{cytokine}}(t) \): The rate of ROS increase due to the inhibition of antioxidant enzymes by pro-inflammatory cytokines.
\( f_T(T(t)) \): The function describing the effect of temperature on ROS concentration.
\( f_{\mathrm{pH}}(\mathrm{pH}(t)) \): The function describing the effect of pH on ROS concentration.
\( \alpha_{i} \): The baseline activity of promoter i.
\( f_{R,i}(R(t)) \): The response function of promoter i to ROS concentration.
\( f_{T,i}(T(t)) \): The response function of promoter i to temperature.
\( f_{\text{pH},i}(\mathrm{pH}(t)) \): The response function of promoter i to pH.
\( C(t) \): The concentration of pro-inflammatory cytokines at time t.

5. Equations


By integrating the effects of both endogenous and exogenous factors on ROS concentration, the rate of change in ROS concentration can be described by the following ordinary differential equation:

\[ \frac{dR(t)}{dt}=(P_{\mathrm{immune}}(t)+P_{\mathrm{cytokine}}(t))\cdot f_T(T(t))\cdot f_{\mathrm{pH}}(\mathrm{pH}(t))-k_2\cdot R(t) \]

Modeling of endogenous influencing factors:

Here, \( P_{\mathrm{immune}}(t) \) represents the rate of ROS production due to immune cell activation. This rate typically rises with the inflammatory response, as activated immune cells (such as neutrophils and macrophages) produce large amounts of ROS to combat infections and damaged tissues. However, high ROS concentrations can inhibit the activity of immune cells, which prevents an unlimited increase. As a result, the rate of ROS production due to immune cell activation initially grows approximately exponentially and then levels off.

We used modified exponential and logistic functions to simulate this process:

\[ P_{\mathrm{immune}}(t)=\frac{\alpha_{\mathrm{immune}}\cdot e^{\beta_{\mathrm{immune}}t}}{1+\frac{\alpha_{\mathrm{immune}}}{P_{\mathrm{max}}}\cdot(e^{\beta_{\mathrm{immune}}t}-1)} \]

The modified exponential function

\[ P_{\mathrm{immune}}(t)=\frac{P_{\max}}{1+e^{-\gamma(t-t_0)}} \]

Logistic function (here \( \gamma \) sets the steepness of the curve and \( t_0 \) is its midpoint time)

We plotted both curves for comparison:

[Figure: modified exponential and logistic curves of \( P_{\mathrm{immune}}(t) \)]

We found that the modified exponential function better fits the actual situation: the growth rate is faster, and it reaches a stable state in a shorter period of time.
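The comparison between the two candidate forms of \( P_{\mathrm{immune}}(t) \) can be reproduced with a minimal Python sketch. The values of \( \alpha_{\mathrm{immune}} \), \( \beta_{\mathrm{immune}} \), and \( P_{\mathrm{max}} \) come from the parameter table. The logistic steepness and midpoint are not given in the text, so the values used here (`steepness=1.0`, `midpoint=3.0`) are illustrative assumptions, and the helper `time_to_fraction` is a hypothetical name introduced only for this example.

```python
import math

ALPHA_IMMUNE = 5.0  # alpha_immune from the parameter table
BETA_IMMUNE = 1.0   # beta_immune from the parameter table
P_MAX = 20.0        # P_max from the parameter table


def p_immune_modified_exp(t):
    """Modified exponential form: starts at alpha_immune, saturates at P_max."""
    growth = math.exp(BETA_IMMUNE * t)
    return ALPHA_IMMUNE * growth / (1.0 + (ALPHA_IMMUNE / P_MAX) * (growth - 1.0))


def p_immune_logistic(t, steepness=1.0, midpoint=3.0):
    """Logistic form; steepness and midpoint are assumed, not from the source."""
    return P_MAX / (1.0 + math.exp(-steepness * (t - midpoint)))


def time_to_fraction(curve, fraction=0.95, dt=0.01, t_end=50.0):
    """First grid time at which curve(t) reaches fraction * P_MAX, else None."""
    for k in range(int(t_end / dt) + 1):
        t = k * dt
        if curve(t) >= fraction * P_MAX:
            return t
    return None


for t in (0, 1, 2, 4, 8):
    print(t, round(p_immune_modified_exp(t), 3), round(p_immune_logistic(t), 3))
print("time to 95% of P_max:",
      time_to_fraction(p_immune_modified_exp),
      time_to_fraction(p_immune_logistic))
```

Which curve saturates first depends on the assumed logistic parameters, so the printed times illustrate the comparison rather than reproduce the team's figure.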

Next, assuming that the concentration of pro-inflammatory cytokines changes over time, the inhibitory effect on antioxidant enzymes is represented by a linear relationship:

\[ P_{\mathrm{cytokine}}(t)=\alpha_{\mathrm{cytokine}}\cdot C(t) \]

Modeling of exogenous influencing factors:

1. The function representing the effect of temperature, \( f_T\left(T\left(t\right)\right) \)

The effect of temperature on ROS production can be represented using the Arrhenius equation:

\[ f_T(T(t))=e^{-\frac{E_a}{RT(t)}} \]

Where \( E_a \) is the activation energy and \( R \) is the gas constant (not to be confused with the ROS concentration \( R(t) \)).

2. The function representing the effect of pH on ROS production, \( f_{\mathrm{pH}}(\mathrm{pH}(t)) \)

We assume the effect of pH on ROS concentration can be represented by a quadratic function that is largest at the optimal pH and decreases as the pH moves away from it:

\[ f_{\mathrm{pH}}(\mathrm{pH}(t))=1-\gamma\cdot(\mathrm{pH}(t)-\mathrm{pH}_{\mathrm{opt}})^2 \]

Where \( \gamma \) is the adjustment coefficient, and \( \mathrm{pH}_{\mathrm{opt}} \) represents the optimal pH value.
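To get a feel for the size of the two exogenous factors, the sketch below evaluates them at a few points. It assumes that \( E_a \) is expressed in J/mol (so that the gas constant is 8.314 J/(mol·K)) and that \( \mathrm{pH}_{\mathrm{opt}} \) equals the initial pH of 7.4; neither is stated in the text.

```python
import math

GAS_CONSTANT = 8.314  # J/(mol*K); assumes E_a is given in J/mol
E_A = 5000.0          # E_a from the parameter table (units assumed)
GAMMA = 0.1           # gamma from the parameter table
PH_OPT = 7.4          # assumption: pH_opt equals pH_0


def f_temperature(temp_kelvin):
    """Arrhenius factor exp(-E_a / (R * T))."""
    return math.exp(-E_A / (GAS_CONSTANT * temp_kelvin))


def f_ph(ph):
    """Quadratic pH factor, equal to 1 at PH_OPT and smaller elsewhere."""
    return 1.0 - GAMMA * (ph - PH_OPT) ** 2


for temp in (310.0, 311.0, 312.0):
    print("T =", temp, "f_T =", f_temperature(temp))
for ph in (6.9, 7.4, 7.9):
    print("pH =", ph, "f_pH =", f_ph(ph))
```

Under these assumptions both factors change only slightly over the 2 K and 0.5 pH-unit ranges listed in the parameter table, so the ROS dynamics are driven mainly by the endogenous production terms.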

The final model equation:

\[ \frac{dR(t)}{dt}=\left(\frac{\alpha_{\mathrm{immune}}\cdot e^{\beta_{\mathrm{immune}}t}}{1+\frac{\alpha_{\mathrm{immune}}}{P_{\mathrm{max}}}\cdot(e^{\beta_{\mathrm{immune}}t}-1)}+\alpha_{\mathrm{cytokine}}\cdot C(t)\right)\cdot e^{-\frac{E_a}{RT(t)}}\cdot\left(1-\gamma\cdot\left(\mathrm{pH}(t)-\mathrm{pH}_{\mathrm{opt}}\right)^2\right)-k_2\cdot R(t) \]

Equation (2): Simulation of the expression of the dps, katG, and gorA promoters during the course of enteritis:

During the course of enteritis, the main factors we consider are the dynamic changes in ROS concentration, temperature, and pH. The ROS dynamics have already been modeled in Equation (1), so only temperature and pH remain to be modeled.

Temperature Modeling:

Temperature during IBD may be affected by fever or localized inflammation. We assume the temperature variation over time follows the model below:

1. The initial temperature \( T_0 \) is the normal human body temperature (approximately 310 K).

2. As inflammation progresses, the temperature will rise and gradually stabilize.

The temperature can be represented by a model of asymptotic variation:

\[ T(t)=T_0+\Delta T_{\max}\cdot\left(1-e^{-\kappa_Tt}\right) \]

pH Modeling: The pH of the intestine fluctuates during the course of IBD.

\[ \mathrm{pH}(t)=\mathrm{pH}_0+\Delta\mathrm{pH}_{\max}\cdot\sin(\omega_\text{pH}t)\cdot e^{-\kappa_\text{pH}t} \]
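Combining the ROS equation with the temperature and pH models above, the full system can be integrated numerically. The following is a minimal sketch using a hand-written fourth-order Runge-Kutta step and the values from the parameter table. Three things are assumptions that do not come from the text: the cytokine profile \( C(t) \) (a hypothetical pulse peaking at 1 when t = 4), the gas constant value with \( E_a \) taken in J/mol, and \( \mathrm{pH}_{\mathrm{opt}} = \mathrm{pH}_0 \).

```python
import math

# Values from the parameter table
ALPHA_IMMUNE, BETA_IMMUNE, P_MAX = 5.0, 1.0, 20.0
ALPHA_CYTOKINE = 2.0
E_A, T_0, DELTA_T_MAX, KAPPA_T = 5000.0, 310.0, 2.0, 0.3
GAMMA, PH_0, DELTA_PH_MAX = 0.1, 7.4, 0.5
OMEGA_PH, KAPPA_PH = 0.2, 0.1
K_2, R_0 = 0.1, 0.0

# Assumptions (not given in the text)
GAS_CONSTANT = 8.314  # J/(mol*K), assumes E_a is in J/mol
PH_OPT = PH_0         # optimal pH taken equal to the initial pH


def cytokine(t):
    """Hypothetical cytokine pulse C(t): rises, peaks at 1 when t = 4, decays."""
    return (t / 4.0) * math.exp(1.0 - t / 4.0)


def temperature(t):
    """T(t) = T_0 + dT_max * (1 - exp(-kappa_T * t))."""
    return T_0 + DELTA_T_MAX * (1.0 - math.exp(-KAPPA_T * t))


def ph(t):
    """pH(t) = pH_0 + dpH_max * sin(omega * t) * exp(-kappa_pH * t)."""
    return PH_0 + DELTA_PH_MAX * math.sin(OMEGA_PH * t) * math.exp(-KAPPA_PH * t)


def p_immune(t):
    """Modified exponential immune term."""
    growth = math.exp(BETA_IMMUNE * t)
    return ALPHA_IMMUNE * growth / (1.0 + (ALPHA_IMMUNE / P_MAX) * (growth - 1.0))


def ros_rate(t, r):
    """Right-hand side dR/dt of the ROS equation."""
    production = p_immune(t) + ALPHA_CYTOKINE * cytokine(t)
    f_t = math.exp(-E_A / (GAS_CONSTANT * temperature(t)))
    f_ph = 1.0 - GAMMA * (ph(t) - PH_OPT) ** 2
    return production * f_t * f_ph - K_2 * r


def simulate_ros(t_end=20.0, dt=0.01):
    """Integrate R(t) from R_0 with classical RK4; returns (times, values)."""
    times, values = [0.0], [R_0]
    r = R_0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        s1 = ros_rate(t, r)
        s2 = ros_rate(t + dt / 2, r + dt * s1 / 2)
        s3 = ros_rate(t + dt / 2, r + dt * s2 / 2)
        s4 = ros_rate(t + dt, r + dt * s3)
        r += dt * (s1 + 2 * s2 + 2 * s3 + s4) / 6
        times.append(t + dt)
        values.append(r)
    return times, values


times, ros = simulate_ros()
print("R at t = 20:", ros[-1])
```

The resulting `ros` series, together with `temperature(t)` and `ph(t)`, provides the inputs needed by the promoter response functions defined next.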

Promoter Activity and Selection Modeling:

Assume that the sensitivity of each promoter to ROS concentration, temperature, and pH can be represented by a response function.

The response function for promoter \( i \) is written \( A_i(R(t),T(t),\mathrm{pH}(t)) \) and takes the form:

\[ A_{i}(t)=\alpha_{i}\cdot f_{R,i}(R(t))\cdot f_{T,i}(T(t))\cdot f_{\mathrm{pH},i}(\mathrm{pH}(t)) \]

The specific form of the response function:

The response of each promoter to each factor is modeled using a Gaussian (bell-shaped) function centered on that promoter's optimal condition.

\[ f_{R,i}(R(t))=\exp\left(-\frac{(R(t)-R_{\mathrm{opt},i})^2}{2\sigma_{R,i}^2}\right) \]

\[ f_{T,i}(T(t))=\exp\left(-\frac{(T(t)-T_{\mathrm{opt},i})^2}{2\sigma_{T,i}^2}\right) \]

\[ f_{\mathrm{pH},i}(\mathrm{pH}(t))=\exp\left(-\frac{(\mathrm{pH}(t)-\mathrm{pH}_{\mathrm{opt},i})^2}{2\sigma_{\mathrm{pH},i}^2}\right) \]

Where \( R_{\mathrm{opt},i} \), \( T_{\mathrm{opt},i} \), and \( \mathrm{pH}_{\mathrm{opt},i} \) are the optimal activation conditions for promoter \( i \), and \( \sigma_{R,i} \), \( \sigma_{T,i} \), and \( \sigma_{\mathrm{pH},i} \) are the response width parameters.
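The promoter activity \( A_i(t) \) is a product of three Gaussian responses scaled by the baseline activity. A minimal Python sketch follows. The optimal values and widths for dps, katG, and gorA are not listed in the text, so `EXAMPLE_PROMOTER` is a purely hypothetical parameter set used to show the calculation.

```python
import math


def gaussian_response(value, optimum, width):
    """exp(-(value - optimum)^2 / (2 * width^2)); equals 1 at the optimum."""
    return math.exp(-((value - optimum) ** 2) / (2.0 * width ** 2))


def promoter_activity(ros, temp, ph, params):
    """A_i = alpha_i * f_R,i(R) * f_T,i(T) * f_pH,i(pH)."""
    return (params["alpha"]
            * gaussian_response(ros, params["r_opt"], params["sigma_r"])
            * gaussian_response(temp, params["t_opt"], params["sigma_t"])
            * gaussian_response(ph, params["ph_opt"], params["sigma_ph"]))


# Hypothetical parameters for illustration only (not the fitted values)
EXAMPLE_PROMOTER = {
    "alpha": 1.0,
    "r_opt": 10.0, "sigma_r": 3.0,
    "t_opt": 311.0, "sigma_t": 2.0,
    "ph_opt": 7.4, "sigma_ph": 0.5,
}

for ros in (4.0, 7.0, 10.0, 13.0, 16.0):
    print(ros, promoter_activity(ros, 311.0, 7.4, EXAMPLE_PROMOTER))
```

Because each factor is at most 1, the activity never exceeds \( \alpha_i \), and it peaks only when ROS, temperature, and pH are all near their optimal values. This is why each promoter shows a transient peak as \( R(t) \) passes through its optimum.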


6. Results and Analysis


Finally, we obtained the response patterns of the three promoters during the process of enteritis, and the results are shown in the figure below:

[Figure: response of the dps, katG, and gorA promoters over the course of enteritis]

In this figure, we can observe:

The dps promoter reaches its highest expression around t=3 and then rapidly declines. We found that the dps gene is typically associated with bacterial responses to oxidative stress and DNA protection. In the early stages of enteritis, the host or microorganisms may experience an oxidative stress environment, prompting the activation of the dps promoter to a high level to protect the cells. However, as time passes and inflammation subsides, the demand for dps decreases, leading to a rapid decline in its expression.
The expression of the katG promoter reaches its peak at t=4, slightly later than the response of the dps promoter, and its expression level is also higher. Its expression then decreases rapidly around t=5. According to the literature, the katG gene encodes catalase, which strengthens the ability of bacteria to combat oxidative stress. The later response of katG may indicate that the oxidative stress response is further heightened during the course of enteritis, especially when the pro-inflammatory environment reaches its peak (around t=4). The expression decreases after the oxidative stress peak, possibly because oxidative stress subsides as the inflammation eases.
The response of the gorA promoter is relatively weak, showing a slight increase at t=3, but its activity level remains low and quickly returns to baseline. This may be because the gorA gene is associated with glutathione reductase (GorA), an enzyme involved in antioxidant defense. The low-level response of its promoter may suggest that this antioxidant defense mechanism is not the primary stress response pathway during enteritis, and perhaps in the early stages of enteritis, it has been supplanted by other, more robust antioxidant mechanisms, such as katG.

In summary, we found that katG exhibited the most significant expression during the course of enteritis, reaching its peak in the mid-stage of inflammation, indicating a strong oxidative stress response. In contrast, the expression of the dps and gorA promoters was more moderate, with gorA showing particularly weak response, suggesting its role in the stress response during enteritis may be limited. Based on the significant expression pattern of katG during enteritis and its key regulatory role in oxidative stress response, we ultimately chose katG as the experimental promoter to better study and regulate stress response mechanisms in enteritis.


Screening of ROS-Sensitive Promoters


1. Description


Escherichia coli possesses its own set of operon systems that respond to high oxidative stress. Among them, OxyR drew our attention as the ROS sensor and transcription activator in E. coli. In the presence of H2O2 or other ROS, two cysteine residues (Cys199 and Cys208) of the OxyR protein are oxidized to form a disulfide bond, leading to a conformational change in the protein. The oxidized protein binds to the promoter sequences of ROS-responsive genes and activates the expression of downstream antioxidant genes [5]. The OxyR transcription factor belongs to the LysR-type transcriptional regulators (LTTRs) and is a homotetramer, with each subunit containing an N-terminal DNA-binding domain [6]. Here, we identify the optimal promoter sequence that ensures the highest binding efficiency and specificity with the OxyR transcription factor through transcription factor binding site prediction, molecular docking, and molecular dynamics simulation.


2. Motif-Based Prediction of OxyR Binding Sites


Since we use Escherichia coli as the engineering strain, we first retrieved the target genes and binding motifs of OxyR from the RegulonDB database, which is a transcriptional regulatory network database specific to E. coli (http://regulondb.ccg.unam.mx). As shown in the following diagram obtained from the database, OxyR potentially activates the expression of 32 target genes, represses the expression of 12 genes, has bidirectional interactions with 4 genes, and is associated with 3 operon-related genes. The three candidate promoters of interest in the wet lab also originate from the genes activated by OxyR.

[Image]

Fig. 1. OxyR Regulatory Network. The central gene is OxyR, where genes connected by green lines indicate activation, genes connected by red lines indicate repression, and genes connected by blue lines represent dual functions. Genes highlighted in yellow blocks indicate dual-function genes that also form part of the ROS operon along with OxyR.

Some studies have shown that OxyR can recognize nucleotide motifs composed of ATAGnt elements arranged at 10 bp intervals, but the specific motif remains unknown [7]. We retrieved the sequences upstream of the transcription start sites (TSS) of the aforementioned target genes, including their promoter sequences, from the National Center for Biotechnology Information (NCBI) and stored them in .fasta format. All .fasta files were then merged into a single large .fasta file.

The MEME Suite allows biologists to discover new motifs in unaligned sets of nucleotide or protein sequences and to perform various other motif-based analyses. The suite provides motif discovery algorithms using probabilistic models (MEME) and discrete models (STREME) [8]. After deploying MEME (v.5.7.7) on the server and setting the motif length to 10 bp to 20 bp, we ran the prediction and obtained the five motifs shown in the figure below. The motifs are ordered by E-value from low to high. Higher E-values indicate motifs that are less likely to have biological significance, while lower E-values suggest that the motif was found in more sequences and matches well. The motifs identified from the 21 promoter sequences of OxyR target genes are considered potential binding sites where OxyR facilitates transcription activation. We conducted further analysis on these sites.
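The merging step can be done with a few lines of Python. This is a minimal sketch, not the script the team used; the function name `merge_fasta` and the file names in the usage comment are hypothetical.

```python
from pathlib import Path


def merge_fasta(input_paths, output_path):
    """Concatenate FASTA files into one file; returns the number of records."""
    n_records = 0
    with open(output_path, "w") as out:
        for path in input_paths:
            text = Path(path).read_text().strip()
            if not text:
                continue  # skip empty files
            # Each record starts with a header line beginning with '>'
            n_records += sum(1 for line in text.splitlines() if line.startswith(">"))
            out.write(text + "\n")
    return n_records


# Example usage with hypothetical file names:
# merge_fasta(sorted(Path("upstream_seqs").glob("*.fasta")), "oxyr_targets.fasta")
```

The merged file can then be given to the `meme` command-line tool, where the motif width range described above corresponds to its `-minw` and `-maxw` options.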

[Image]
[Image]

Fig. 2. Predicted Motif Distribution for OxyR Regulation. This figure shows the distribution of the five predicted motifs in the upstream regions of the TSS of various genes. The p-value indicates the probability that a random sequence (of the same length and background) would have position p-values whose product is less than or equal to the value calculated for the test sequence. The position p-value is defined as the probability that a random sequence (of the same length and background) matches the tested motif with a score greater than or equal to the maximum value found in the test sequence.

[Image]

Fig. 3. Sequence logo of the selected motif. This motif consists of 19 bases.

For the selected motif, we used FIMO to locate occurrences of the motif in the promoter sequences of all OxyR target genes and to compute a score and p-value for each match. The score at a given position is calculated by summing the appropriate entries from each column of the position-specific scoring matrix (PSSM) that represents the motif. The p-value of a motif occurrence is defined as the probability that a random sequence of the same length as the motif would match that position with an equal or better score [9]. As shown in the figure below, this motif scores highest in the promoter sequence of katG.
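The column-sum scoring described above can be illustrated with a small sliding-window scan. This sketch shows the scoring idea only: it does not reproduce FIMO itself, it omits the p-value calculation, and it assumes uppercase sequences containing only A, C, G, and T. The four-column toy motif and the uniform background are hypothetical.

```python
import math


def build_pssm(counts, background=0.25, pseudocount=1.0):
    """Convert per-column base counts into log2-odds scores."""
    pssm = []
    for column in counts:
        total = sum(column.values()) + 4 * pseudocount
        pssm.append({
            base: math.log2((column[base] + pseudocount) / total / background)
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm):
    """Score every window of motif width; returns a list of (start, score)."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pssm[i][base] for i, base in enumerate(window))
        hits.append((start, score))
    return hits


def best_hit(sequence, pssm):
    """Highest-scoring window as (start, score)."""
    return max(scan(sequence, pssm), key=lambda hit: hit[1])


# Hypothetical toy motif "ATAG" built from 8 aligned sites
TOY_COUNTS = [
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 0, "T": 8},
    {"A": 8, "C": 0, "G": 0, "T": 0},
    {"A": 0, "C": 0, "G": 8, "T": 0},
]
TOY_PSSM = build_pssm(TOY_COUNTS)
print(best_hit("CCATAGCC", TOY_PSSM))
```

Ranking promoters by their best window score, as in the bar chart that follows, amounts to calling `best_hit` once per promoter sequence with the 19-column matrix of the selected motif.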

[Image]

Fig. 4. Bar Chart of Motif Matching Scores in Different Sequences. The color intensity represents the p-value, with darker colors indicating lower p-values.


3. Molecular Docking


After obtaining the motif of the sequence targeted by OxyR, we were able to identify the binding site of OxyR in the promoter sequence, which facilitates the selection of docking models. The protein motif of its DNA-binding domain obtained from RegulonDB is of the H-T-H type, with the following sequence: FRRAADSCHVSQPTLSGQIR. Due to the absence of a complete structural file for OxyR in the PDB database, we turned to the AlphaFold Protein Structure Database to download the high-confidence predicted structure of OxyR (AFDB accession: AF-P0ACQ4-F1-v4).

We then employed the molecular docking software HDOCK, developed by Huang Lab, to assess the binding affinity between DNA and protein [10]. We first input the six candidate DNA sequence files along with the PDB file of the OxyR protein. When setting the binding sites, we entered the motif of OxyR's DNA-binding domain. We then submitted the docking task on the online platform (http://hdock.phys.hust.edu.cn/). Each result provided 100 or more docking models to choose from. Our criterion for model selection was that the binding site of OxyR in the model must correspond to the location of the motif in the promoter. On this basis, the docking results between the six promoter sequences and the OxyR protein were output as PDB files.

We used Python (v.3.12.6) and the open-source version of PyMOL (v.3.0.0) to visualize the PDB files, as shown in the figure below. Each docking model is associated with a docking score and a confidence level. A lower (more negative) docking score indicates tighter binding, while a higher confidence level denotes greater reliability of the docking model. We used bar charts to present the quality of the docking results. The figure shows that the docking score between the katG promoter sequence and OxyR is the lowest and its confidence level is the highest, indicating the best binding affinity among the candidates.

In summary, the promoter sequence of the katG gene is the best choice as the ROS-responsive element.

[Images: docking structures of OxyR with each of the six promoter DNA sequences]

Fig. 5. The binding structure between the OxyR protein and the six promoter DNA sequences, with the red peptide segments representing the DNA-binding domain.

[Image]

Fig. 6. The molecular docking results between the OxyR protein and the six promoter DNA sequences. The docking score shown is an average value and is negative; the larger its absolute value, the tighter the binding. A confidence level closer to 1 indicates that the binding model is more reliable.


References

[1]Morris G, Gevezova M, Sarafian V, Maes M. Redox regulation of the immune response. Cell Mol Immunol. 2022 Oct;19(10):1079-1101. doi: 10.1038/s41423-022-00902-0. Epub 2022 Sep 2. PMID: 36056148; PMCID: PMC9508259.

[2]Bhatt S, Nagappa AN, Patil CR. Role of oxidative stress in depression. Drug Discov Today. 2020 Jul;25(7):1270-1276. doi: 10.1016/j.drudis.2020.05.001. Epub 2020 May 8. PMID: 32404275.

[3]Vergnolle N. TRPV4: new therapeutic target for inflammatory bowel diseases. Biochem Pharmacol. 2014 May 15;89(2):157-61. doi: 10.1016/j.bcp.2014.01.005. Epub 2014 Jan 16. PMID: 24440740.

[4]Lee S, Shanti A. Effect of Exogenous pH on Cell Growth of Breast Cancer Cells. Int J Mol Sci. 2021 Sep 14;22(18):9910. doi: 10.3390/ijms22189910. PMID: 34576073; PMCID: PMC8464873.

[5]Pomposiello PJ, Demple B. Redox-operated genetic switches: the SoxR and OxyR transcription factors. Trends Biotechnol. 2001 Mar;19(3):109-14. doi: 10.1016/s0167-7799(00)01542-0. PMID: 11179804.

[6]Wei Q, Minh PN, Dötsch A, Hildebrand F, Panmanee W, Elfarash A, Schulz S, Plaisance S, Charlier D, Hassett D, Häussler S, Cornelis P. Global regulation of gene expression by OxyR in an important human opportunistic pathogen. Nucleic Acids Res. 2012 May;40(10):4320-33. doi: 10.1093/nar/gks017. Epub 2012 Jan 24. PMID: 22275523; PMCID: PMC3378865.

[7]Toledano MB, Kullik I, Trinh F, Baird PT, Schneider TD, Storz G. Redox-dependent shift of OxyR-DNA contacts along an extended DNA-binding site: a mechanism for differential promoter selection. Cell. 1994 Sep 9;78(5):897-909. doi: 10.1016/s0092-8674(94)90702-1. PMID: 8087856.

[8]Bailey TL, Boden M, Buske FA, Frith M, Grant CE, Clementi L, Ren J, Li WW, Noble WS. MEME SUITE: tools for motif discovery and searching. Nucleic Acids Res. 2009 Jul;37(Web Server issue):W202-8. doi: 10.1093/nar/gkp335. Epub 2009 May 20. PMID: 19458158; PMCID: PMC2703892.

[9]Grant CE, Bailey TL, Noble WS. FIMO: scanning for occurrences of a given motif. Bioinformatics. 2011 Apr 1;27(7):1017-8. doi: 10.1093/bioinformatics/btr064. Epub 2011 Feb 16. PMID: 21330290; PMCID: PMC3065696.

[10]Yan Y, Tao H, He J, Huang SY. The HDOCK server for integrated protein-protein docking. Nat Protoc. 2020. doi: 10.1038/s41596-020-0312-x.