Explain your model's assumptions, data, parameters, and results in a way that anyone could understand.
In the computational part of our project, we set out to understand how mutations in promoter binding sites can influence gene expression controlled by different transcription factors (TFs) in response to reactive oxygen species (ROS). We began by developing a computational model that qualitatively demonstrated how decreasing TF binding affinity shifts gene activation to higher ROS levels . This model showed that reducing binding affinity is indeed a viable strategy to fine-tune gene expression in response to different ROS concentrations.
Our initial attempt to quantify the direct impact of promoter mutations on the binding free energy (\(\Delta\Delta G\)) using structural modeling tools was unsuccessful. The limitations arose because these tools could not accurately represent oxidized cysteine residues, which are crucial for the activation of certain TFs like OxyR. Recognizing this challenge, we turned to an alternative approach using Position Specific Scoring Matrices (PSSMs) derived from DNA sequences of known binding sites. By applying this method to the OxyR transcription factor, we were able to design promoter mutations that reduce TF binding affinity. This approach provided a practical way to engineer promoter sequences that modulate gene expression in response to oxidative stress.
We hosted the full code for the modeling and plotting on our GitLab repository. You can visit the repository to see more details on the implementation:
To determine whether changing the binding affinity of the TFs to their respective promoter regions, we developed a computational model based on ordinary differential equations (ODEs). This type of model allows us to simulate the dynamic interactions between ROS, transcription factors, and gene expression within the cell. We aimed to model the transcription factor OxyR including some known kinetic parameters for OxyR. Nevertheless, the qualitative results can still apply to other TFs, as the kinetics can be modeled using adapted parameters.
By simulating these processes, we can investigate how mutations that reduce the binding affinity of transcription factors to their promoters influence the cell's response to oxidative stress.
For full details on the ODEs and parameters, please see the blue collapsed boxes at the end of this section.
We modeled the internal concentration of ROS (\(\text{ROS}_{\text{int}}\)) by considering two main factors:
Since we could not find specific values for the diffusion and degredation rate of ROS in the literature, we treated ROS levels as unitless measures. This simplification allows us to focus on the relative changes in ROS concentrations without needing exact units.
TFs like OxyR are activated by ROS through oxidation. The activated TFs can then bind to promoter regions to regulate gene expression. We assume in our model that the total TF concentration is in steady state, i.e. that the total concentration of activated and inactivated TF remains constant over time
We modeled the activation of the TF by ROS using Hill-like kinetics that incorporate cooperativity:
The activated transcription factors bind to promoter regions to initiate the transcription of target genes, like an antioxidant or GFP.
We used GFP as a reporter gene and modeled its transcription and translation:
To simulate the impact of promoter mutations that reduce transcription factor binding affinity, we adjusted the Michaelis constant \(K_{m_{\text{gfp}}}\) in our model: a higher \(K_{m_{\text{gfp}}}\) represents a lower binding affinity, meaning that a higher concentration of active transcription factor is needed to achieve the same level of binding to the promoter. Figure 1 shows the ratio of GFP expression at steady state at ROS concentrations of 1 divided by the GFP expression at ROS concentrations of 0.5 for different \(K_{m_{\text{gfp}}}\) values ranging from 0 to 1 (see Model Details for more information). Here, a higher ratio indicates a clearer signal specific to higher ROS levels. We can see that reducing the binding affinity shifts the threshold of gene activation to higher ROS concentrations, while starting to plateau at higher \(K_{m_{\text{gfp}}}\) values.
Figure 2 illustrates the simulated GFP expression profiles under different \(K_{m_{\text{gfp}}}\) values represented by curves in different colors. The ROS concentration was increased from 0.1 to 0.5 at time 100, and to 1 at time 200. Again, we observe that lower binding affinities shift the activation of GFP to higher ROS levels. For example, when setting \(K_{m_{\text{gfp}}}\) to 0.01, we can see a clear increase in GFP expression already at a ROS concentration of 0.1, while for \(K_{m_{\text{gfp}}}\) = 1, clear activation occurs only at a ROS concentration of 1. Additionally we can clearly observe a practical problem when setting the binding affinity too low, as the maximal GFP expression remains low even at high ROS concentrations.
When comparing to experimental data where the H2O2 concentration was increased after 2h, we can observe that the qualitative shape of the fluorescence curve is similar to the GFP expression profiles in Figure 2. This indicates that the model captures essential dynamics of the system. You can find the relevant experimental data in Figure 3 of the results page.
Overall, our model demonstrates that reducing the binding affinity of transcription factors to their promoters is a viable strategy to control gene expression in response to ROS. But it also highlights the importance of finding the right balance in finding the optimal binding affinity to achieve specificity to high ROS levels while maintaining sufficient gene expression levels.
While our computational model provides qualitative insights into how decreasing transcription factor (TF) binding affinity can shift gene activation thresholds in response to reactive oxygen species (ROS), several limitations need to be acknowledged:
In total, while our model provides useful qualitative insights into the modulation of gene expression through TF binding affinity in response to ROS, addressing these limitations is necessary for improving its accuracy and applicability. Future work should focus on incorporating more accurate parameters, considering additional regulatory mechanisms, and validating predictions experimentally.
Variable | Description |
---|---|
\(\text{ROS}_{\text{int}}\) | Internal concentration of ROS inside the cell. |
\(\text{ROS}_{\text{ext}}\) | External concentration of ROS outside the cell. This represents the ROS level in the environment that can diffuse into the cell. |
\(\text{TF}_{\text{act}}\) | Concentration of the active form of the transcription factor. This form is capable of binding to promoter regions and initiating gene expression. |
\(\text{TF}_{\text{inact}}\) | Concentration of the inactive form of the transcription factor. This form cannot bind DNA until activated by ROS. |
\(\text{mRNA}_{\text{gfp}}\) | Concentration of mRNA encoding GFP. This reflects the level of gene expression for GFP. |
\(\text{mRNA}_{\text{antiox}}\) | Concentration of mRNA encoding an antioxidant protein. This reflects the level of gene expression for antioxidant proteins, which help detoxify ROS. |
\(\text{GFP}\) | Concentration of GFP protein. GFP is used as a reporter protein to measure gene expression levels. |
\(\text{Antioxidant}\) | Concentration of antioxidant protein. This protein helps neutralize ROS and reduce oxidative stress within the cell. |
\(D\) | Diffusion rate of ROS into the cell. This controls how fast ROS enters the cell from the external environment. |
\(d_{\text{ROS}_{\text{int}}}\) | Detoxification rate of internal ROS. It determines how quickly ROS is neutralized within the cell under the presence of Antioxidants. |
\(k_{\text{ROS}_{\text{int}}}\) | Rate constant for ROS-induced activation of the transcription factor. This determines how fast ROS activates the transcription factor. |
\(K_{m_{\text{TF}}}\) | Michaelis constant for transcription factor activation. It represents the concentration of ROS at which the activation rate is half of its maximum. |
\(n_{\text{TF}}\) | Hill coefficient for transcription factor activation. It indicates the cooperativity of the activation process; higher values suggest stronger cooperative binding. |
\(V_{\text{max}_{\text{gfp}}}\) | Maximum transcription rate of GFP mRNA. This is the highest possible rate at which GFP mRNA can be produced. |
\(K_{m_{\text{gfp}}}\) | Michaelis constant for GFP transcription. It represents the transcription factor concentration at which GFP mRNA production is half of its maximum rate. |
\(n_{\text{gfp}}\) | Hill coefficient for GFP transcription. It reflects the cooperativity of transcription factor binding to the GFP promoter. |
\(V_{\text{max}_{\text{antiox}}}\) | Maximum transcription rate of antioxidant mRNA. This is the highest possible rate at which antioxidant mRNA can be produced. |
\(K_{m_{\text{antiox}}}\) | Michaelis constant for antioxidant mRNA transcription. It represents the transcription factor concentration at which antioxidant mRNA production is half of its maximum rate. |
\(k_{\text{trans}_{\text{gfp}}}\) | Translation rate constant for GFP protein. This controls the rate at which GFP protein is produced from its mRNA. |
\(d_{\text{GFP}}\) | Degradation rate of GFP protein. This represents how quickly GFP protein is broken down and removed from the cell. |
\(k_{\text{trans}_{\text{antiox}}}\) | Translation rate constant for antioxidant protein. This controls the rate at which antioxidant protein is produced from its mRNA. |
\(d_{\text{Antioxidant}}\) | Degradation rate of the antioxidant protein. This represents how quickly the antioxidant protein is broken down and removed from the cell. |
Parameter | Value | Unit | Explanation |
---|---|---|---|
\(D\) | 0.1 | min\(^{-1}\) | Represents an arbitrary diffusion rate. |
\(d\_ROS\_int\) | 0.02 | min\(^{-1}\) | Represents an arbitrary degredation rate. Chosen to roughly match the experimentally observed time until steady state was reached. |
\(k\_ROS\_int\) | 582 | \(\mu\)M\(^{-1}\) min\(^{-1}\) | Derived from the OxyR activation rate, ensuring rapid and efficient transcription factor activation in response to ROS[1]. |
\(d\_TF\_act\) | 3 | min\(^{-1}\) | Set to facilitate timely deactivation of the active transcription factor, aligning with experimental observations of steady-state dynamics. |
\(V\_max\_gfp\) | 4.2 | \(\mu\)M·min\(^{-1}\) | Represents the maximum transcription rate of GFP mRNA, estimated based on typical transcription rates in *E. coli*. |
\(d\_mRNA\_gfp\) | 0.23 | min\(^{-1}\) | Increased to reflect a realistic mRNA degradation rate, ensuring proper turnover and preventing excessive accumulation of GFP mRNA. |
\(V\_max\_antiox\) | 2.1 | \(\mu\)M·min\(^{-1}\) | Estimated maximum transcription rate of antioxidant mRNA, accounting for the induction of protective mechanisms against ROS. |
\(K\_m\_antiox\) | 0.4 | \(\mu\)M | Represents a moderate affinity for antioxidant mRNA synthesis, balancing responsiveness to ROS levels. |
\(d\_mRNA\_antiox\) | 0.23 | min\(^{-1}\) | Set to ensure proper degradation of antioxidant mRNA, maintaining dynamic regulation of antioxidant protein levels. |
\(k\_trans\_gfp\) | 3.78 | min\(^{-1}\) | Estimated translation rate of GFP, facilitating adequate protein synthesis in response to mRNA levels. |
\(d\_GFP\) | 0.2 | min\(^{-1}\) | Increased to ensure timely degradation of GFP protein, preventing unrealistic accumulation and aligning with experimental steady-state observations. |
\(k\_trans\_antiox\) | 3.78 | min\(^{-1}\) | Estimated translation rate of antioxidant protein, ensuring efficient synthesis in response to antioxidant mRNA levels. |
\(d\_Antioxidant\) | 0.2 | min\(^{-1}\) | Increased antioxidant protein degradation rate to maintain appropriate protein levels and dynamic response to oxidative stress. |
\(n\_TF\_act\) | 1.4 | Dimensionless | Hill coefficient for TF activation, indicating slight positive cooperativity in response to ROS levels[1]. |
\(K\_m\_TF\_act\) | 43.0 | \(\mu\)M | Michaelis constant for TF activation, representing the ROS concentration at which TF activation is half-maximal[1]. |
\(n\_mRNA\_gfp\) | 4 | Dimensionless | Fixed Hill coefficient for GFP mRNA synthesis, reflecting cooperative binding effects in transcriptional regulation. |
$$ \frac{d \text{ROS}_{\text{int}}}{dt} = D \left( \text{ROS}_{\text{ext}}(t) - \text{ROS}_{\text{int}} \right) - d_{\text{ROS}_{\text{int}}} \cdot \text{ROS}_{\text{int}} \cdot \frac{ \text{Antioxidant} }{1 + \text{Antioxidant}} $$ This equation describes the change in internal ROS concentration, where \(D\) is the diffusion rate (treated as a scaling factor), and \(d_{\text{ROS}_{\text{int}}}\) represents the detoxification rate of ROS, enhanced by the presence of antioxidants.
$$ \frac{d \text{TF}_{\text{act}}}{dt} = \frac{ k_{\text{ROS}_{\text{int}}} \left( \text{ROS}_{\text{int}} \right)^{ n_{\text{TF}} } \text{TF}_{\text{inact}} }{ K_{m_{\text{TF}}}^{ n_{\text{TF}} } + \left( \text{ROS}_{\text{int}} \right)^{ n_{\text{TF}} } } - d_{\text{TF}} \cdot \text{TF}_{\text{act}} $$ This equation models the activation of the transcription factor by ROS, considering cooperativity in the activation process.
$$ \frac{d \text{TF}_{\text{inact}}}{dt} = -\frac{ k_{\text{ROS}_{\text{int}}} \left( \text{ROS}_{\text{int}} \right)^{ n_{\text{TF}} } \text{TF}_{\text{inact}} }{ K_{m_{\text{TF}}}^{ n_{\text{TF}} } + \left( \text{ROS}_{\text{int}} \right)^{ n_{\text{TF}} } } + d_{\text{TF}} \cdot \text{TF}_{\text{act}} $$ This accounts for the decrease in inactive TF as it is converted to the active form.
$$ \frac{d \text{mRNA}_{\text{gfp}}}{dt} = \frac{ V_{\text{max}_{\text{gfp}}} \left( \text{TF}_{\text{act}} \right)^{ n_{\text{gfp}} } }{ K_{m_{\text{gfp}}}^{ n_{\text{gfp}} } + \left( \text{TF}_{\text{act}} \right)^{ n_{\text{gfp}} } } - d_{\text{mRNA}_{\text{gfp}}} \cdot \text{mRNA}_{\text{gfp}} $$ This models the transcription of GFP mRNA, incorporating cooperativity in TF binding to the promoter.
$$ \frac{d \text{mRNA}_{\text{antiox}}}{dt} = \frac{ V_{\text{max}_{\text{antiox}}} \cdot \text{TF}_{\text{act}} }{ K_{m_{\text{antiox}}} + \text{TF}_{\text{act}} } - d_{\text{mRNA}_{\text{antiox}}} \cdot \text{mRNA}_{\text{antiox}} $$ This describes the transcription of antioxidant mRNA.
$$ \frac{d \text{GFP}}{dt} = k_{\text{trans}_{\text{gfp}}} \cdot \text{mRNA}_{\text{gfp}} - d_{\text{GFP}} \cdot \text{GFP} $$ This models the translation of GFP from its mRNA and its degradation.
$$ \frac{d \text{Antioxidant}}{dt} = k_{\text{trans}_{\text{antiox}}} \cdot \text{mRNA}_{\text{antiox}} - d_{\text{Antioxidant}} \cdot \text{Antioxidant} $$ This models the translation of the antioxidant protein and its degradation.
To determine the steady-state concentrations of \( \text{ROS}_{\text{int}} \), \( \text{TF}_{\text{act}} \), and GFP, we set the derivatives of these variables to zero and solve the resulting equations.
1. Steady-State of \( \text{ROS}_{\text{int}} \):
At steady-state, the rate of change of \( \text{ROS}_{\text{int}} \) is zero:
$$ 0 = D (\text{ROS}_{\text{ext}} - \text{ROS}_{\text{int}}) - d_{\text{ROS}_{\text{int}}}
\cdot \text{ROS}_{\text{int}} \left( \frac{\text{Antioxidant}}{1 + \text{Antioxidant}}
\right) $$
Rearranging terms to solve for \( \text{ROS}_{\text{int}} \):
$$ \text{ROS}_{\text{int}} = \frac{D \cdot \text{ROS}_{\text{ext}}}{D +
d_{\text{ROS}_{\text{int}}} \cdot Q} $$
where \( Q = \frac{\text{Antioxidant}}{1 + \text{Antioxidant}} \).
2. Steady-State of \( \text{TF}_{\text{act}} \):
Balancing the activation and deactivation of the transcription factor at steady-state:
$$ \text{TF}_{\text{act}} = \frac{k_{\text{ROS}_{\text{int}}} \cdot
\text{ROS}_{\text{int}}^{n_{\text{TF}}}}{d_{\text{TF}} \cdot
(K_{m_{\text{TF}}}^{n_{\text{TF}}} + \text{ROS}_{\text{int}}^{n_{\text{TF}}}) +
k_{\text{ROS}_{\text{int}}} \cdot \text{ROS}_{\text{int}}^{n_{\text{TF}}}} $$
3. Steady-State of Antioxidant:
The concentration of Antioxidant at steady-state is derived from the steady-state of its
mRNA:
$$ \text{Antioxidant} = C_{\text{antiox}} \cdot
\frac{\text{TF}_{\text{act}}}{K_{m_{\text{antiox}}} + \text{TF}_{\text{act}}} $$
where \( C_{\text{antiox}} = \frac{k_{\text{trans}_{\text{antiox}}} \cdot
V_{\text{max}_{\text{antiox}}}}{d_{\text{Antioxidant}} \cdot
d_{\text{mRNA}_{\text{antiox}}}} \).
4. Steady-State of GFP:
The steady-state GFP concentration is determined from \( \text{mRNA}_{\text{gfp}} \):
$$ \text{mRNA}_{\text{gfp}} = \frac{V_{\text{max}_{\text{gfp}}} \cdot
\text{TF}_{\text{act}}^{n_{\text{gfp}}}}{K_{m_{\text{gfp}}}^{n_{\text{gfp}}} +
\text{TF}_{\text{act}}^{n_{\text{gfp}}}} \cdot \frac{1}{d_{\text{mRNA}_{\text{gfp}}}} $$
$$ \text{GFP} = \frac{k_{\text{trans}_{\text{gfp}}} \cdot
\text{mRNA}_{\text{gfp}}}{d_{\text{GFP}}} $$
Why Numerical Methods Are Necessary: The steady-state equations exhibit circular dependencies, making it impossible to isolate and solve for each variable independently:
The presence of non-linear relationships and circular dependencies between \( \text{ROS}_{\text{int}} \), \( \text{TF}_{\text{act}} \), and Antioxidant prevents the derivation of closed-form analytical solutions. Numerical methods, such as `fsolve` from the `scipy.optimize` package, help us to iteratively approximate the steady-state concentrations.
Implementation Overview: The steady-state estimation is implemented using the following open source Python packages:
Next, we aimed to model the change in binding affinity of transcription factors to the promoter for mutations in the binding site. We started out by determining the structure of the DNA-TGA2 complex. TGA2 is a repressor that binds to DNA in its reduced state, and disassociates upon oxidation by ROS. We employed AlphaFold 3, a deep learning based tool for protein structure predictions [3]. . Figure 4 illustrates the 3D structures of the TGA2-DNA complex generated from five different AlphaFold model runs. These aligned 3D structures as predicted by AlphaFold show that the TF consistently binds to the same DNA site, demonstrating the robustness of the predictions across model runs.
Following our progress with TGA2, we extended our modeling efforts to other transcription factors, specifically OxyR and Yap1, which are central regulators in bacterial and yeast oxidative stress responses, respectively. Both OxyR and Yap1 require oxidized cysteine residues to undergo conformational changes that enable them to bind DNA. However, modeling these oxidized states presented a significant challenge. AlphaFold 3 cannot explicitly represent oxidized cysteine residues. To overcome this limitation, we explored by the recommendation of Prof. Dr. Pablo Rivera-Fuentes the use of RoseTTAFold All-Atom. This is a structure prediction model similar to AlphaFold 3, but capable of representing atomized versions of amino acid sidechains, including their oxidized states. For this, we had to set up the RoseTTAFold on an HPC cluster, as the no web server was available for this model. Unfortunately, while the provided test examples resulted in high quality structures, the structures for TGA2, OxyR and Yap1 were of insufficient quality for reliable ΔΔG calculations. Figure 6 illustrates one of these predicted structures, where significant inaccuracies are apparent, such as a broken DNA strand. These structural inconsistencies prevented us from obtaining reliable absolute values for changes in ΔΔG, thus halting our progress with these specific transcription factors using this method.
Given the challenges with structural modeling, we adopted an alternative approach to design meaningful mutations for the promoters of these transcription factors, focusing on OxyR as a case study. To start, we obtained a Position Frequency Matrix (PFM) from RegulonDB. RegulonDB is a database that curates regulatory interactions in E. coli, including detailed information about transcription factor binding sites[5]. PFMs are derived from experimental data by aligning known binding site sequences of a given transcription factor. The PFM simply provides the count of each of the four nucleotides at every position in the aligned binding motifs.
Next, we used this data to derive a Position Specific Scoring Matrix (PSSM)[8] as implemented in biopython [9]. The entries of the matrix are calculated based on the observed frequencies of each nucleotide at each position in the PFM, adjusted by the background frequencies of nucleotides in the genome. To account for the general nucleotide composition of the genome, we introduce background frequencies \(p_b\), representing the expected probability of nucleotide \(b\) in a random sequence.
The PSSM value for position \(i\) and base \(b\)is derived by calculating the logarithm of the ratio of the observed frequency \(f_{b,i}\) to the background frequency \(p_b\). Specifically, when using base-2 logarithms to express the scores in bits, the PSSM can be defined as:
$$\text{PSSM}[i,b] = \log_2 \left( \frac{f_{b,i}}{p_b} \right)$$Positive scores indicate that a nucleotide is more frequent at that position in the binding sites than expected by chance, suggesting a favorable binding interaction. Negative scores suggest a lower-than-expected frequency, potentially weakening the interaction.
The score for a specific sequence is computed by summing the individual scores from the PSSM for each nucleotide in the sequence:
$$\text{Score} = \sum_{i=1}^{L} \text{PSSM}[i, \text{sequence}[i]]$$By applying the PSSM to our promoter sequences, we can systematically assess how single nucleotide mutations affect the sequence similarity to known OxyR binding sites, providing indirect information on the expected effect on binding affinity. This analysis allows us to predict and design mutations that reduce the sensitivity of the promoter to oxidative stress, aligning with our goal of fine-tuning the expression profile of target genes in response to varying reactive oxygen species (ROS) levels.
To visualize the nucleotide preferences at each position, we generated a sequence logo (Figure 7) based on the PFM with the help of the MEME Suite [7]. In a sequence logo, the overall height of the stack at each position represents the information content (measured in bits), indicating the degree of conservation at that position, while the height of each nucleotide symbol within the stack reflects its relative frequency[8]. This provides an intuitive visual summary of the binding motif, highlighting the most conserved positions and the preferred nucleotides.
The sequence logo (Figure 7) graphically represents the PFM of OxyR binding sites. Each position in the logo corresponds to a position in the binding site, and the size of each letter indicates the relative frequency of that nucleotide at that position. The total height of the letters at each position reflects the information content, calculated using the formula:
$$I_{\text{seq}}(i) = \sum_{b} f_{b,i} \log_2 \left( \frac{f_{b,i}}{p_b} \right)$$This equation computes the information content \(I_{\text{seq}}(i)\) at position \(i\) by summing over all nucleotides \(b\), where \(f_{b,i}\) is the frequency of nucleotide \(b\) at position \(i\), and \(p_b\) is the background frequency of nucleotide \(b\)[7]. The information content measures how much the nucleotide distribution at a position differs from the background distribution. Positions with high information content are highly conserved and likely crucial for binding specificity.
By analyzing the sequence logo, we can identify positions within the binding site that are highly conserved and therefore critical for OxyR binding. This insight guides the design of promoter mutations by indicating which positions are less likely to tolerate changes without affecting binding affinity.
While the PSSM-based modeling approach provides a quantitative method to predict the impact of mutations on OxyR binding affinity, there are some limitations to consider in the context of our project. Firstly, the PSSM assumes that nucleotide positions contribute independently to the overall binding affinity. However, in reality, there may be interactions between positions (positional dependencies) that affect binding, which the PSSM does not capture.
Secondly, the PSSM is derived from a finite set of known binding sites obtained from RegulonDB. This dataset may not fully represent the complete spectrum of OxyR binding preferences, potentially biasing the PSSM. Any uncharacterized binding sites or sequence variations not included in the dataset could lead to gaps in our understanding of OxyR specificity, affecting the reliability of our predictions.
Additionally, the background nucleotide frequencies \(p_b\) used in the PSSM calculation are assumed to be uniform or based on general genomic composition. However, local variations in nucleotide composition around promoter regions could influence binding site characteristics. Using a more context-specific background model might improve the accuracy but requires additional data that may not be readily available.
Finally, while PSSM scores provide an indication of binding affinity changes due to mutations, they do not directly translate to changes in gene expression levels. The relationship between transcription factor binding affinity and transcriptional output is complex and can be non-linear. Environmental factors, cellular conditions, and feedback mechanisms can all modulate gene expression independently of binding affinity. Here a cell model as described above can help to assess the impact of the mutations on gene expression.
Therefore, experimental validation is essential to confirm the expected effects of designed mutations on promoter activity and gene expression in response to oxidative stress. This we aimed to achieve as described in the next section.
Finally we were able to show that the PSSM scores can be successfully used to design mutations in the promoter region of the OxyR. We used the scores to design mutations in the promoter region of the OxyR promoter. In Figure 8, we show the PSSM scores for the promoter region between the GFP and OxyR coding region. The y-axis represents the scores, and the x-axis represents nucleotide positions relative to the start of the GFP coding region. Red dots show scores higher than seven. We can see that a few targeted point mutations could successfully be used to modifiy the scores. Next, we had to synthesize these sequences with the computationally determined mutations. A technical limitation here is that longer repetitive sequences, like the duplicated promoter region, cannot be synthesized. To avoid this limitation, we had to add additional mutations that should not alter the promoter function. To make sure not only the OxyR, but also the RNA polymerase binding continues to function, we also scored the impact of potential changes on the -10 and -35 binding sites using the same methodology as described above for OxyR. In the wet lab, we were able to verify the functionality of the mutated promoter regions by measuring the expression of the downstream GFP gene. This shows that this method The results of these experiments are presented on the results page.