Optimizing Wetlab with Software Analysis
This year, Lambert iGEM simulated CRISPRi and Toehold reactions using MATLAB, and also optimized toeholds using machine learning.
This year, Lambert iGEM is utilizing CRISPR-interference (CRISPRi) to downregulate critical genes within the Mycobacterium tuberculosis bacterium. Therefore, Lambert iGEM was able to provide an alternate treatment to antibiotics, that instead targets the disease head on. There are three main processes involved in the CRISPRi system (see Fig. 1).
Lambert iGEM decided to make a deterministic ordinary differential equation model to simulate our CRISPRi experimentation. As there is minimal research on the experimentation we are undertaking, our model aims to be a benchmark for our wetlab committee. By creating a simulated curve, we will guide CRISPRi results by providing a reference point that gives the CRISPRi team a threshold.to aim for. Once CRISPRi experimentation is completed, their results will also refine our parameters, creating a collaborative data streamline.
To assist in interpreting the CRISPRi reactions, Lambert iGEM applied various deterministic ordinary differential equations (ODE) to create a model that simulates the reaction and correlates fluorescence expressed in the presence of CRISPRi to time passed. Lambert iGEM also contacted Dr. Mark Styczynski from the Georgia Institute of Technology to aid in the identification of certain parameters and to guide us to various research tools and MATlab functionalities to assist in finding rate constants.
We used MATLAB’s SimBiology software to diagram each reaction using the Law of Mass Action. The Mass Action equations are used to diagram the collision of reactants in a solution, which is needed for all the reactants in our reaction pathways. The pathway begins with the introduction of the inhA DNA construct into the system and ends with GFP (see Fig. 2).
The model contains a total of 11 biochemical reactions (see Table. 1) for the CRISPRi pathway, as shown below:
Reaction | Description |
---|---|
[Target Plasmid] → RNAP | RNA Polymerase binds to the target DNA plasmid/construct. |
[dCas9 Plasmid] → RNAP | RNA Polymerase binds pCD017, the plasmid for dCas9 |
[sgRNA gBlock] → RNAP | RNA Polymerase binds to the linear sgRNA DNA gBlock. |
RNAP → [GFP mRNA] | GFP mRNA gets transcribed by RNA Polymerase from the GFP sequence on the target construct. |
RNAP → [dCas9 mRNA] | RNA Polymerase transcribes pCD017 into dCas9 mRNA. |
RNAP → sgRNA | RNA Polymerase transcribes pCD017 into dCas9 mRNA. |
[GFP mRNA] → Ribosome | Ribosomes bind to the GFP mRNA. |
[dCas9 mRNA] → Ribosome | Ribosomes bind to the dCas9 mRNA. |
Ribosome → GFP | Ribosomes translate GFP mRNA into GFP protein. |
Ribosome → dCas9 | Ribosomes translate dCas9 mRNA into dCas9 protein. |
dCas9 ↔ [Target plasmid] | dCas9 inhibits the Target plasmid and falls off after time elapses. |
RecBCD → sgRNA | RecBCD unwinds linear DNA constructs, degrading the sgRNA gBlocks. |
Chi6 → RecBCD | Chi6 binds to RecBCD, preventing it from acting upon DNA constructs. |
In order to simulate our reactions, our team needed to input quantities to initialize each component in our model. The initial value of each species was obtained from CRISPRi wetlab protocols (see Table. 2):
Name | Initial Value | Units |
---|---|---|
Target Plasmid | 1 | nanomoles |
dCas9 Plasmid | 1 | nanomoles |
sgRNA gBlock | 5 | nanomoles |
TXTL RNAP | 40 | molecules |
GFP mRNA | 0 | nanomoles |
dCas9 mRNA | 0 | nanomoles |
sgRNA | 0 | nanomoles |
TXTL Ribosome | 25 | molecules |
GFP | 0 | nanomoles |
dCas9 | 0 | nanomoles |
Chi6 | 2 | micromoles |
RecBCD | 25 | molecules |
Rate constants and degradation rates are required as input in our Mass Action Laws and Michaelis-Menten Equations. These values were derived from literature or estimated if the values could not be found. The value of each parameter is as shown (see Table. 3):
Variable | Reaction | Estimated Value | Units |
---|---|---|---|
k1 | Reaction rate for Target Construct and RNAP binding | 6.7 | mole/second |
k2 | Reaction rate for dCas9 Plasmid and RNAP binding | 6.7 | mole/second |
k3 | Reaction rate for sgRNA gBlock and RNAP binding | 6.7 | mole/second |
k4 | Rate constant for RNAP degradation | 0.0525 | nanomole/hour |
k5 | Rate constant for GFP mRNA degradation | 0.000019254 | 1/second |
k6 | Rate constant for dCas9 mRNA degradation | 0.000019254 | 1/second |
k7 | Reaction rate for GFP mRNA and dCas9 mRNA binding | 100000 | nanomole/hour |
k8 | Rate constant for Chi6 degradation | 0.83 | micromole/second |
k9 | Reaction rate for Chi6 binding to RecBCD | 0.99 | mole/second |
k10 | Rate constant for sgRNA degradation | 0.000019254 | 1/second |
k11 | Reaction rate for free sgRNA and dCas9 binding | 0.98 | mole/second |
k12 | Rate constant for denaturation of dCas9 enzyme | 0.049 | nanomole/hour |
k13 | Rate constant for GFP degradation | 0.056 | nanomole/hour |
Vmtranscript | Maximal rate of reaction needed for Michaelis-Menten enzyme kinetics for TXTL RNAP and dCas9 mRNA binding | 2.49E-12 | nanomole/minute |
Kmtranscript | Michaelis-Menten constant for TXTL RNAP and dCas9 mRNA binding | 40,000 | nanomole |
Vmtranscript1 | Maximal rate of reaction needed for Michaelis-Menten enzyme kinetics for TXTL RNAP and GFP mRNA binding | 0.52 | nanomole/minute |
Kmtranscript1 | Michaelis-Menten constant for TXTL RNAP and GFP mRNA binding | 0.95 | nanomole |
Vm_RecBCD | Maximal rate of reaction needed for Michaelis-Menten enzyme kinetics for RecBCD and sgRNA binding | 0.99 | nanomole/min |
Km_Re | Michaelis-Menten constant for RecBCD and sgRNA binding | 80.5 | nanomole |
Vm_sgRNA | Maximal rate of reaction needed for Michaelis-Menten enzyme kinetics for TXTL RNAP and sgRNA binding | 3.9 | nanomole/min |
Km_sgRNA | Michaelis-Menten constant for TXTL RNAP and sgRNA binding | 1 | nanomole |
Vmtranslate | Maximal rate of reaction needed for Michaelis-Menten enzyme kinetics for TXTL Ribosome and dCas9 binding | 1560 | nanomole/min |
Kmtranslate | Michaelis-Menten constant for TXTL Ribosome and dCas9 binding | 1.0416 | nanomole |
Vm_GFP | Maximal rate of reaction needed for Michaelis-Menten enzyme kinetics for TXTL Ribosome translated to GFP | 150 | nanomole/min |
Km_GFP | Michaelis-Menten constant for TXTL Ribosome translated to GFP | 9.67 | nanomole |
Vm_dCas9 | Maximal rate of reaction needed for Michaelis-Menten enzyme kinetics for dCas9 repression of target construct | 0.162 | nanomole/min |
n_dCas9 | Hill kinetic coefficient for dCas9 repression of target construct | 4.7 | unitless |
Kp_dCas9 | The ligand concentration producing at half occupation for dCas9 repression of target construct | 3.5 | nanomole |
As mentioned previously, Lambert iGEM incorporated the Mass Action Kinetic Laws and Michaelis-Menten Equations to represent the reactions for the CRISPRi model. We generated 12 ordinary differential equations (ODEs) for the CRISPRi reactions, as shown below (see Table. 4):
Number | ODE |
---|---|
1 | |
2 | |
3 | |
4 | |
5 | |
6 | |
7 | |
8 | |
9 | |
10 | |
11 | |
12 |
The CRISPRi ODE model assumes rate constants as continuous, as there are no environmental factors that can influence the reaction in the in vitro cell free mix. Therefore, the model is able to isolate the CRISPRi process to simulate CRISPRi repression activity. The model also keeps the initial concentrations of dCas9 plasmid, target constructs, sgRNA construct, sigma70 ribosome, Chi6, and RecBCD protein constant over time. Additionally, the sigma70 RNA polymerase is expected to synthesize DNA at the same rate regardless of whether the construct being transcribed is a plasmid or linear sequence.
Lambert iGEM utilized MATLAB, a software that generates user-inputted results models, to create an accessible platform that allows for researchers to optimize their gene inputs. Users can input key parameters such as a target gene and the binding coefficient of sgRNA, which would change the inhibition. They can also provide mathematical constants such as the dCas9 hill value (n), derived from the binding coefficient. Through the input of these values, this platform enables precise modeling of gene expression, allowing Lambert iGEM’s baseline model to promote collaboration and sharing of data with anyone in the scientific community.
To test our CRISPRi system, we applied mathematical modeling by creating a complete set of ordinary differential equations (ODEs) to represent GFP expression dynamics. We input the initial concentrations of 1 nanomole of GFP DNA, along with reaction parameters like dCas9, sgRNA, and degradation rates. Using Simbiology’s Model Analyzer, we simulated GFP concentration over 16 hours. The graph shows a rapid initial increase in GFP expression, but around the 4-hour mark, dCas9 begins repressing the inhA gene, causing the graph to plateau as GFP expression is inhibited (see Fig. 3).
In addition to simulating GFP expression and bound DNA, we modeled the behavior of various concentrations of unbound GFP DNA over time. The results displayed a negative logarithmic trend, as the dCas9-sgRNA complex steadily bound to and inhibited GFP DNA. This decrease in unbound GFP DNA highlights the enzymatic activity of dCas9, effectively demonstrating its role in repressing gene expression. As more GFP DNA becomes unavailable for transcription, it showcases how the dCas9-sgRNA complex efficiently curtails GFP expression, reinforcing the observed rapid initial expression followed by inhibition over time (see Fig. 4).
The initial concentration of GFP DNA undergoes a gradual decrease as it becomes bound by the dCas9-sgRNA complex during the CRISPRi reaction. Initially, the unbound GFP DNA is available for transcription, leading to GFP expression. However, as the reaction progresses, dCas9 binds to the target sequences on the GFP DNA, forming a dCas9-sgRNA-GFP complex. This binding reduces the amount of unbound GFP DNA, inhibiting further transcription. The rate at which the GFP DNA becomes bound follows a positive logarithmic pattern, reflecting the progressive, saturating effect of dCas9 inhibition over time (see Fig. 5).
We reconstructed the graph depicting RFU expression from our wetlab’s GFP CRISPRi experimental results to demonstrate that at lower percentages of bound GFP DNA, RFU values remain elevated, indicating active GFP expression. As the binding percentage increases after the initial reaction phases, a noticeable plateau in RFU becomes evident, correlating with the inhibition of GFP transcription. To evaluate the effectiveness of our wetlab experimentation (see CRISPRi GFP), we utilized a line-of-best-fit between GFP expression and fluorescence to correlate the reactions, represented by the equation RFU = 6.1 x [GFP] for Lambert High School’s plate reader and RFU = 231.7 x [GFP] for the plate reader used at the Georgia Institute of Technology. By converting our CRISPRi DNA concentration to fluorescence, the similarity presented by the logarithmic shape of our model and our experimental results proves the accuracy of our wetlab (see Fig. 6). The minor discrepancies in the scales of the graphs could be attributed to potential pipetting errors or differences in the sensitivity or calibration of the plate readers we used.
This graph effectively illustrates the relationship between the percentage of bound GFP DNA and both the experimental and theoretical modeled RFU values. The observed stagnation and decline in fluorescence units with the increase inbound DNA percentage demonstrate the successful application of the CRISPRi mechanism – as the graph shapes are relatively similar to the wetlab results – validating the model’s predictions and emphasizing the efficiency of dCas9 in gene regulation (see Fig. 7). The differences in scales can be attributed to discrepancies in pipetting or differences with each plate reader-run.
Although CRISPR resulted in lower fluorescence, indicating stronger repression, both CRISPR and CRISPRi showed successful outcomes when simulated in silica. The CRISPR reaction simulation predicted around 80-90% repression of deGFP, while CRISPRi resulted in around 60% downregulation. These results demonstrate that while CRISPR achieved greater repression than CRISPRi, both surpassed the 50% threshold for effective GFP repression in the simulation (see Fig. 8).
In order to optimize wetlab outcomes, Lambert iGEM conducted a series of simulations to assess the impact of varying sgRNA concentrations. The concentration tests include 3nm, 5nm, 8nm, and 10nm. After creating a set of models and observing peak areas, Lambert iGEM predicted the values of 246 for 3nm, 397 for 5nm, 256 for 8nm, and 241 for 10nm. This model suggested that the concentration of 5nm of sgRNA would yield the highest peak value, indicating optimal performance (see Fig. 9). The wetlab committee tested the concentrations using the model created, and the experimental concentrations were similar to the predictions. 5nm displayed the highest peak concentration, confirming it as the most optimized sgRNA concentration.
Lambert iGEM also utilized MATLAB to perform precise modeling of gene expression of inhA, a gene linked to the pathogenicity of M. tuberculosis (see CRISPRi M. tb POC). We created a complete set of ODEs and used Simbiology’s Model Analyzer to simulate the concentration of GFP over 20 hours with 1 nanomole of the inhA DNA initially. While producing GFP protein from the inhA target construct, the graph increases at first and then plateaus at around four hours (see Fig. 10).
We also developed a curve that modeled unbounded inhA DNA over time, showing that as time goes on, the dCas9 is able to inhibit the inhA DNA, and as the dCas9 reaches a certain saturation point, the graph goes towards an asymptote at 0 (see Fig. 11).
The initial concentration of inhA DNA undergoes a gradual decrease as it becomes bound by the dCas9-sgRNA complex during the CRISPRi reaction. Initially, the unbound inhA DNA is available for transcription, leading to GFP expression. However, as the reaction progresses, dCas9 binds to the target sequences on the inhA DNA, forming a dCas9-sgRNA-GFP complex. This binding reduces the amount of unbound inhA DNA, inhibiting further transcription. The rate at which the inhA DNA becomes bound follows a positive logarithmic pattern, reflecting the progressive, saturating effect of dCas9 inhibition over time (see Fig. 12).
We reconstructed the graph showing RFU expression from our wetlab’s experimental M. tb CRISPRi results, revealing that at lower percentages of bound inhA DNA, RFU values remain elevated, indicating active GFP expression. As the binding percentage increases beyond the initial reaction phases, notable plateaus in RFU become apparent, correlating with the inhibition of GFP transcription. To evaluate the effectiveness of our wetlab experimentation (see CRISPRi M.tb POC), we utilized a line-of-best-fit between GFP expression and fluorescence to correlate the reactions, represented by the equation RFU = 6.1[GFP] for Lambert High School’s plate reader and RFU = 231.1 [GFP] for the plate reader used at the Georgia Institute of Technology. By converting our CRISPRi DNA concentration to fluorescence, the similarity presented by the logarithmic shape of our model and our experimental results prove the accuracy of our wetlab (See Fig. 13). The minor discrepancies in the scales of the graphs could be attributed to potential pipetting errors or differences in the sensitivity or calibration of the plate readers we used.
This graph effectively illustrates the relationship between the percentage of bound inhA DNA and both the experimental and theoretical modeled RFU values. The observed stagnation and decline in fluorescence units with the increase inbound DNA percentage demonstrate the successful application of the CRISPRi mechanism – as the shapes are relatively similar to the wetlab results – validating the model’s predictions and emphasizing the efficiency of dCas9 in gene regulation (see Fig. 14).
Additionally, while RFU values can vary across different plate readers due to the unique calibration, sensitivity, and gain settings in each one, our graphs serve as reference points for SHIELD, enabling researchers to confirm the shape of their graphs when testing their own effector genes.