Our model consists of four parts:
We use the Ribosome Binding Site (RBS) Calculator (https://docs.denovodna.com/docs/rbs-calculator) to ensure that the constant promoter functions properly. The RBS Calculator is an algorithm designed to predict and control translation initiation and protein expression in bacteria over a wide range. It can also provide insight into transcription intensity.
To gain insight into transcription intensity, we input the gene sequence into the promoter calculator and retrieve the resulting data graph (Figure 3):
Using the RBS Calculator, we can assess the transcriptional intensity of the sequence under standard conditions. Notably, two distinct convex peaks were observed, corresponding to positions with high expression levels, as predicted by both the site data and the principles of the calculator. From these observations, we draw two key conclusions:
a. The two transcription start sites typically exhibit strong expression intensity, highlighting their critical roles within the overall pathway.
b. The large difference in intensity between these two peaks and the surrounding sites suggests that, apart from the two prominent expression peaks, our design remains unaffected.
To characterize the function of the logic AND gate, we developed a model to simulate how the registers influence the output.
We state some assumptions as prerequisites.
a. The enzymes used in transcription and translation, which are not mentioned below, are regarded as sufficient.
b. The reactant for AHL synthesis is in adequate supply.
c. Amount of DNA is constant.
d. RNAs do not self-pair.
e. The basal transcription rate of GFP is small enough to be treated as zero, so we only calculate the transcription rate of GFP after induction.
f. To simplify the analysis, we hold some variables fixed when generating results (see the results section for details).
g. Degradation coefficients and production coefficients of protein and mRNA are considered constant.
In this part, we present the mathematical derivation used to obtain our results.
We divide the biological process into three parts and make mathematical deductions:
a. Expression of LuxR.
b. Generation of AHL.
c. Generation of the complex and production of GFP.
The process of expression consists of transcription and translation.
We list the required notations and their explanations as follows.
Notation | Explanation |
---|---|
[LR_d] | Concentration of LuxR's DNA |
[LR_m] | Concentration of LuxR's mRNA |
[LR_p] | Concentration of LuxR's protein |
tcR | Transcription coefficient of LuxR |
tlR | Translation coefficient of LuxR |
dR_m | Degradation coefficient of LuxR's mRNA |
dR_p | Degradation coefficient of LuxR's protein |
Transcription and translation:
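Under the assumptions above (constant DNA amount, constant production and degradation coefficients), a first-order form consistent with the notation table can be sketched as follows. This form is inferred from the symbol definitions and is an assumption about the exact equations used.

```latex
\begin{aligned}
\frac{d[LR_m]}{dt} &= tcR\,[LR_d] - dR_m\,[LR_m] \\
\frac{d[LR_p]}{dt} &= tlR\,[LR_m] - dR_p\,[LR_p]
\end{aligned}
```

Setting both derivatives to zero gives the steady state $[LR_p] = \frac{tcR \cdot tlR}{dR_m \cdot dR_p}[LR_d]$, which is linear in the amount of DNA.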
The process consists of two parts: expression of LuxI and the enzymatic reaction. Once LuxI is produced, it acts as an enzyme on the reactant to produce AHL. We list the required notations and their explanations as follows.
Notation | Explanation |
---|---|
[LI_d] | Concentration of LuxI's DNA |
[LI_m] | Concentration of LuxI's mRNA |
[LI_p] | Concentration of LuxI's protein |
[AHL] | Concentration of AHL |
[raw] | Concentration of reactant that produce AHL |
tcI | Transcription coefficient of LuxI |
tlI | Translation coefficient of LuxI |
dI_m | Degradation coefficient of LuxI's mRNA |
dI_p | Degradation coefficient of LuxI's protein |
d_A | Degradation coefficient of AHL |
c_1 | Combination coefficient of LuxI's protein and AHL |
k_A | Production coefficient of AHL |
The expression of LuxI is similar to that of LuxR, so we directly obtain the following differential equations.
For the enzymatic reaction, we use the Michaelis-Menten equation:
Since the concentration of reactant is adequate, the equation simplifies to:
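The saturation argument can be sketched in Python as follows. This is a minimal illustration, not the team's implementation: the Michaelis constant `K_M` is a hypothetical symbol that does not appear in the notation table, and all numeric values used with these functions are placeholders rather than fitted parameters.

```python
def mm_rate(k_A, LI_p, raw, K_M):
    """Michaelis-Menten rate of AHL synthesis catalysed by LuxI.

    K_M is a hypothetical Michaelis constant introduced for illustration.
    """
    return k_A * LI_p * raw / (K_M + raw)


def saturated_rate(k_A, LI_p):
    """Limit of mm_rate when raw >> K_M (adequate reactant)."""
    return k_A * LI_p


def ahl_steady_state(k_A, LI_p, d_A):
    """Steady state of d[AHL]/dt = k_A*[LI_p] - d_A*[AHL] (assumed form)."""
    return saturated_rate(k_A, LI_p) / d_A
```

In the saturated limit the steady-state AHL concentration is proportional to the LuxI protein concentration, which is the linear relationship the control-system part relies on.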
This is the main part of our logic AND gate. When both LuxR's protein and AHL are present, the logic gate opens and produces GFP. The GFP proteins emit fluorescence, which represents the output. Quantitatively, LuxR's protein and AHL combine one to one, and the resulting complexes then bind the GFP gene in a ratio of two to one.
We list the required notations and their explanations as follows.
Notation | Explanation |
---|---|
[AHL] | Concentration of AHL |
[LR_p] | Concentration of LuxR's protein |
[GFP_d] | Concentration of GFP's DNA |
[GFP_m] | Concentration of GFP's mRNA |
[GFP_p] | Concentration of GFP |
tcP | Transcription coefficient of GFP |
tlP | Translation coefficient of GFP |
dP_m | Degradation coefficient of GFP's mRNA |
dP_p | Degradation coefficient of GFP |
c_2 | Combination coefficient of LuxR's protein, AHL and GFP |
To derive the generation of the LuxR-AHL-GFP complex mathematically, we use the Hill equation. Note that the Hill coefficient is 2, matching the two-to-one binding ratio. To simplify the calculation without loss of generality, we merge the two processes, generation of the complex and production of GFP, into two differential equations.
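One way to write the merged step is sketched below, assuming the complex concentration is proportional to the product $[LR_p][AHL]$ and enters transcription through a Hill term with coefficient 2. The half-saturation constant $K$ is a hypothetical symbol introduced for illustration; how $c_2$ enters the equations is not assumed here.

```latex
\begin{aligned}
\frac{d[GFP_m]}{dt} &= tcP\,[GFP_d]\,
  \frac{\left([LR_p][AHL]\right)^{2}}{K^{2} + \left([LR_p][AHL]\right)^{2}}
  - dP_m\,[GFP_m] \\
\frac{d[GFP_p]}{dt} &= tlP\,[GFP_m] - dP_p\,[GFP_p]
\end{aligned}
```

In this form the transcription term vanishes when either $[LR_p]$ or $[AHL]$ is zero, which is the AND behaviour the gate is designed to show.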
To simulate the process and verify the role of the logic AND gate in the biological circuit, we build a control system that tests the relationship between the inputs (AHL and LuxR) and GFP. By mathematical derivation, under the assumption of sufficient reactant, the amounts of AHL and LuxI's protein are linearly related. To simplify the input setting, we treat the amounts of LuxR's DNA and LuxI's DNA as the inputs and GFP as the output of the control system. Based on the previous assumptions, we use the dominant biological reactions as the conditions of the control system model. Applying control theory to the derivation above, we obtain the following transfer functions.
From equations (1) and (2), we get the transfer function from the amount of LuxR's DNA to the amount of LuxR's protein. Let [LR_0] be the amount of LuxR's protein produced during expression. All other notation follows the definitions above.
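For a two-stage first-order cascade (transcription followed by translation, each with linear degradation), the transfer function from DNA to protein has the form $\frac{tc \cdot tl}{(s + d_m)(s + d_p)}$, with DC gain $\frac{tc \cdot tl}{d_m d_p}$. The sketch below checks that gain against a simple forward-Euler step response. It assumes the first-order form just described, and the parameter values in the usage line are placeholders, not the values used in our model.

```python
def dc_gain(tc, tl, d_m, d_p):
    """DC gain of tc*tl / ((s + d_m)(s + d_p)), i.e. its value at s = 0."""
    return tc * tl / (d_m * d_p)


def step_response_final(tc, tl, d_m, d_p, dna=1.0, t_end=200.0, dt=0.01):
    """Integrate the assumed cascade with forward Euler and return the
    protein level at t_end for a constant DNA input."""
    m = 0.0  # mRNA concentration
    p = 0.0  # protein concentration
    for _ in range(int(t_end / dt)):
        dm = tc * dna - d_m * m   # transcription minus mRNA degradation
        dp = tl * m - d_p * p     # translation minus protein degradation
        m += dt * dm
        p += dt * dp
    return p


# Placeholder parameters for illustration only.
gain = dc_gain(0.5, 2.0, 0.2, 0.1)
final = step_response_final(0.5, 2.0, 0.2, 0.1)
```

With a unit DNA input, the simulated protein level settles at the DC gain, so doubling the DNA amount doubles the steady-state protein.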
Considering the generation of the complex in the control system, we get the concentration of free LuxR's protein as follows. Let [C] be the concentration of complex of LuxR, AHL and GFP's DNA and [LR_p] be the concentration of free LuxR's protein.
From equations (3), (4) and (6), using the same method, we get the transfer function from the concentration of LuxI's DNA to the concentration of AHL. Let [AHL_0] be the concentration of AHL produced without considering consumption, and [AHL] be the concentration of free AHL.
From equation (7), we can easily get the transfer function (13) of GFP's mRNA as follows.
Referring to the data of the 2019 iBowu-China Team, we obtain values of parameters and make adjustments based on the specific conditions of our wet lab to fit the result. Substituting parameter values into the model, we get the results.
To calculate the productivity of GFP's protein and predict the efficiency of logic AND gate and output part, we discuss several influential factors.
We use SimBiology to build the biological model and enter the differential equations above. We take some parameters and data from the 2019 iBowu-China Team and set roughly estimated initial quantities from our wet lab. By changing the amount of GFP's DNA, we test GFP production over time under different DNA conditions. The relationship can be read directly from Figure 13, which lets us predict the productivity of GFP for different amounts of DNA.
Here we hold the quantities of the other substances at sufficient, constant levels, and let GFP's DNA be the independent variable and GFP's protein the dependent variable. The magnitudes of both match the real experimental values. Within the preset interval, the production of protein increases monotonically with the amount of DNA, as expected.
First, we use SimBiology to simulate the productivity of GFP with LuxR and AHL and explore their relationship over time. We set the parameters and typical initial values, then plot their production over time. Figure 14 shows that the GFP concentration rises once LuxR and AHL reach a certain level.
We then use Simulink to simulate the control system.
In the control system, we assume that the input amounts of LuxI and LuxR represent the amounts of the two genes activated in the Register&Patch. From transfer functions (9) and (11), the concentration of AHL is linearly and positively correlated with that of LuxI's DNA, and the concentration of LuxR's protein with that of LuxR's DNA. We thus establish a complete chain of relationships from LuxI's DNA and LuxR's DNA to the output GFP.
Varying the input amounts of LuxI and LuxR, we measure the output amount of GFP's protein and fit it in a 3-dimensional graph (Figure 16), which shows their relationship from a productivity perspective. The simulation result, derived from biological principles and control theory, is consistent with our design purpose: GFP expression is strictly regulated by the co-occurrence of LuxI and LuxR, and output productivity is significantly higher when both inputs increase. This verifies the function of our logic AND gate in the circuit and predicts the productivity of GFP.
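The AND behaviour of this input-output chain can be illustrated with a small steady-state sweep. This is a simplified sketch rather than the Simulink model: it assumes linear gains from each DNA input to AHL and LuxR, a Hill term with coefficient 2 acting on their product, and it ignores the consumption of free LuxR and AHL by the complex. The names `gain_I`, `gain_R`, `K` and `gfp_max` are hypothetical, and all values are placeholders in arbitrary units.

```python
def hill(x, K, n=2):
    """Hill activation term; K is a hypothetical half-saturation constant."""
    return x ** n / (K ** n + x ** n)


def gfp_steady_state(luxI_dna, luxR_dna, gain_I=1.0, gain_R=1.0,
                     K=1.0, gfp_max=1.0):
    """Steady-state GFP for given DNA inputs under the assumptions above."""
    ahl = gain_I * luxI_dna    # AHL is linear in LuxI's DNA
    luxr = gain_R * luxR_dna   # LuxR protein is linear in LuxR's DNA
    return gfp_max * hill(ahl * luxr, K)


# Sweep both inputs over the same placeholder levels.
levels = [0.0, 0.5, 1.0, 2.0, 4.0]
table = [[gfp_steady_state(i, r) for r in levels] for i in levels]
```

Each row of `table` fixes the LuxI input and each column fixes the LuxR input. The output is zero whenever either input is zero and approaches `gfp_max` only when both are high.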
We develop this model to verify the inhibitory effect of CRISPRi for our wet lab, completing the logical chain of validation for LANTERN.
We use Benchling (https://www.benchling.com) to design our sgRNA. First, we design the sgRNA based on the register sequence in the constant promoter region. We then input the whole Register sequence into Benchling and use its CRISPR guide tool to evaluate the on-target and off-target scores of the designed sgRNA sequences. Finally, we optimize the sgRNA sequence based on this evaluation.
Sequences and evaluation results are shown as follows.
These high scores indicate that the sgRNA binds its target DNA sequence effectively and stably.
By building models from biological principles and matching them with the wet lab, we found that when the downstream promoter is inhibited by CRISPRi, the steric hindrance caused by the dCas9 protein prevents normal expression of the gene of interest (GOI), even when the upstream promoter is actively initiating transcription. To address this, we designed a new circuit incorporating the Patch section to mitigate the steric hindrance effect of dCas9.
Next, we provide a verification of inhibition induced by CRISPRi for our wet lab.
We state some assumptions as prerequisites.
a. The enzymes used in transcription and translation are regarded as sufficient.
b. The concentration of reactant is high enough to be treated as constant.
c. Amount of DNA is constant.
d. Degradation coefficients and production coefficients of protein and mRNA are considered constant.
e. Repression is controlled by a Hill function and depends on the concentration of dCas9 and sgRNA complex.
To simplify the model, we consider only sgRNA1 here, abbreviated as sgRNA in the following derivation. We list the required notations and their explanations as follows.
Notation | Explanation |
---|---|
[s_d] | Concentration of sgRNA's DNA |
[s_r] | Concentration of sgRNA |
[dc_d] | Concentration of dCas9's DNA |
[dc_m] | Concentration of dCas9's mRNA |
[dc_p] | Concentration of dCas9 |
ts | Transcription coefficient of sgRNA |
td | Transcription coefficient of dCas9 |
ld | Translation coefficient of dCas9 |
ds | Degradation coefficient of sgRNA |
dd_1 | Degradation coefficient of dCas9's mRNA |
dd_2 | Degradation coefficient of dCas9 |
First, we use ODEs to describe the expression of sgRNA and dCas9.
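Under assumptions a to d, a first-order form consistent with the notation table can be sketched as follows. It is inferred from the symbol definitions and is an assumption about the exact equations; in particular, it omits any loss of sgRNA and dCas9 to complex formation.

```latex
\begin{aligned}
\frac{d[s_r]}{dt}  &= ts\,[s_d]  - ds\,[s_r] \\
\frac{d[dc_m]}{dt} &= td\,[dc_d] - dd_1\,[dc_m] \\
\frac{d[dc_p]}{dt} &= ld\,[dc_m] - dd_2\,[dc_p]
\end{aligned}
```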
dCas9 and sgRNA combine, and the complex binds to the promoter to exert a repressive effect. We set up differential equations to calculate the concentration of expressed GFP. We list the required notations and their explanations as follows.
Notation | Explanation |
---|---|
[com] | Concentration of complex (dCas9 & sgRNA) |
[P_d] | Concentration of GFP's DNA |
[P_m] | Concentration of GFP's mRNA |
[P_p] | Concentration of GFP |
k | Combination coefficient of complex |
tp | Transcription coefficient of GFP |
lp | Translation coefficient of GFP |
dc | Dissociation coefficient of complex |
dp_1 | Degradation coefficient of GFP's mRNA |
dp_2 | Degradation coefficient of GFP |
sgRNA and dCas9 combine, and the concentration of the complex changes over time as follows.
To simplify our model without losing rationality while still simulating the inhibition, we use the Hill equation and set up the ODEs as follows. We set h as the Hill coefficient and take its value, along with the values of certain other parameters, from the 2013 UCSF Team. The expression rate of GFP when the promoter is bound by the complex is as follows.
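Hill-type repression of this kind can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not our SimBiology model: the half-repression constant `K` is a hypothetical symbol that does not appear in the notation table, the steady-state expression assumes first-order mRNA and protein dynamics, and any numeric values passed in are placeholders.

```python
def repression(com, K, h):
    """Hill repression factor in (0, 1]; equals 1 when no complex is present.

    K is a hypothetical half-repression constant, h the Hill coefficient.
    """
    return 1.0 / (1.0 + (com / K) ** h)


def gfp_steady_state(P_d, com, tp, lp, dp_1, dp_2, K, h):
    """Steady-state GFP under the assumed equations
    d[P_m]/dt = tp*[P_d]*repression - dp_1*[P_m],
    d[P_p]/dt = lp*[P_m] - dp_2*[P_p]."""
    mrna = tp * P_d * repression(com, K, h) / dp_1
    return lp * mrna / dp_2
```

Because the repression factor multiplies the transcription term, the ratio of repressed to unrepressed GFP at steady state is the same for every amount of GFP's DNA in this sketch.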
We use SimBiology to solve the ODEs and simulate the production of GFP. We plot the GFP curves under two conditions, with and without expression of dCas9 and sgRNA (Figures 22 and 23).
With dCas9 and sgRNA expressed, the amount of GFP drops significantly, after which the amounts of the three substances reach equilibrium. To confirm the generality of the inhibitory effect, we also vary the concentration of GFP's DNA and calculate the concentration of GFP under CRISPRi. As expected, the concentration of GFP decreases regardless of how much DNA we enter (Figure 24). We conclude that the combination of dCas9 and sgRNA inhibits the expression of GFP; that is, CRISPRi behaves as expected in this part and implements the NOT gate in our circuit.
We estimate the time consumed by the entire biological logic circuit in order to match the result with the wet lab and to propose ways of reducing wasted time. We build models for two processes: the diffusion of molecules and the Logic AND Gate. For the Register & Patch section, too many factors are involved to calculate the time accurately, so this complex part requires further research in the future.
We make a simple calculation of how long the inducers take to penetrate the cell membrane and enter the cell. We consider a unit area (1 square micron) of cell membrane and take one cube on each side of it, inside and outside the cell, each with a volume of one cubic micron, named cube A and cube B. Without loss of generality, we take the thickness of the cell membrane to be seven nanometers. Our estimation model is built on this unit geometry. By calculating the diffusion time in the cubes, we can roughly estimate the total time by linear superposition. By matching with the wet lab, we obtain an average inducer concentration of about 6E-7 mol/L. The diffusion coefficient of molecules inside and outside the cell (in cube A and cube B) is set to 1E-9, while the one in the cell membrane is set to 1E-12. Before estimation, we state some assumptions as follows.
a. The flux at the upper bound of cube A is considered to be a constant value.
b. The lower bound of cube B is considered an open border.
c. The left and right boundaries are regarded as flux-free because they can be seen as canceling each other out.
d. The diffusion process is uniform and there will be no sudden increase in concentration.
e. The process of molecular diffusion is uniform and continuous.
f. The process of molecules passing through the cell membrane is considered free diffusion.
COMSOL is used to simulate the diffusion process.
The process takes about 0.01 s. The volume of the system is at the microliter level. We therefore consider the total diffusion time of the inducers negligible.
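As an order-of-magnitude cross-check on this conclusion, the one-dimensional diffusion relation $t \approx L^2 / (2D)$ can be applied to each layer of the unit geometry. This is a much cruder estimate than the COMSOL simulation and is not expected to reproduce its value. The units of m²/s for the diffusion coefficients are an assumption, since none are stated above, and summing the three layer times is a simplification.

```python
def diffusion_time(length_m, D_m2_per_s):
    """Characteristic 1-D diffusion time t = L^2 / (2 D)."""
    return length_m ** 2 / (2.0 * D_m2_per_s)


CUBE_EDGE = 1e-6     # edge of cube A and cube B (1 micron), in metres
MEMBRANE = 7e-9      # membrane thickness (7 nm), in metres
D_BULK = 1e-9        # diffusion coefficient in the cubes (assumed m^2/s)
D_MEMBRANE = 1e-12   # diffusion coefficient in the membrane (assumed m^2/s)

t_cube = diffusion_time(CUBE_EDGE, D_BULK)
t_membrane = diffusion_time(MEMBRANE, D_MEMBRANE)
# Cube A, membrane, cube B in series (linear superposition).
t_total = 2 * t_cube + t_membrane
```

Under these assumptions the total falls in the millisecond range, which is negligible next to an experiment lasting hours and supports the same conclusion.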
The processes involved are the expression of LuxI and LuxR, the formation of the complex of LuxR's protein and AHL, and the expression of GFP. To simplify our estimation without loss of rationality, we follow the assumptions stated in the Logic AND Gate part on productivity prediction. In particular, the rates of the expression processes are treated as linear in our assumptions. We enter the mathematical derivation and parameter values from the Logic AND Gate part into SimBiology and estimate the time taken to produce output. The graph predicted by SimBiology is as follows (Figure 26).
According to the value of the dependent variable in the graph, we estimate that the process takes almost 60 seconds.
From the wet lab, we know that the complete experiment takes several hours. Based on our time estimates, we speculate that Register&Patch is the most time-consuming part. With further research, we hope to find an approximation algorithm whose error is bounded by a constant factor, so that we can predict the time appropriately and find ways of reducing wasted time to optimize our biological circuits.