Model | UCAS-China

Abstract

Figure 1: Abstract

Description

Our model consists of four parts:

Ensuring constant promoter functionality. To evaluate transcriptional intensity, we used the RBS Calculator to analyze the gene sequences. The results indicate that the constant promoters function as expected and meet the desired performance criteria.
Characteristics of the logic AND gate. We derived the governing equations from our assumptions and built a control system model to simulate the numerical relationship between GFP output and the input quantities. Using parameter values referenced from the 2019 iBowu-China Team, we modeled the logic AND gate of our project with SimBiology and Simulink.
Demonstrating the inhibition induced by CRISPRi. We designed and optimized the sgRNA sequence with Benchling. Using the Hill coefficient and parameter values from the 2013 UCSF Team's model, we built a model in SimBiology and confirmed the inhibitory effect of CRISPRi on GFP production for the wet lab.
Estimation of time consumed by the Boolean logic circuit system. We used COMSOL to calculate the diffusion time of the inducers and found it to be negligible. We also estimated that the Output process takes about 60 seconds. Comparing these estimates with the wet lab, we conclude that the Register&Patch process accounts for most of the time. In future studies, we hope to find an approximation algorithm whose error factor is bounded by a constant, so that we can predict this time and reduce wasted time.

Ensuring Functionality of Constant Promoter

RBS-Calculator

We use the Ribosome Binding Site (RBS) Calculator (https://docs.denovodna.com/docs/rbs-calculator) to ensure constant promoter functionality. The RBS Calculator is an algorithm designed to predict and control translation initiation and protein expression in bacteria over a wide range. It can also provide insight into transcription intensity.

Application of the RBS calculator in LANTERN

To gain insight into transcription intensity, we input the gene sequence into the promoter calculator and retrieved the resulting data graph (Figure 3):

Figure 2: Pathway diagram of the constant promoter

The outcome derived from the RBS calculator


Figure 3: Transcription start site (TSS) nucleotide positions

Through the application of the RBS-Calculator, we can assess the transcriptional intensity of the sequence under standard conditions. Notably, two distinct convex peaks were observed, corresponding to positions with high expression levels, as predicted by both site data and the principles of the calculator. From these observations, we draw two key conclusions:

a. The two transcription start sites typically exhibit strong expression intensity, highlighting their critical roles within the overall pathway.

b. Compared with the surrounding sites, the large difference in intensity suggests that transcription starting anywhere other than these two prominent peaks is weak enough not to affect our design.

Characteristics of Logic AND Gate

To characterize the function of the logic AND gate, we developed a model to simulate the process by which the registers influence the output.

Assumptions

We state some assumptions as prerequisites.

a. The enzymes used in transcription and translation, which are not mentioned below, are regarded as sufficient.

b. The reactant used to produce AHL is adequate.

c. Amount of DNA is constant.

d. RNAs do not self-pair.

e. The basal transcription rate of GFP is small enough to be treated as zero, so we only calculate the transcription rate of GFP after induction.

f. We control variables for simplification when getting results (See the results section for details).

g. Degradation coefficients and production coefficients of protein and mRNA are considered constant.

Mathematical derivation

In this part, we show our mathematical derivation used in getting results.

We divide the biological process into three parts and make mathematical deductions:

a. Expression of LuxR.

b. Generation of AHL.

c. Generation of the complex and production of GFP

a. Expression of LuxR

The process of expression consists of transcription and translation.

We list the required notations and their explanations as follows.

Notation	Explanation
[LR_d]	Concentration of LuxR's DNA
[LR_m]	Concentration of LuxR's mRNA
[LR_p]	Concentration of LuxR's protein
tcR	Transcription coefficient of LuxR
tlR	Translation coefficient of LuxR
dR_m	Degradation coefficient of LuxR's mRNA
dR_p	Degradation coefficient of LuxR's protein

Transcription and translation:

Figure 5: Expression of LuxR

$$ \frac{d[LR_m]}{dt}=tcR*[LR_d]-dR_m*[LR_m] \tag{1} $$

$$ \frac{d[LR_p]}{dt}=tlR*[LR_m]-dR_p*[LR_p] \tag{2} $$
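Equations (1) and (2) can be checked numerically before entering them into a modeling tool. The sketch below is a plain-Python illustration, not the SimBiology model used in this project: it integrates the two equations with a classical fourth-order Runge-Kutta step and compares the end point with the analytic steady state. All parameter values are hypothetical placeholders chosen only to make the example run; they are not the values taken from the 2019 iBowu-China Team.

```python
def simulate_luxr(lr_d, tcR, tlR, dR_m, dR_p, t_end=200.0, dt=0.01):
    """Integrate equations (1)-(2) from zero initial mRNA and protein (RK4)."""

    def rhs(m, p):
        dm = tcR * lr_d - dR_m * m   # equation (1)
        dp = tlR * m - dR_p * p      # equation (2)
        return dm, dp

    m, p = 0.0, 0.0
    for _ in range(int(round(t_end / dt))):
        k1m, k1p = rhs(m, p)
        k2m, k2p = rhs(m + 0.5 * dt * k1m, p + 0.5 * dt * k1p)
        k3m, k3p = rhs(m + 0.5 * dt * k2m, p + 0.5 * dt * k2p)
        k4m, k4p = rhs(m + dt * k3m, p + dt * k3p)
        m += dt * (k1m + 2 * k2m + 2 * k3m + k4m) / 6.0
        p += dt * (k1p + 2 * k2p + 2 * k3p + k4p) / 6.0
    return m, p


def luxr_steady_state(lr_d, tcR, tlR, dR_m, dR_p):
    """Set both derivatives to zero and solve for [LR_m] and [LR_p]."""
    m_ss = tcR * lr_d / dR_m
    p_ss = tlR * m_ss / dR_p
    return m_ss, p_ss


# Hypothetical parameter values, for illustration only.
ARGS = dict(lr_d=1.0, tcR=0.5, tlR=2.0, dR_m=0.2, dR_p=0.1)
print(simulate_luxr(**ARGS))
print(luxr_steady_state(**ARGS))
```

With constant DNA (assumption c), the simulated protein level approaches the steady state at a rate set by the slower of the two degradation coefficients.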

b. Generation of AHL

The process consists of two parts: the expression of LuxI and the enzymatic reaction. Once LuxI is produced, it acts as an enzyme that converts the reactant into AHL. We list the required notations and their explanations as follows.

Notation	Explanation
[LI_d]	Concentration of LuxI's DNA
[LI_m]	Concentration of LuxI's mRNA
[LI_p]	Concentration of LuxI's protein
[AHL]	Concentration of AHL
[raw]	Concentration of the reactant that produces AHL
tcI	Transcription coefficient of LuxI
tLI	Translation coefficient of LuxI
dI_m	Degradation coefficient of LuxI's mRNA
dI_p	Degradation coefficient of LuxI's protein
d_A	Degradation coefficient of AHL
c_1	Combination coefficient of LuxI's protein and the reactant
k_A	Production coefficient of AHL

The expression of LuxI is similar to that of LuxR, so we directly obtain the following differential equations.

$$ \frac{d[LI_m]}{dt}=tcI*[LI_d]-dI_m*[LI_m] \tag{3} $$

$$ \frac{d[LI_p]}{dt}=tLI*[LI_m]-dI_p*[LI_p] \tag{4} $$

For the enzymatic reaction, we use the Michaelis-Menten equation:

$$ \frac{d[AHL]}{dt}=\frac{k_A*c_1*[LI_p]*[raw]}{1+c_1*[raw]}-d_A*[AHL] \tag{5} $$

Because the concentration of reactant is assumed to be adequate (assumption b), the production term saturates:

$$ \lim_{[raw] \to +\infty}\frac{k_A*c_1*[LI_p]*[raw]}{1+c_1*[raw]}=k_A*[LI_p] $$

$$ \frac{d[AHL]}{dt}=k_A*[LI_p]-d_A*[AHL] \tag{6} $$
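The saturation step from (5) to (6) can be illustrated with a few lines of Python. This is a minimal sketch: the function names and the numerical values are hypothetical, chosen only to show how the Michaelis-Menten production term approaches k_A*[LI_p] as [raw] grows.

```python
def ahl_production_rate(li_p, raw, k_A, c_1):
    """Production term of equation (5)."""
    x = c_1 * raw
    return k_A * li_p * x / (1.0 + x)


def ahl_production_rate_saturated(li_p, k_A):
    """Production term of equation (6), the limit of (5) for abundant reactant."""
    return k_A * li_p


# Hypothetical values, for illustration only.
k_A, c_1, li_p = 3.0, 0.5, 2.0
for raw in (2.0, 20.0, 200.0, 2.0e6):
    ratio = ahl_production_rate(li_p, raw, k_A, c_1) / ahl_production_rate_saturated(li_p, k_A)
    print(f"[raw] = {raw:g}: rate is {ratio:.6f} of the saturated rate")
```

At c_1*[raw] = 1 the rate is exactly half of the saturated value, so the approximation in (6) is reasonable only when c_1*[raw] is much larger than 1.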

c. Generation of the complex and production of GFP

This is the main part of our logic AND gate. When both LuxR's protein and AHL are present, the logic gate opens and GFP is expressed. The GFP fluorescence represents the output. Quantitatively, LuxR's protein and AHL bind in a one-to-one ratio, and the resulting LuxR-AHL pairs then bind the GFP gene in a ratio of two to one.

Figure 6: LuxR and AHL induce the production of GFP

We list the required notations and their explanations as follows.

Notation	Explanation
[AHL]	Concentration of AHL
[LR_p]	Concentration of LuxR's protein
[GFP_d]	Concentration of GFP's DNA
[GFP_m]	Concentration of GFP's mRNA
[GFP_p]	Concentration of GFP
tcP	Transcription coefficient of GFP
tlP	Translation coefficient of GFP
dP_m	Degradation coefficient of GFP's mRNA
dP_p	Degradation coefficient of GFP
c_2	Combination coefficient of LuxR's protein, AHL and GFP

To mathematically describe the generation of the LuxR-AHL-GFP complex, we use the Hill equation with a Hill coefficient of 2. To simplify the calculation without loss of generality, we merge the two processes, generation of the complex and production of GFP, into two differential equations.

$$ \frac{d[GFP_m]}{dt}=\frac{tcP*[GFP_d]*(c_2*[LR_p]*[AHL])^2}{1+(c_2*[LR_p]*[AHL])^2}-dP_m*[GFP_m] \tag{7} $$

$$ \frac{d[GFP_p]}{dt}=tlP*[GFP_m]-dP_p*[GFP_p] \tag{8} $$
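The AND behaviour comes from the product inside the Hill term of (7): the term is large only when both factors are large. The sketch below evaluates that production term for low and high levels of the two inputs, taking LuxR's protein as the species that pairs with AHL, as listed in the notation table for this part. The levels and coefficients are hypothetical values picked for illustration.

```python
def gfp_transcription_rate(lr_p, ahl, gfp_d, tcP, c_2):
    """Production term of equation (7): Hill activation with coefficient 2."""
    x = (c_2 * lr_p * ahl) ** 2
    return tcP * gfp_d * x / (1.0 + x)


# Hypothetical input levels and coefficients, for illustration only.
LOW, HIGH = 0.1, 10.0
TCP, GFP_D, C_2 = 1.0, 1.0, 0.1


def and_gate_table():
    """Return the transcription rate for each (LuxR, AHL) input combination."""
    table = {}
    for lr_name, lr_p in (("low", LOW), ("high", HIGH)):
        for ahl_name, ahl in (("low", LOW), ("high", HIGH)):
            table[(lr_name, ahl_name)] = gfp_transcription_rate(lr_p, ahl, GFP_D, TCP, C_2)
    return table


for inputs, rate in and_gate_table().items():
    print(inputs, f"{rate:.4f}")
```

How sharply the gate separates the high/high case from the mixed cases depends on c_2 and on the ratio between the low and high input levels, so these should be taken from measured values before drawing conclusions.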

Calculation for simulation

To simulate the process and verify the role of the logic AND gate in the biological circuit, we build a control system to test the relationship between the inputs (AHL and LuxR) and GFP. By the mathematical derivation, under the assumption of sufficient reactant, the amount of AHL depends linearly on the amount of LuxI's protein. To simplify the input setting, the amounts of LuxR's DNA and LuxI's DNA are taken as the inputs and GFP as the output of the control system. Based on the previous assumptions, we use the dominant biological reactions as conditions for the control system model. Applying control theory to the equations derived above gives the following transfer functions.

From equations (1) and (2), we get the transfer function from the amount of LuxR's DNA to the amount of LuxR's protein. Let [LR_0] be the amount of LuxR's protein produced during expression. The other notations follow the definitions above.

Figure 7: LuxR input

$$ G=\frac{C(s)}{R(s)}=\frac{[LR_0]}{[LR_d]}=\frac{tcR*tlR}{(s+dR_m)(s+dR_p)} \tag{9} $$
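As a short check of (9), here is a sketch of the intermediate steps, assuming zero initial concentrations and writing [LR_0] for the protein produced by equation (2). Taking the Laplace transform of equations (1) and (2) gives

$$ s[LR_m](s)=tcR*[LR_d](s)-dR_m*[LR_m](s) \;\Rightarrow\; \frac{[LR_m](s)}{[LR_d](s)}=\frac{tcR}{s+dR_m} $$

$$ s[LR_0](s)=tlR*[LR_m](s)-dR_p*[LR_0](s) \;\Rightarrow\; \frac{[LR_0](s)}{[LR_m](s)}=\frac{tlR}{s+dR_p} $$

Multiplying the two ratios yields (9). Setting s = 0 gives the steady-state gain

$$ G(0)=\frac{tcR*tlR}{dR_m*dR_p} $$

so for a constant input the amount of LuxR's protein settles at a value proportional to the amount of LuxR's DNA.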

Considering the generation of the complex in the control system, we get the concentration of free LuxR's protein as follows. Let [C] be the concentration of complex of LuxR, AHL and GFP's DNA and [LR_p] be the concentration of free LuxR's protein.

Figure 8: LuxR in the control system

$$ [LR_p]=[LR_0]-2[C] \tag{10} $$

From equations (3), (4) and (6), using the same method, we get the transfer function from the concentration of LuxI's DNA to the concentration of AHL. Let [AHL_0] be the concentration of AHL produced without considering consumption, and [AHL] be the concentration of free AHL.

Figure 9: LuxI input

$$ \frac{[AHL_0]}{[LI_d]}=\frac{k_A*tcI*tLI}{(s+dI_m)(s+dI_p)} \tag{11} $$

Figure 10: LuxI input

$$ [AHL]=[AHL_0]-2[C] \tag{12} $$

From equation (7), we can easily get the transfer function (13) of GFP's mRNA as follows.

Figure 11: LuxR and AHL induce the production of GFP

$$ \frac{[GFP_m]}{[GFP_d]}=\frac{tcP*(c_2*[LR_p]*[AHL])^2}{1+(c_2*[LR_p]*[AHL])^2}*\frac{1}{s+dP_m} \tag{13} $$

Figure 12: GFP output

$$ \frac{[GFP_p]}{[GFP_m]}=\frac{tlP}{s+dP_p} \tag{14} $$

Referring to the data of the 2019 iBowu-China Team, we obtain values of parameters and make adjustments based on the specific conditions of our wet lab to fit the result. Substituting parameter values into the model, we get the results.

Results

To calculate the productivity of GFP's protein and predict the efficiency of logic AND gate and output part, we discuss several influential factors.

GFP-DNA

We use SimBiology to build the biological model and enter the differential equations above. We take some parameters and data from the 2019 iBowu-China Team and set roughly estimated initial quantities from our wet lab. By changing the amount of GFP's DNA, we test GFP production over time under different DNA conditions. Figure 13 shows the relationship directly and lets us predict the productivity of GFP for different amounts of DNA.

Figure 13: Relationship between GFP and its DNA

Here we hold the quantities of the other substances at sufficient, constant levels, and let GFP's DNA be the independent variable and GFP's protein be the dependent variable. The magnitudes of both match the experimental values. Within the preset interval, the production of protein increases monotonically with the amount of DNA, as expected.

GFP-register&patch

First, we use SimBiology to simulate the production of GFP together with LuxR and AHL and explore their relationship over time. We set the parameters and typical initial values, then plot their production over time. Figure 14 shows that the GFP concentration rises once LuxR and AHL reach a certain level.

Figure 14: Productivity of three substances over time

We then use Simulink to simulate the control system.

Figure 15: The control system

In the control system, we assume that the input amounts of LuxI and LuxR represent the amounts of the two genes activated in the Register&Patch. By transfer functions (9) and (11), the concentration of AHL is linearly and positively related to the concentration of LuxI's DNA, and the concentration of LuxR's protein to that of LuxR's DNA. We thus establish a complete chain of relationships from LuxI's DNA and LuxR's DNA to the GFP output.

Varying the input amounts of LuxI and LuxR, we measure the output amount of GFP's protein and fit it in a 3-dimensional graph (Figure 16), which shows their relationship from a productivity perspective. The simulation result is consistent with our design purpose: GFP expression is strictly regulated by the co-occurrence of LuxI and LuxR, and the output is significantly higher when both inputs increase. This verifies the function of our logic AND gate in the circuit and predicts the productivity of GFP.

Figure 16: Relationship between GFP and LuxR&AHL
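A rough steady-state version of this input sweep can be written without Simulink. The sketch below is an assumption-laden simplification, not the control system of Figure 15: it evaluates transfer functions (9), (11), (13) and (14) at s = 0 and neglects the consumption terms of (10) and (12), so free LuxR and AHL equal the produced amounts. All parameter values in `PARAMS` are hypothetical placeholders.

```python
# Hypothetical parameter values, for illustration only.
PARAMS = dict(
    tcR=0.5, tlR=2.0, dR_m=0.2, dR_p=0.1,            # LuxR expression
    k_A=1.0, tcI=0.5, tLI=2.0, dI_m=0.2, dI_p=0.1,   # LuxI expression and AHL
    tcP=0.5, tlP=2.0, dP_m=0.2, dP_p=0.1,            # GFP expression
    c_2=0.004, gfp_d=1.0,
)


def gfp_steady_state(lr_d, li_d, p):
    """Steady-state GFP for constant LuxR and LuxI DNA inputs."""
    lr_p = p["tcR"] * p["tlR"] / (p["dR_m"] * p["dR_p"]) * lr_d           # (9) at s = 0
    ahl = p["k_A"] * p["tcI"] * p["tLI"] / (p["dI_m"] * p["dI_p"]) * li_d  # (11) at s = 0
    x = (p["c_2"] * lr_p * ahl) ** 2                                       # Hill term of (13)
    gfp_m = p["tcP"] * p["gfp_d"] * x / (1.0 + x) / p["dP_m"]              # (13) at s = 0
    return p["tlP"] * gfp_m / p["dP_p"]                                    # (14) at s = 0


# Sweep both inputs over the same grid; rows index LuxR DNA, columns LuxI DNA.
LEVELS = [0.0, 0.25, 0.5, 0.75, 1.0]
surface = [[gfp_steady_state(r, i, PARAMS) for i in LEVELS] for r in LEVELS]

for row in surface:
    print(" ".join(f"{value:7.3f}" for value in row))
```

The printed grid plays the role of the surface in the 3-dimensional graph: the output is zero along both axes and rises only toward the corner where both inputs are present.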

Demonstrating the Inhibition induced by CRISPRi

We develop the model to verify the inhibitory effect of CRISPRi for our wet lab, completing the logical chain of validation for LANTERN.

Figure 17: CRISPR interference

Benchling

We use Benchling (https://www.benchling.com) to design our sgRNA. First, we design the sgRNA based on the register sequence in the constant promoter region. Then we input the whole Register sequence into Benchling and use the CRISPR guide tool to evaluate the on-target and off-target scores of the designed sgRNA sequences. Finally, we optimize the sgRNA sequence based on this evaluation.

Figure 18: Benchling

Sequences and evaluation results are shown as follows.

Figure 19: Sequences

Figure 20: Results

These high scores indicate effective and stable binding of the sgRNA to its target DNA sequence.

Circuit design

By building models from biological principles and comparing them with the wet lab, we found that when the downstream promoter is inhibited by CRISPRi, the steric hindrance caused by the dCas9 protein prevents normal expression of the gene of interest (GOI), even when the upstream promoter is actively initiating transcription. To address this, we designed a new circuit incorporating the Patch section to mitigate the steric hindrance effect of dCas9.

Figure 21: New circuit

Next, we provide a verification of inhibition induced by CRISPRi for our wet lab.

Assumptions

We state some assumptions as prerequisites.

a. The enzymes used in transcription and translation are regarded as sufficient.

b. The concentration of reactant is high enough to be treated as constant.

c. Amount of DNA is constant.

d. Degradation coefficients and production coefficients of protein and mRNA are considered constant.

e. Repression is controlled by a Hill function and depends on the concentration of dCas9 and sgRNA complex.

Mathematical derivation

To simplify the model, we only consider sgRNA1 here, abbreviated as sgRNA in the following derivation. We list the required notations and their explanations as follows.

Notation	Explanation
[s_d]	Concentration of sgRNA's DNA
[s_r]	Concentration of sgRNA
[dc_d]	Concentration of dCas9's DNA
[dc_m]	Concentration of dCas9's mRNA
[dc_p]	Concentration of dCas9
ts	Transcription coefficient of sgRNA
td	Transcription coefficient of dCas9
ld	Translation coefficient of dCas9
ds	Degradation coefficient of sgRNA
dd_1	Degradation coefficient of dCas9's mRNA
dd_2	Degradation coefficient of dCas9

First, we use ODEs to describe the expression of sgRNA and dCas9.

$$ \frac{d[s_r]}{dt}=ts*[s_d]-ds*[s_r] \tag{15} $$

$$ \frac{d[dc_m]}{dt}=td*[dc_d]-dd_1*[dc_m] \tag{16} $$

$$ \frac{d[dc_p]}{dt}=ld*[dc_m]-dd_2*[dc_p] \tag{17} $$

dCas9 and sgRNA combine, and the complex binds to the promoter to exert a repressive effect. We set up differential equations to calculate the concentration of expressed GFP. We list the required notations and their explanations as follows.

Notation	Explanation
[com]	Concentration of complex (dCas9 & sgRNA)
[P_d]	Concentration of GFP's DNA
[P_m]	Concentration of GFP's mRNA
[P_p]	Concentration of GFP
k	Combination coefficient of complex
tp	Transcription coefficient of GFP
lp	Translation coefficient of GFP
dc	Dissociation coefficient of complex
dp_1	Degradation coefficient of GFP's mRNA
dp_2	Degradation coefficient of GFP

sgRNA and dCas9 combine, and the concentration of the complex changes over time as follows.

$$ \frac{d[com]}{dt}=k*[s_r]*[dc_p]-dc*[com] \tag{18} $$

To simplify our model while keeping it reasonable, we describe the inhibition with the Hill equation and set up the ODEs as follows. We denote the Hill coefficient by h and obtain its value and the values of certain parameters from the 2013 UCSF Team. The expression rate of GFP under repression by the complex is as follows.

$$ \frac{d[P_m]}{dt}=[P_d]*tp*\left(1-\frac{[com]^h}{dc^h+[com]^h}\right)-dp_1*[P_m] \tag{19} $$

$$ \frac{d[P_p]}{dt}=lp*[P_m]-dp_2*[P_p] \tag{20} $$
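To see how equations (15) to (20) behave together, the sketch below integrates them in plain Python with a fourth-order Runge-Kutta step and compares GFP with and without expression of dCas9 and sgRNA. It is an illustration, not the SimBiology model: every value in `BASE` is a hypothetical placeholder (not taken from the 2013 UCSF Team), and the exponent on the threshold in the Hill term is read as the same Hill coefficient h.

```python
def crispri_rhs(y, p):
    """Right-hand sides of equations (15)-(20)."""
    s_r, dc_m, dc_p, com, P_m, P_p = y
    hill = com ** p["h"] / (p["dc"] ** p["h"] + com ** p["h"])
    return [
        p["ts"] * p["s_d"] - p["ds"] * s_r,                    # (15) sgRNA
        p["td"] * p["dc_d"] - p["dd_1"] * dc_m,                # (16) dCas9 mRNA
        p["ld"] * dc_m - p["dd_2"] * dc_p,                     # (17) dCas9 protein
        p["k"] * s_r * dc_p - p["dc"] * com,                   # (18) complex
        p["P_d"] * p["tp"] * (1.0 - hill) - p["dp_1"] * P_m,   # (19) GFP mRNA
        p["lp"] * P_m - p["dp_2"] * P_p,                       # (20) GFP protein
    ]


def integrate(rhs, y0, p, t_end=200.0, dt=0.05):
    """Fixed-step RK4 integration; returns the state at t_end."""
    y = list(y0)
    for _ in range(int(round(t_end / dt))):
        k1 = rhs(y, p)
        k2 = rhs([a + 0.5 * dt * b for a, b in zip(y, k1)], p)
        k3 = rhs([a + 0.5 * dt * b for a, b in zip(y, k2)], p)
        k4 = rhs([a + dt * b for a, b in zip(y, k3)], p)
        y = [a + dt * (b + 2 * c + 2 * d + e) / 6.0
             for a, b, c, d, e in zip(y, k1, k2, k3, k4)]
    return y


# Hypothetical parameter values, for illustration only.
BASE = dict(ts=1.0, ds=0.5, td=0.5, ld=1.0, dd_1=0.5, dd_2=0.1,
            k=0.1, dc=0.5, h=2, P_d=1.0, tp=1.0, lp=1.0, dp_1=0.5, dp_2=0.1)

Y0 = [0.0] * 6
gfp_without_crispri = integrate(crispri_rhs, Y0, dict(BASE, s_d=0.0, dc_d=0.0))[-1]
gfp_with_crispri = integrate(crispri_rhs, Y0, dict(BASE, s_d=1.0, dc_d=1.0))[-1]
print(gfp_without_crispri, gfp_with_crispri)
```

Setting the sgRNA and dCas9 DNA inputs to zero switches the repression term off, which gives the unrepressed reference curve against which the CRISPRi case is compared.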

Results

We use SimBiology to solve the ODEs and simulate the production of GFP. We plot the GFP curves under two conditions, with and without expression of dCas9 and sgRNA (Figures 22 and 23).

Figure 22: [GFP]-t relationship without CRISPRi

Figure 23: Concentration-t relationship with CRISPRi

With expression of dCas9 and sgRNA, the amount of GFP is significantly reduced, and the amounts of the three substances then reach equilibrium. To confirm the generality of the inhibitory effect, we also vary the concentration of GFP's DNA and calculate the concentration of GFP under CRISPRi. As expected, the concentration of GFP decreases regardless of how much DNA we enter (Figure 24). We conclude that the complex of dCas9 and sgRNA inhibits the expression of GFP; that is, CRISPRi behaves as expected in this part and implements the function of the NOT gate in our circuit.

Figure 24: Relationship between GFP and its DNA with CRISPRi

Estimation of Time

We estimate the time consumed by the entire biological logic circuit to compare the result with the wet lab and to propose suggestions for reducing wasted time. We build models for two processes: the diffusion of molecules and the Logic AND Gate. For the Register&Patch section, too many factors make it difficult to calculate the time accurately, so this complex part requires further research in the future.

Diffusion of inducers

We calculate how long it takes for the inducers to penetrate the cell membrane and enter the cell. We consider a unit area of cell membrane (1 square micron) and take a cube on each side of it (inside and outside the cell), each with a volume of one cubic micron, named cube A and cube B. We take the thickness of the cell membrane to be seven nanometers. Our estimation model is built on this unit geometry. By calculating the diffusion time through the cubes, we can roughly estimate the total time by linear superposition. By matching with the wet lab, we obtain an average inducer concentration of about 6E-7 mol/L. The diffusion coefficient of molecules inside and outside the cell (in cube A and cube B) is set to 1E-9, while the one in the cell membrane is set to 1E-12. Before estimation, we state some assumptions as follows.

a. The flux at the upper bound of cube A is considered to be a constant value.

b. The lower bound of cube B is considered an open border.

c. The left and right boundaries are regarded as flux-free because the fluxes across them can be seen as canceling each other out.

d. The diffusion process is uniform and there will be no sudden increase in concentration.

e. The process of molecular diffusion is uniform and continuous.

f. The process of molecules passing through the cell membrane is considered free diffusion.

COMSOL is used to simulate the diffusion process.

Figure 25: Diffusion

The process takes about 0.01 s. The volume of the system is at the microliter level. We therefore conclude that the total diffusion time of the inducers can be ignored.
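An order-of-magnitude cross-check of this conclusion is possible with the characteristic one-dimensional diffusion time t = L^2 / (2D), summed over the three layers by linear superposition. The sketch below is a back-of-envelope estimate, not a reproduction of the COMSOL model, and it assumes that the diffusion coefficients quoted above are in m^2/s, which the text does not state.

```python
def diffusion_time(length_m, diff_coeff_m2_s):
    """Characteristic 1D diffusion time t = L^2 / (2 D), in seconds."""
    return length_m ** 2 / (2.0 * diff_coeff_m2_s)


CUBE_EDGE = 1e-6      # edge of cube A and cube B: 1 micron
MEMBRANE = 7e-9       # membrane thickness: 7 nanometers
D_SOLUTION = 1e-9     # diffusion coefficient in cubes A and B (assumed m^2/s)
D_MEMBRANE = 1e-12    # diffusion coefficient in the membrane (assumed m^2/s)

t_cube = diffusion_time(CUBE_EDGE, D_SOLUTION)
t_membrane = diffusion_time(MEMBRANE, D_MEMBRANE)
t_total = 2 * t_cube + t_membrane   # cube A + membrane + cube B

print(f"cube: {t_cube:.2e} s, membrane: {t_membrane:.2e} s, total: {t_total:.2e} s")
```

Under these assumptions the total is on the order of a millisecond. It does not match the simulated value exactly, since the boundary conditions differ, but it supports the same conclusion that diffusion is negligible next to the other steps.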

Output

The processes involved are the expression of LuxI and LuxR, the formation of the complex of LuxR's protein and AHL, and the expression of GFP. To simplify our estimation, we follow the assumptions stated in the Logic AND Gate part for the productivity predictions. In particular, the rates of the expression processes involved are regarded as linear in our assumptions. We enter the equations and parameter values from the Logic AND Gate part into SimBiology and estimate the time taken for output. The graph predicted by SimBiology is shown below (Figure 26).

Figure 26: Diffusion

According to the value of the dependent variable in the graph, we estimate that the process takes almost 60 seconds.

Conclusion

By matching with the wet lab, we know that the complete experiment takes several hours. Based on our time estimates, we speculate that Register&Patch is the most time-consuming part. We hope that further research will yield an approximation algorithm whose error factor is bounded by a constant, so that we can predict this time and find ways of reducing wasted time to optimize our biological circuits.