Abstract

Figure 1: Abstract

Description

Our model consists of three parts.

The first part predicts productivity and efficiency. On the one hand, we use the RBS-Calculator to check that the constant promoters function properly, and we explore the relationship between the promoter-to-start-codon distance and expression efficiency, which gives us research ideas for future efficiency improvements. On the other hand, we carry out a mathematical derivation based on our assumptions and build a control system model to simulate and calculate the numerical relationship between GFP output and the various input quantities. Using parameter values from the 2019 iBowu-China Team, we verify the function of the logic AND gate in our project with Simbiology and Simulink.

The second part provides evidence of inhibition by CRISPRi. Based on the Hill coefficient and some parameter values from the 2013 UCSF Team's model, we build a model with Simbiology and confirm the inhibitory effect of CRISPRi on GFP productivity, in support of the wet lab.

The third part estimates the time consumed by the Boolean logic circuit system. With Comsol, we calculate the time taken by inducer diffusion, which is small enough to be ignored. Moreover, we estimate that the Output process takes about 60 seconds. Comparing with the wet lab results, we conclude that the Register&Patch process accounts for most of the time. In future studies we hope to find an approximation algorithm whose error is bounded by a constant factor, so that we can predict this time and find ways to reduce wasted time.

Predictions of Productivity and Efficiency

Register&Patch

Because of the intricate circuit design and the various functions of our logic circuit, our wet lab conducted preliminary research and experiments to verify the feasibility of the circuit. Specifically, the maximum distance between the promoter and the start codon in our design reaches 1000 base pairs. Fortunately, our Lantern system works. Building on this success, we are investigating the productivity of our circuits and identifying factors that enhance expression efficiency. By comparing with the wet lab and referring to experimental results, we infer that the distance between the promoter and the start codon is one of the important influencing factors (this hypothesis is elaborated in the following section). Therefore, this part is divided into two sections:

a. Ensuring Promoter Functionality: We use the RBS-Calculator to verify that our constant promoters operate correctly in the absence of inducers.

b. Exploring the Promoter-Start Codon Distance: We aim to determine the relationship between the distance from the promoter to the start codon and the resulting expression efficiency.

RBS-Calculator

To check that the constant promoters in Register&Patch function properly and initiate significant transcription, we use the RBS-Calculator to make predictions.

What is the RBS Calculator?

The Ribosome Binding Site (RBS) Calculator is an advanced algorithm designed to predict and regulate translation initiation and protein expression in bacteria over a wide range. Additionally, it can provide insights into transcription intensity.

In Predict mode, the RBS Calculator identifies the translation initiation rate for each start codon in an mRNA transcript.

In Design mode, it generates an optimized synthetic RBS sequence to achieve a targeted translation initiation rate for a given protein coding sequence. The RBS Calculator also allows for design constraints, such as adding restriction sites or a constant upstream sequence, to further customize the synthetic RBS sequence.

Application of the RBS Calculator in our work

To predict promoter activity, we input the gene sequence into the promoter calculator and retrieve the resulting data graph (Figure 3):

Figure 2: Pathway diagram of the long-expression promoter

 

The outcome derived from the RBS calculator
Figure 3: TSS nucleotide position

Through the application of the RBS-Calculator, we can assess the transcriptional intensity of the sequence under standard conditions. Notably, two distinct convex peaks were observed, corresponding to positions with high expression levels, as predicted by both site data and the principles of the calculator. From these observations, we draw two key conclusions:

a. This element typically exhibits strong expression intensity, highlighting its critical role within the overall pathway.

b. When compared to the surrounding sites, the large disparity in intensity suggests that our design remains unaffected, aside from these two prominent expression peaks.

Length between the promoter and the start codon

The following result comes from the wet lab. Figure 4 shows how fluorescence intensity changes over time under four different conditions. If we assume that the distance between the inducible promoter and the start codon does not influence transcription, differences in the fluorescence curves under IPTG and Rha should primarily reflect the differences in the strengths of the two promoters. However, the data reveal a larger discrepancy than expected. Thus, it is reasonable to hypothesize that the length between the promoter and the start codon does have an impact on expression. To explore this further, we are conducting experiments to identify the influencing factors.

Figure 4: Fluorescence changing over time in four conditions

We are designing a series of plasmids, with varying promoter-start codon distances, ranging from short to long. This will allow us to collect enough data to generate a dataset that can be used to fit curves and develop a robust mathematical model describing the relationship between this distance and expression efficiency. Once this model is complete, we will be able to use it to predict the optimal distance range between the promoter and the start codon for our circuit design. The range will ensure that the recombinases function properly within our circuit while maintaining strong expression efficiency. With these insights, we will be able to optimize our Register&Patch section. Furthermore, the mature model can be applied to optimize similar circuits in future projects.

Logic AND Gate

To illustrate and verify the function of the logic AND gate, we build a model that simulates how the registers act on the output.

Assumptions

Without loss of generality and rationality, we state some assumptions as prerequisites.

a. The enzymes used in transcription and translation, which are not mentioned below, are regarded as sufficient.

b. Raw materials of AHL are adequate.

c. Amount of DNA is constant.

d. RNAs do not form complementary base pairs with one another.

e. The basal transcription rate of GFP is small enough to be treated as zero; that is, we only calculate the transcription rate of GFP after it is activated.

f. We follow the control variable principle when obtaining results (see the Results section for details).

g. Degradation coefficients and production coefficients of protein and mRNA are considered constant.

Mathematical derivation

In this part, we show our mathematical derivation used in getting results.

We divide the biological process into three parts and make mathematical deductions:

a. Expression of LuxR.

b. Generation of AHL.

c. Generation of the complex and production of GFP.

a. Expression of LuxR

The process of expression consists of transcription and translation.

We list the required notations and their explanations as follows.

Notation Explanation
[LR_d] Amount of LuxR's DNA
[LR_m] Amount of LuxR's mRNA
[LR_p] Amount of LuxR's protein
tcR Transcription coefficient of LuxR
tlR Translation coefficient of LuxR
dR_m Degradation coefficient of LuxR's mRNA
dR_p Degradation coefficient of LuxR's protein

Transcription and translation:

Figure 5: Expression of LuxR

 

$$ \frac{d[LR_m]}{dt}=tcR*[LR_d]-dR_m*[LR_m] \tag{1} $$
$$ \frac{d[LR_p]}{dt}=tlR*[LR_m]-dR_p*[LR_p] \tag{2} $$
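
As a rough numerical check of equations (1) and (2), the following Python sketch integrates them with a simple forward-Euler scheme and compares the result with the steady state obtained by setting both derivatives to zero. The parameter values are illustrative placeholders, not fitted values, and the function names are hypothetical.

```python
def simulate_luxr(lr_d, tcR, tlR, dR_m, dR_p, t_end=200.0, dt=0.01):
    """Forward-Euler integration of equations (1) and (2) from zero initial amounts."""
    lr_m, lr_p = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        d_lr_m = tcR * lr_d - dR_m * lr_m   # equation (1)
        d_lr_p = tlR * lr_m - dR_p * lr_p   # equation (2)
        lr_m += dt * d_lr_m
        lr_p += dt * d_lr_p
    return lr_m, lr_p


def steady_state_luxr(lr_d, tcR, tlR, dR_m, dR_p):
    """Steady state found by setting both derivatives to zero."""
    lr_m = tcR * lr_d / dR_m
    lr_p = tlR * lr_m / dR_p
    return lr_m, lr_p


# Placeholder values chosen only for illustration (assumption).
params = dict(lr_d=1.0, tcR=0.5, tlR=2.0, dR_m=0.2, dR_p=0.1)

sim_m, sim_p = simulate_luxr(**params)
ss_m, ss_p = steady_state_luxr(**params)
print("simulated:", sim_m, sim_p)
print("analytic :", ss_m, ss_p)
```

For a sufficiently long run, the simulated amounts should approach [LR_m] = tcR*[LR_d]/dR_m and [LR_p] = tlR*[LR_m]/dR_p.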
b. Generation of AHL

The process consists of two parts: expression of LuxI and an enzymatic reaction. Once LuxI is produced, it acts as an enzyme and reacts with raw materials to produce AHL. We list the required notations and their explanations as follows.

Notation Explanation
[LI_d] Amount of LuxI's DNA
[LI_m] Amount of LuxI's mRNA
[LI_p] Amount of LuxI's protein
[AHL] Amount of AHL
[raw] Amount of raw materials that produce AHL
tcI Transcription coefficient of LuxI
tLI Translation coefficient of LuxI
dI_m Degradation coefficient of LuxI's mRNA
dI_p Degradation coefficient of LuxI's protein
d_A Degradation coefficient of AHL
c_1 Combination coefficient of LuxI's protein and AHL
k_A Production coefficient of AHL

The expression of LuxI is similar to LuxR, so we directly obtain the following differential equation.

$$ \frac{d[LI_m]}{dt}=tcI*[LI_d]-dI_m*[LI_m] \tag{3} $$
$$ \frac{d[LI_p]}{dt}=tLI*[LI_m]-dI_p*[LI_p] \tag{4} $$

To characterize the enzymatic reaction, we use the Michaelis-Menten equation. The production rate of AHL is as follows.

$$ \frac{d[AHL]}{dt}=\frac{k_A*c_1*[LI_p]*[raw]}{1+c_1*[raw]}-d_A*[AHL] \tag{5} $$

The amount of raw materials is adequate. So to simplify the mathematical derivation, the first term on the right side of the formula can be changed as follows:

$$ \lim_{[raw] \to +\infty}\frac{k_A*c_1*[LI_p]*[raw]}{1+c_1*[raw]}=k_A*[LI_p] $$

Equation (5) therefore becomes equation (6):

$$ \frac{d[AHL]}{dt}=k_A*[LI_p]-d_A*[AHL] \tag{6} $$
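
To illustrate why the saturated form is a reasonable simplification, the sketch below evaluates the production term of equation (5) for increasing amounts of raw material and compares it with the limit k_A*[LI_p] used in equation (6). The numerical values of k_A, c_1 and [LI_p] are hypothetical and serve only to show the trend.

```python
def ahl_production_rate(li_p, raw, k_A, c_1):
    """Production term of equation (5) (Michaelis-Menten form)."""
    return k_A * c_1 * li_p * raw / (1.0 + c_1 * raw)


def ahl_production_rate_saturated(li_p, k_A):
    """Production term of equation (6), the limit for abundant raw material."""
    return k_A * li_p


# Hypothetical values for illustration only (assumption).
k_A, c_1, li_p = 0.8, 0.05, 10.0

limit = ahl_production_rate_saturated(li_p, k_A)
ratios = []
for raw in (1.0, 1e2, 1e4, 1e6):
    ratio = ahl_production_rate(li_p, raw, k_A, c_1) / limit
    ratios.append(ratio)
    print(f"raw = {raw:g}: fraction of saturated rate = {ratio:.5f}")
```

The fraction rises towards 1 as [raw] grows, so the simplification only holds when c_1*[raw] is much larger than 1.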
c. Generation of the complex and production of GFP

This is the main part of our logic AND gate. When both LuxR's protein and AHL are present, the logic gate opens and GFP's protein is produced. The proteins emit fluorescence, which represents the output. From a quantitative perspective, LuxR's protein and AHL combine in a one-to-one ratio, and the resulting complex then binds the GFP gene in a ratio of two to one.

Figure 6: LuxR and AHL induce the production of GFP

 

We list the required notations and their explanations as follows.

Notation Explanation
[AHL] Amount of AHL
[LR_p] Amount of LuxR's protein
[GFP_d] Amount of GFP's DNA
[GFP_m] Amount of GFP's mRNA
[GFP_p] Amount of GFP
tcP Transcription coefficient of GFP
tlP Translation coefficient of GFP
dP_m Degradation coefficient of GFP's mRNA
dP_p Degradation coefficient of GFP
c_2 Combination coefficient of LuxR's protein, AHL and GFP

To describe the generation of the LuxR-AHL-GFP complex mathematically, we use the Hill equation. Note that the Hill coefficient is 2. To simplify the calculation, we merge the two processes, generation of the complex and production of GFP, into two differential equations.

$$ \frac{d[GFP_m]}{dt}=\frac{tcP*[GFP_d]*(c_2*[LR_p]*[AHL])^2}{1+(c_2*[LR_p]*[AHL])^2}-dP_m*[GFP_m] \tag{7} $$
$$ \frac{d[GFP_p]}{dt}=tlP*[GFP_m]-dP_p*[GFP_p] \tag{8} $$
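
The AND-like behaviour of the Hill term in equation (7) can be seen in a small sketch. It takes the activator to be the product of LuxR's protein and AHL, as described in the paragraph above, with Hill coefficient 2. The values of tcP, c_2 and the input levels are hypothetical.

```python
def gfp_transcription_rate(lr_p, ahl, gfp_d, tcP, c_2, n=2):
    """Production term of equation (7): Hill activation by the LuxR-AHL product."""
    x = (c_2 * lr_p * ahl) ** n
    return tcP * gfp_d * x / (1.0 + x)


# Hypothetical values for illustration only (assumption).
tcP, c_2, gfp_d = 1.0, 1.0, 1.0
HIGH, LOW = 10.0, 0.0

truth_table = {}
for lr_p in (LOW, HIGH):
    for ahl in (LOW, HIGH):
        truth_table[(lr_p, ahl)] = gfp_transcription_rate(lr_p, ahl, gfp_d, tcP, c_2)

for (lr_p, ahl), rate in truth_table.items():
    print(f"LuxR = {lr_p:4.1f}, AHL = {ahl:4.1f} -> transcription rate = {rate:.4f}")
```

Because the two inputs enter as a product, the rate is exactly zero whenever either input is absent and approaches tcP*[GFP_d] only when both are abundant.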
Calculation for simulation

To simulate the process and verify the role of the logic AND gate in the biological circuit, we build a control system to test the relationship between AHL&LuxR and GFP. By mathematical derivation, under the assumption of sufficient raw materials, the amounts of AHL and LuxI's protein are linearly related. To simplify the input setting, the amounts of LuxR's DNA and LuxI's DNA are taken as the inputs and GFP as the output of the control system. Based on the previous assumptions, we use the dominant biological reactions as conditions for the control system model. Applying control theory to the mathematical derivation, we obtain the following transfer functions.

From equations (1) and (2), we get the transfer function from the amount of LuxR's DNA to the amount of LuxR's protein. Let [LR_0] be the amount of LuxR's protein produced during expression. The other notations are as defined above.

Figure 7: LuxR input

 

$$ G=\frac{C(s)}{R(s)}=\frac{[LR_0]}{[LR_d]}=\frac{tcR*tlR}{(s+dR_m)(s+dR_p)} \tag{9} $$
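
For completeness, equation (9) follows from equations (1) and (2) by a Laplace transform. This short derivation assumes zero initial amounts of mRNA and protein (an assumption of the sketch) and writes the produced protein as [LR_0]:

$$ s[LR_m](s)=tcR*[LR_d](s)-dR_m*[LR_m](s) \;\Rightarrow\; [LR_m](s)=\frac{tcR}{s+dR_m}[LR_d](s) $$
$$ s[LR_0](s)=tlR*[LR_m](s)-dR_p*[LR_0](s) \;\Rightarrow\; [LR_0](s)=\frac{tlR}{s+dR_p}[LR_m](s) $$

Multiplying the two factors gives the transfer function above. For a constant input, its value at s=0 is the steady-state gain:

$$ G(0)=\frac{tcR*tlR}{dR_m*dR_p} $$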

Considering the generation of the complex in the control system, we get the amount of free LuxR's protein as follows. Let [C] be the amount of complex of LuxR, AHL and GFP's DNA and [LR_p] be the amount of free LuxR's protein.

Figure 8: LuxR in the control system

 

$$ [LR_p]=[LR_0]-2[C] \tag{10} $$

From equations (3), (4) and (6), using the same method, we get the transfer function from the amount of LuxI's DNA to the amount of AHL. Let [AHL_0] be the amount of AHL produced without considering consumption and [AHL] be the amount of free AHL.

Figure 9: LuxI input

 

$$ \frac{[AHL_0]}{[LI_d]}=\frac{k_A*tcI*tLI}{(s+dI_m)(s+dI_p)} \tag{11} $$
Figure 10: LuxI input

 

$$ [AHL]=[AHL_0]-2[C] \tag{12} $$

From equation (7), we can easily get the transfer function (13) of GFP's mRNA as follows.

Figure 11: LuxR and AHL induce the production of GFP

 

$$ \frac{[GFP_m]}{[GFP_d]}=\frac{tcP*(c_2*[LR_p]*[AHL])^2}{1+(c_2*[LR_p]*[AHL])^2}*\frac{1}{s+dP_m} \tag{13} $$
Figure 12: GFP output

 

$$ \frac{[GFP_p]}{[GFP_m]}=\frac{tlP}{s+dP_p} \tag{14} $$

Referring to the data of the 2019 iBowu-China Team, we obtain values of parameters and make adjustments based on the specific conditions of our wet lab to fit the result. Substituting parameter values into the model, we get the results.

Results

To calculate the productivity of GFP's protein and predict the efficiency of the logic AND gate and the output part, we discuss several influential factors.

GFP-DNA

We use Simbiology to build the biological model and enter the differential equations above. We take initial parameter values and data from the 2019 iBowu-China Team and set roughly estimated initial quantities from our wet lab. By changing the amount of GFP's DNA, we test GFP production over time under different DNA conditions. The relationship can be read directly from Figure 13, which lets us predict the productivity of GFP for different amounts of DNA.

Figure 13: Relationship between GFP and its DNA

Here we hold the quantities of the other substances constant and sufficient, and let GFP's DNA be the independent variable and GFP's protein the dependent variable. Moreover, the magnitudes of both amounts match the experimental values. Within the preset interval, the production of protein increases monotonically with the amount of DNA, as expected.

GFP-register&patch

First, we use Simbiology to simulate the productivity of GFP with LuxR and AHL. We set initial parameter values and regular initial amounts, then plot the production of the three substances over time. Figure 14 shows that GFP production begins only after LuxR and AHL have accumulated to a certain level.

Figure 14: Productivity of three substances over time

 

We then use Simulink to simulate the control system.

Figure 15: The control system

In the control system, we assume that the input amounts of LuxI and LuxR represent the amounts of the two genes activated in Register&Patch. From transfer functions (9) and (11), the amount of AHL depends linearly on the amount of LuxI's DNA, and the amount of LuxR's protein depends linearly on the amount of LuxR's DNA. That is to say, we establish a complete relationship chain from LuxI's DNA and LuxR's DNA to the GFP output.

Varying the input amounts of LuxI and LuxR, we measure the output amount of GFP's protein and plot it as a 3-dimensional surface (Figure 16). The simulation result, built on biological principles and control theory, is consistent with our design purpose. When at least one of the two inputs is set to zero, the GFP output is zero. From a qualitative perspective, there is an output only when both inputs are present. From a quantitative perspective, GFP is positively related to both LuxI and LuxR. Note that when both inputs increase, the output productivity is significantly higher. This verifies the function of the logic AND gate in our circuit and predicts the productivity of GFP.

Figure 16: Relationship between GFP and LuxR&AHL
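
The qualitative AND behaviour described above can be reproduced with a steady-state sketch that chains the gains of transfer functions (9) and (11) at s=0 into the Hill term of equation (7). This is a simplification and not our Simulink model: it ignores the consumption terms in equations (10) and (12), and all parameter values are hypothetical placeholders.

```python
# Hypothetical parameter values for illustration only (assumption).
P = dict(
    tcR=1.0, tlR=2.0, dR_m=0.5, dR_p=0.2,            # LuxR expression
    tcI=1.0, tLI=2.0, dI_m=0.5, dI_p=0.2, k_A=0.1,   # LuxI expression and AHL production
    tcP=1.0, tlP=2.0, dP_m=0.5, dP_p=0.2,            # GFP expression
    c_2=0.05, gfp_d=1.0,
)


def steady_state_gfp(lr_d, li_d, p=P):
    """Steady-state GFP protein for constant amounts of LuxR's DNA and LuxI's DNA."""
    lr_0 = p["tcR"] * p["tlR"] * lr_d / (p["dR_m"] * p["dR_p"])                 # gain of (9) at s=0
    ahl_0 = p["k_A"] * p["tcI"] * p["tLI"] * li_d / (p["dI_m"] * p["dI_p"])     # gain of (11) at s=0
    x = (p["c_2"] * lr_0 * ahl_0) ** 2                                          # Hill coefficient 2
    gfp_m = p["tcP"] * p["gfp_d"] * x / (1.0 + x) / p["dP_m"]
    return p["tlP"] * gfp_m / p["dP_p"]


levels = [0.0, 0.5, 1.0, 2.0]
surface = [[steady_state_gfp(lr_d, li_d) for li_d in levels] for lr_d in levels]

for lr_d, row in zip(levels, surface):
    print(f"LuxR DNA = {lr_d}: " + ", ".join(f"{v:.3f}" for v in row))
```

Each row corresponds to one amount of LuxR's DNA and each column to one amount of LuxI's DNA; the first row and first column are zero, and the output grows when both inputs grow.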

 

Evidence of Inhibition of CRISPRi

Based on the mechanism of CRISPR interference, we establish reaction equations to demonstrate the inhibitory effect of dcas9 and sgRNA on the expression of GFP.

Figure 17: CRISPR interference

Assumptions

Without loss of generality and rationality, we state some assumptions as prerequisites.

a. The enzymes used in transcription and translation are regarded as sufficient.

b. The amount of materials needed here is sufficient.

c. Amount of DNA is constant.

d. Degradation coefficients and production coefficients of Protein and mRNA are considered as constant values.

e. Repression is controlled by a Hill function and depends on the concentration of dCas9 and sgRNA complex.

Mathematical derivation

To simplify our project, we only calculate sgRNA1 here, abbreviated as sgRNA in the following derivation. We list the required notations and their explanations as follows.

Notation Explanation
[s_d] Amount of sgRNA's DNA
[s_r] Amount of sgRNA
[dc_d] Amount of dcas9's DNA
[dc_m] Amount of dcas9's mRNA
[dc_p] Amount of dcas9
ts Transcription coefficient of sgRNA
td Transcription coefficient of dcas9
ld Translation coefficient of dcas9
ds Degradation coefficient of sgRNA
dd_1 Degradation coefficient of dcas9's mRNA
dd_2 Degradation coefficient of dcas9

First, we use ODEs to describe the expression of sgRNA and dcas9.

$$ \frac{d[s_r]}{dt}=ts*[s_d]-ds*[s_r] \tag{15} $$
$$ \frac{d[dc_m]}{dt}=td*[dc_d]-dd_1*[dc_m] \tag{16} $$
$$ \frac{d[dc_p]}{dt}=ld*[dc_m]-dd_2*[dc_p] \tag{17} $$

dcas9 and sgRNA combine and the complex binds to the promoter to exert a repressive effect. We set up differential equations to calculate the expression amount of GFP. We list the required notations and their explanations as follows.

Notation Explanation
[com] Amount of complex (dcas9 & sgRNA)
[P_d] Amount of GFP's DNA
[P_m] Amount of GFP's mRNA
[P_p] Amount of GFP
k Combination coefficient of complex
tp Transcription coefficient of GFP
lp Translation coefficient of GFP
dc Dissociation coefficient of complex
dp_1 Degradation coefficient of GFP's mRNA
dp_2 Degradation coefficient of GFP

sgRNA and dcas9 combine and we calculate the amount of the complex changing over time as follows.

$$ \frac{d[com]}{dt}=k*[s_r]*[dc_p]-dc*[com] \tag{18} $$

To simplify our model while still capturing the inhibition process, we use the Hill equation and set up the following ODEs. We set h as the Hill coefficient and take its value, along with the values of certain parameters, from the 2013 UCSF Team. The expression rate of GFP when the promoter is bound by the complex is as follows.

$$ \frac{d[P_m]}{dt}=[P_d]*tp*\left(1-\frac{[com]^h}{dc^h+[com]^h}\right)-dp_1*[P_m] \tag{19} $$
$$ \frac{d[P_p]}{dt}=lp*[P_m]-dp_2*[P_p] \tag{20} $$
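
As a rough illustration of equations (15) to (20), the Python sketch below integrates them with forward Euler and compares the final amount of GFP with and without expression of dcas9 and sgRNA. It is not our Simbiology model: every parameter value, including the Hill coefficient h, is a hypothetical placeholder, and the function name is our own.

```python
# Hypothetical parameter values for illustration only (assumption).
PAR = dict(ts=1.0, ds=0.5, td=1.0, dd_1=0.5, ld=1.0, dd_2=0.2,
           k=0.1, dc=0.5, tp=1.0, lp=1.0, dp_1=0.5, dp_2=0.2, h=2.0)


def simulate_crispri(s_d, dc_d, p_d, par=PAR, t_end=300.0, dt=0.01):
    """Forward-Euler integration of equations (15)-(20); returns the final amount of GFP."""
    s_r = dc_m = dc_p = com = p_m = p_p = 0.0
    for _ in range(int(t_end / dt)):
        # Hill repression term of equation (19)
        repression = com ** par["h"] / (par["dc"] ** par["h"] + com ** par["h"])
        d_s_r = par["ts"] * s_d - par["ds"] * s_r                        # (15)
        d_dc_m = par["td"] * dc_d - par["dd_1"] * dc_m                   # (16)
        d_dc_p = par["ld"] * dc_m - par["dd_2"] * dc_p                   # (17)
        d_com = par["k"] * s_r * dc_p - par["dc"] * com                  # (18)
        d_p_m = p_d * par["tp"] * (1.0 - repression) - par["dp_1"] * p_m  # (19)
        d_p_p = par["lp"] * p_m - par["dp_2"] * p_p                      # (20)
        s_r += dt * d_s_r
        dc_m += dt * d_dc_m
        dc_p += dt * d_dc_p
        com += dt * d_com
        p_m += dt * d_p_m
        p_p += dt * d_p_p
    return p_p


gfp_without = simulate_crispri(s_d=0.0, dc_d=0.0, p_d=1.0)
gfp_with = simulate_crispri(s_d=1.0, dc_d=1.0, p_d=1.0)
print("GFP without CRISPRi:", gfp_without)
print("GFP with CRISPRi   :", gfp_with)
```

With these placeholder values the repressed steady state is a small fraction of the unrepressed one; the size of the effect depends entirely on the chosen parameters.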

Results

We use Simbiology to solve the ODEs and simulate the production of GFP. We plot the GFP curves under two conditions, with and without expression of dcas9&sgRNA (Figures 18 and 19).

Figure 18: Productivity without CRISPRi

 

Figure 19: Productivity with CRISPRi

 

With dcas9 and sgRNA expressed, the amount of GFP is clearly reduced, and the amounts of the three substances then reach equilibrium. Moreover, to confirm the generality of the inhibitory effect, we vary the concentration of GFP's DNA and calculate the concentration of GFP under CRISPRi. As expected, the concentration of GFP decreases regardless of how much DNA we enter (Figure 20). We conclude that the combination of dcas9 and sgRNA inhibits the expression of GFP; that is, CRISPRi behaves as expected in this part and implements the function of the NOT gate in our circuit.

Figure 20: Relationship between GFP and its DNA with CRISPRi

 

Estimation of Time

We estimate the time consumed by the entire biological logic circuit in order to compare the result with the wet lab and to propose suggestions for reducing wasted time. We therefore build models for two processes: the diffusion of molecules and the Logic AND Gate. For Register&Patch, there are too many factors that cannot be ignored, so a simplified mathematical model cannot calculate the time accurately; this complex part requires extensive further research.

Diffusion of inducers

We estimate how long it takes for the inducers to penetrate the cell membrane and enter the cell. We consider a unit area (1 square micron) of the cell membrane and take one cube on each side of it, inside and outside the cell, each with a volume of one cubic micron; we name them cube A and cube B. We take the thickness of the cell membrane to be seven nanometers. Our estimation model is built on this unit geometry. By calculating the diffusion time through the cubes, we can roughly estimate the total time by linear superposition. From the wet lab, we obtain an average inducer concentration of about 6E-7 mol/L. The diffusion coefficient of the molecules inside and outside the cell (in cube A and cube B) is set to 1E-9, while that in the cell membrane is set to 1E-12. Before the estimation, we state the following assumptions.

a. The flux at the upper bound of cube A is considered to be a constant value.

b. The lower bound of cube B is considered an open border.

c. The left and right boundaries are regarded as flux-free because their contributions can be seen as canceling each other out.

d. The diffusion process is uniform and there will be no sudden increase in concentration.

e. The process of molecular diffusion is uniform and continuous.

f. The process of molecules passing through the cell membrane is considered free diffusion.

Comsol is used to simulate the diffusion process.

Figure 21: Diffusion of inducers

 

The simulated process takes about 0.01 s. The volume of the system is at the microliter level, so we roughly conclude that the total diffusion time of the inducers is negligible.
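
An order-of-magnitude cross-check of this conclusion is possible with the characteristic one-dimensional diffusion time t = L^2/(2D), summed over the three layers by linear superposition. This is a much cruder estimate than the Comsol simulation and is not expected to reproduce its value. It assumes the diffusion coefficients above are expressed in m^2/s, which is an assumption of this sketch rather than something stated in the text.

```python
def diffusion_time(length_m, diff_coeff_m2_s):
    """Characteristic 1-D diffusion time t = L^2 / (2 D), in seconds."""
    return length_m ** 2 / (2.0 * diff_coeff_m2_s)


# (name, thickness in m, diffusion coefficient); units of m^2/s are assumed.
layers = [
    ("cube A", 1e-6, 1e-9),
    ("membrane", 7e-9, 1e-12),
    ("cube B", 1e-6, 1e-9),
]

layer_times = {name: diffusion_time(length, d) for name, length, d in layers}
total_time = sum(layer_times.values())  # linear superposition of the layers

for name, t in layer_times.items():
    print(f"{name}: {t:.3e} s")
print(f"total: {total_time:.3e} s")
```

Under these assumptions the estimate stays far below one second, which supports treating the diffusion time as negligible compared with the hours taken by the full experiment.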

Output

The processes involved are the expression of LuxI and LuxR, the formation of the complex of LuxR's protein and AHL, and the expression of GFP. To simplify our estimation, we follow the assumptions stated in the Logic AND Gate part of the productivity predictions. In particular, the rates of the expression processes involved are treated as linear in our assumptions. We enter the mathematical derivation and the parameter values from the Logic AND Gate part into Simbiology and estimate the time taken for output. The graph produced by Simbiology is shown below (Figure 22).

Figure 22: Time estimation of output

According to the value of the dependent variable in the graph, we estimate that the process takes almost 60 seconds.

Conclusion

From the wet lab, we know that the complete experiment takes several hours. Based on our time estimates, we speculate that Register&Patch is the most time-consuming part. We hope that further research will yield an approximation algorithm whose error is bounded by a constant factor, so that we can predict this time appropriately and find ways to reduce wasted time and optimize our biological circuits.