Introduction

Our project, which designs MOVE as a fundamental improvement to RNA pesticides, has raised several questions that are worth investigating through in-silico methods. Due to the vast scope of our project, we needed to allocate lab time and material budgets to only the most critical experiments. Modeling has been essential in determining which experiments to prioritize. By running computer simulations, we can accelerate the engineering process without incurring additional financial costs. Furthermore, we aim to address concerns raised by Wet lab experiments and propose potential solutions. Our technology has a wide range of applications. Additionally, as long as surface displaied proteins are properly designed, any desired protein can be displayed. To ensure the widespread adoption of MOVE technology in society, it is essential to evaluate and promote its modularity, effectiveness, and productivity. In the Dry Lab, we focus on how modeling can significantly contribute to visualizing and improving these aspects. We are also paying attention to modularizing the model itself, making it easily accessible for future MOVE users.

Figure 1. Role of the model in our project
Figure 1. Role of the model in our project

Models Overview

RNA-mMV Model

In this model, “the production process of MOVE in E. coli”, “the retention process of MOVE in farm field environments”, and “the action of shRNA within the target organism” are primarily described using ordinary differential equations (ODEs). This model demonstrates the feasibility of MOVE production prior to Wet Lab experiments and proposes improvements to the medium conditions for more reliable production. Additionally, it evaluates the effectiveness of MOVE as a pesticide, visualizing its modularity by demonstrating its efficacy across various species.

Figure 2. Overview of RNA-mMV Model
Figure 2. Overview of RNA-mMV Model

Surface Presentation Model

In this model, we performed structural prediction and molecular docking of surface-displayed proteins to confirm that the surface-displayed proteins can function on the E. coli surface without being affected by the SpyCatcher/SpyTag conjugation. This simulation process is also available to MOVE users who wish to display other enzymes on MOVE.

Figure 3. Overview of Surface Presentation Model
Figure 3. Overview of Surface Presentation Model

Continuous Cascade Bioreactor Model

In this model, we performed Metabolic Control Analysis (MCA) and a parameter scan to optimize the system when cultivating MOVE in a continuous cascade bioreactor. MOVE is a highly versatile technology that can be applied in various fields, and optimizing its production is essential. This model demonstrates the potential for efficient, low-cost, large-scale production of MOVE, which is necessary for its societal implementation. The results obtained here will be used in the economic model to demonstrate the potential for business success.

Figure 4. Overview of Continuous Cascade Bioreactor Model
Figure 4. Overview of Continuous Cascade Bioreactor Model

Abstract

We divided the process from MOVE creation to its use as a pesticide and the effects of RNAi into six blocks and developed models for each section to run simulations. Each model is primarily described using ODEs (Ordinary Differential Equations). Particularly in the m-MV production model, we addressed the concern of whether shRNA would be encapsulated in m-MVs, predicted its feasibility, and provided suggestions for protocol improvements. Furthermore, we evaluated the effectiveness of using MOVE as a pesticide and determined the appropriate application amount for effective use. These results will be utilized in the economic model and serve as a basis for entrepreneurship business success.

Introduction

Due to the vast scope of our project, we needed to allocate laboratory time and material budgets exclusively to the most critical experiments. Modeling helps in determining which experiments those are. By running computer simulations, we can accelerate the engineering process without incurring additional financial costs. In this model, we simulate whether the transformed E. coli for MOVE production can successfully produce MOVE, and whether the produced MOVE can effectively exhibit its therapeutic effects. We conducted the simulations following the flow shown in the diagram below. In particular, whether a sufficient amount of shRNA is encapsulated within the multi layered membrane vesicle(m-MV) can be estimated by the model, and this was a crucial factor for the completion of MOVE. Additionally, this model aims to create a modular model that can be applied to multiple organisms by simulating the effects of MOVE in several types of organisms.

Figure 1: workflow of m-MV Model
Figure 1: workflow of m-MV Model

shRNA Production

Purpose

shRNA is the component that demonstrates the pesticide effect of MOVE and plays the leading role in this grand system. Therefore, modeling the production concentration of shRNA is crucial in evaluating the overall design and feasibility of MOVE. However, since the wet lab experimental data does not provide sufficient information regarding the parameters, we ran simulations and applied them to the shRNA encapsulation model of the m-MV.

Methods and Modeling

For the production of shRNA, we used the pBluescript vector with the pUC origin of replication, and employed the T7 promoter for transcription. When RNA polymerase (RNAP) binds to the T7 promoter on pBluescript, transcription is initiated, first producing single-stranded RNA (ssRNA) before it adopts a secondary structure. In this study, we utilized a simplified model of this process. In this model, we assume that transcription is switched on and off based on the binding and dissociation of the T7 promoter and RNAP. Furthermore, the produced ssRNA, due to the properties of its nucleotide sequence, adopts a secondary structure after a certain period of time, transforming into short hairpin RNA (shRNA). Based on these assumptions, we can formulate the following differential equations.

d[ssRNA]dt=Tr~[Promoter][RNAP]Km+[RNAP]kdeg[ssRNA](1)\frac{d[\text{ssRNA}]}{dt} = \tilde{Tr}\frac{[\text{Promoter}][\text{RNAP}]}{K_m+[\text{RNAP}]} - k_{deg}[\text{ssRNA}] \qquad(1)

This part of the equation ( equation(1)) describes the process leading up to the production of ssRNA within E. coli. The transcription of ssRNA depends on the concentrations of the promoter and RNAP, and is determined by the transcription rate constant (TrTr). Additionally, ssRNA can be degraded by RNase, and this degradation is governed by the degradation rate (KdegK_{deg}).

d[shRNA]dt=kfold[ssRNA]kdeg2[shRNA](2)\frac{d[\text{shRNA}]}{dt} = k_{fold}[\text{ssRNA}]- k_{deg2}[\text{shRNA}] \qquad (2)

This equation (ezuation(2)) represents the production concentration of shRNA. It depends on the folding rate of ssRNA and the degradation rate of shRNA, expressed as (kfoldk_{fold}) and (kdeg2k_{deg2}), respectively.

Here, Tr~\tilde{Tr} and KmK_m are calculated as follows. KmK_m is the Michaelis constant, which is derived from the dissociation rate constant of RNAP (koffk_{off}) and the binding rate constant of RNAP (konk_{on}) based on the assumption that Km=koff/konK_m = k_{off}/k_{on}. However, the konk_{on} value provided in Kim’s paper is obtained in a cell-free system, and it is influenced by differences in diffusion coefficients within a bacterial cell[2] . To account for it, we used the correction factor ε, which represents the ratio of diffusion coefficients between the cell-free system and the intracellular environment. The ssRNA production rate constant Tr~\tilde{Tr} is calculated by dividing the transcription rate by the number of nucleotides in the ssRNA, giving the number of ssRNA molecules produced per second.

Km=koff/εkon(3)K_{m} = k_{off}/εk_{on} \qquad (3) Tr~=Tr(basecount)(4)\tilde{Tr} = \frac{\text{Tr}}{(base count)} \qquad (4)
Table1 Parameters of the shRNA Production Model
parameterDescriptionvalueunitsource
TrTranscription rate constant43nt*s^-1[1]
konk_{on}Binding rate constant of RNAP1.6e6M^-1*s^-1[2]
koffk_{off}Divergence rate constant for RNAP2.9s^-1[1]
ϵ\epsilonDivergence rate constant for RNAP2.9s^-1[2]
kdegk_{deg}ssRNA degradation constant0.0013s^-1[3]
kfoldk_{fold}ssRNA folding constants0.5s^-1[4]
kdeg2k_{deg2}Degradation constants for shRNA0.0s^-1
(basecount)(base {count})Number of bases of shRNA81ntdepending on the experiment
[Promoter]0[Promoter]_{0}Promoter concentration in E. coli3.3e-7Mestimated from ori
[RNAP]0[RNAP]_{0}RNAP concentration in E. coli1e-6M[5]

※ The kdeg2k_{deg2} is set to 0 because One Shot™ BL21 Star™ (DE3) Chemically Competent E. coli was used as the chassis, which has enhanced resistance to shRNA degradation and mRNA stability.

Result

Figure 2. Concentration of ssRNA (represented by blue line) and shRNA (represented by yellow line) in E. coli
Figure 2. Concentration of ssRNA (represented by blue line) and shRNA (represented by yellow line) in E. coli
Figure 2. Concentration of ssRNA (represented by blue line) and shRNA (represented by yellow line) in E. coli

As shown in Figure 2, single-stranded RNA reaches a steady state shortly after transcription initiation, and the subsequent concentration change of shRNA within E. coli proceeds linearly. From this, it can be concluded that ssRNA is rapidly converted into shRNA within E. coli, and the concentration of shRNA changes at an approximate rate of 1.21.2 µM per hour. In terms of the number of molecules, shRNA is produced at a rate of 7.0×1027.0 × 10^2 molecules per hour per E. coli cell.

m-MV production

Purpose

m-MV serves as a protective membrane to enhance the durability of shRNA and is a fundamental part of our project. Therefore, modeling the production of m-MV is crucial in evaluating the overall design and feasibility of MOVE. However, due to time and cost constraints, the wet lab experimental data does not provide sufficient information, so we conducted simulations and applied them to the shRNA encapsulation model of the m-MV.

Methods and Modeling

This model mathematically describes the production of poly(3-hydroxybutyrate)(PHB) in genetically engineered E. coli strains and simulates the production of m-MV. The model is based on PHB production in an E. coli strain into which the PHA synthase gene (phaCRe\text{phaC}_{\text{Re}} from R. eutropha), the 3-ketothiolase gene (phaARe\text{phaA}_{\text{Re}} from R. eutropha), and the acetoacetyl-CoA reductase gene (phaBRe\text{phaB}_{\text{Re}} from R. eutropha) have been introduced through genetic engineering. The model is constructed using the results of flux analysis related to PHB production and biomass formation in Cupriavidus necator DSM 545[7] , the original source of these genes, and it comprehensively analyzes the sequence of processes leading to MV formation due to the internal pressure of PHB.

Preliminary analysis of experimental data

Estimation of PHB Production

Wet lab data will be analyzed to identify trends and special features that should be considered in the model. This step is based on the analysis of Markan Lopar’s paper to obtain the data needed for the analysis in Dry and to derive the relationship between media conditions and PHB and m-MV production.

In Markan Lopar’s paper, four different medium conditions are considered, and we will also conduct our analysis based on these conditions. These medium conditions mainly vary in the amount of nitrogen. The glucose consumption, nitrogen consumption, biomass production, and PHB production for each of the four medium conditions can be examined using the differential equations derived from the metabolic flux.

dGextdt=r1MwGBIOρ(5)\frac{dG_{\text{ext}}}{dt} = -r_1 \frac{MwG \cdot \text{BIO}}{\rho} \qquad (5) dNH3,extdt=r41MwNH3BIOρ(6)\frac{d\text{NH}_{3,ext}}{dt} = -r_{41} \frac{MwNH_3 \cdot \text{BIO}}{\rho} \qquad (6) dBIOdt=r40MwBIOBIOρ=μBIO(7)\frac{dBIO}{dt} = r_{40} \frac{MwBIO \cdot \text{BIO}}{\rho} = \mu \cdot \text{BIO} \qquad (7) dPHBextdt=(r37r38)MwPHBBIOρ(8)\frac{dPHB_{\text{ext}}}{dt} = (r_{37} - r_{38}) \frac{MwPHB \cdot \text{BIO}}{\rho} \qquad (8)

The following table shows the medium conditions and the corresponding parameters based on metabolic flux analysis. Using these values, we will simulate the steady-state PHB (wt%) concentration under each medium condition.

Gext is the external glucose concentration. MwG, MwNH_3, MwBIO, and MwPHB represent the molar molecular weights of each molecule. BIO represents the biomass in g/L.

Table2 parameters based on metabolic flux analysis[7]
parameterBatchR1R2R3
r_11111
r_370.19980.028040.79521
r_380000
r_400.55760.75430.12050
r_410.49180.66530.10630

Estimating m-MV production

The m-MV production concentration was estimated based on the results from Koh’s paper published in 2023[6] . In this paper, the property of GFP being encapsulated only in m-MVs was utilized to measure the production of m-MVs from the GFP intensity of the produced MVs.In this study, GFP intensity was measured, and what is known is the relative concentration in PHB (wt%), but the production concentration of m-MV is not clear. Since our prediction required the production rate of m-MV, we used the data from this previous research to calculate the absolute concentration of m-MV. The calculation method involved computing the concentration of m-MV from the number of m-MVs in Figure 3, using that as the absolute concentration, and then determining the m-MV production concentration relative to PHB (wt%) through third-order polynomial fitting.

Figure 3a. An illustrative representation intended to depict the occurance of GFP encupsulation. The data are shown as the mean ± standard deviation from three independent experiments[6] .
Figure 3b. Fluorescence microscopic image of purified MVs encapsulating GFP. Scale bars, 5 μm[6] .
Figure 3a. An illustrative representation intended to depict the occurance of GFP encupsulation. The data are shown as the mean ± standard deviation from three independent experiments[6]. Figure 3b. Fluorescence microscopic image of purified MVs encapsulating GFP. Scale bars, 5 μm[6].
Figure 3a. An illustrative representation intended to depict the occurance of GFP encupsulation. The data are shown as the mean ± standard deviation from three independent experiments[6]. Figure 3b. Fluorescence microscopic image of purified MVs encapsulating GFP. Scale bars, 5 μm[6].

Result

Estimation of PHB Production

Table3 Simulation results for steady-state values of PHB(wt%) concentration
ReacterR1R2R3
PHB(wt%)25%3.3%86%

As a result of calculations for each medium condition, the steady-state PHB (wt%) concentration was found to be as shown in Table 3. Since m-MV production does not occur unless the PHB concentration is at least 60 (wt%), it was revealed that medium conditions such as R2 are required for m-MV production. This indicates that the transformant transitions between a state where growth is the primary focus and a state where PHB and m-MV production are predominant, depending on the nitrogen levels in the medium. This is highly valuable information for the efficient production of m-MV and is a significant outcome of the in-silico analysis.

Estimating m-MV production

Figure 4. Polynomial fitting results for m-MV production
Figure 4. Polynomial fitting results for m-MV production

Based on Koh’s paper, a third-order polynomial fitting of m-MV production yielded the following equation. x represents the 10210^{-2}wt% concentration of PHB. This will be used primarily in the creation of the Bioreacter model. The following equation is the m-MV production per 24h of 1gDWC with PHB(10210^{-2}wt%) as a variable.

1.67537594×1010+(1.65327068×109)x(4.41954887×109)x2+(3.63157895×109)x31.67537594 \times 10^{-10} + (1.65327068 \times 10^{-9})·x - (4.41954887 \times 10^{-9})·x^2 + (3.63157895 \times 10^{-9})·x^3

shRNA internalizasion

Purpose

In our project, we aim to create MOVE, which consists of m-MVs containing shRNA, using recombinant E. coli that simultaneously produces shRNA and PHB. To achieve this, it was necessary to quantitatively understand the conditions under which shRNA is incorporated into m-MVs, ensuring that at least one shRNA molecule is encapsulated within each m-MV. For more details on the mechanism of MOVE production, please refer to the Wet Design . While shRNA and m-MV production are carried out in the same recombinant E. coli, if the shRNA production is insufficient and the intracellular shRNA concentration is low, there is a possibility that shRNA will not be incorporated into the m-MVs. In such a case, empty m-MVs may be produced, which could reduce the pesticide effectiveness of MOVE.This modeling attempts to improve the project by determining how often shRNA encapsulation within m-MVs occurs under various conditions during MOVE production. To address the issue of potential shRNA deficiency, we evaluated how the concentration of shRNA within the transformants changes under different culture conditions and proposed a method to enhance the reliability of the system.

Methods and Modeling

From previous simulations, it was determined that each E. coli cell produces shRNA at a rate of 7.04×1027.04 × 10^2 molecules per hour, and the MV production rate is 3.8×1013.8 × 10^{-1} MVs per hour.

Figure 5. State of shRNA production and m-MV production in Transformed E. coli
Figure 5. State of shRNA production and m-MV production in Transformed E. coli
Figure 5. State of shRNA production and m-MV production in Transformed E. coli

As mentioned earlier, E. coli that produce PHB can exist in two states: one where biomass production is predominant and another where PHB and m-MV production are dominant. These states are influenced by the amount of nitrogen in the medium. Therefore, by controlling these states effectively, the number of MVs containing shRNA produced per hour per liter of medium can vary.First, we will verify how much shRNA is encapsulated in m-MVs based on the cultivation methods for m-MV production from previous studies. Let’s consider the recombinant E. coli strain with a doubling time of td and shRNA production per doubling time Ps. When E. coli divide, they distribute shRNA almost equally between the two daughter cells.

Figure 6. A hypothetical model of how shRNA levels in transformed E. coli change with doubling time. one E. coli produces 7.0×1027.0 \times 10^2 shRNAs per hour and divides into 2 individuals at doubling time. When splitting, the number of shRNAs possessed in the E. coli is considered to be about half of the number just before splitting. The figure focuses on the number of shRNAs possessed by only one individual, and the other individual at the time of division is abbreviated.
Figure 6. A hypothetical model of how shRNA levels in transformed E. coli change with doubling time.
one E. coli produces 7.0 *10^2 shRNAs per hour and divides into 2 individuals at doubling time. When splitting,
the number of shRNAs possessed in the E. coli is considered to be about half of the number just before
splitting. The figure focuses on the number of shRNAs possessed by only one individual, and the other
individual at the time of division is abbreviated.
Figure 6. A hypothetical model of how shRNA levels in transformed E. coli change with doubling time. one E. coli produces 7.0 *10^2 shRNAs per hour and divides into 2 individuals at doubling time. When splitting, the number of shRNAs possessed in the E. coli is considered to be about half of the number just before splitting. The figure focuses on the number of shRNAs possessed by only one individual, and the other individual at the time of division is abbreviated.

one E. coli produces 700 shRNAs per hour and divides into 2 individuals at doubling time. When splitting, the number of shRNAs possessed in the E. coli is considered to be about half of the number just before splitting. The figure focuses on the number of shRNAs possessed by only one individual, and the other individual at the time of division is abbreviated.In vitro studies on the variation in RNA production according to the dry biomass of E. coli are common, and many of these studies have simulated RNA based on its concentration. Many previous iGEM projects have modeled RNA production using the concentration of E. coli in this way, but we assume that the designed shRNA will be encapsulated into m-MVs. This means that the number of shRNA molecules we need to know is within a single E. coli cell. In this situation, it is more appropriate to reconsider the number of shRNA molecules inside one E. coli cell, rather than modeling shRNA based on its concentration.We constructed a model consisting of the doubling time-dependent number of shRNAs in E. coli with reference to a model that uses shRNA concentration to determine parameters. The equation for the number of shRNAs in E. coli is as follows.

ntdt<(n+1)tdandn1(9)n \cdot t_d \leq t < (n+1) \cdot t_d \quad \text{and} \quad n \geq 1 \qquad (9) k=1nPs2kCopies(t)k=1n+1Ps2k1(10)\sum_{k=1}^{n} \frac{P_s}{2^k} \leq \text{Copies}(t) \leq \sum_{k=1}^{n+1} \frac{P_s}{2^{k-1}} \qquad (10) Copies(t)=vs(tntd)+k=1nPs2k(11)\text{Copies}(t) = v_s \left( t - n \cdot t_d \right) + \sum_{k=1}^{n} \frac{P_s}{2^k} \qquad (11)

tdt_d is the doubling time, nn is the number of doublings, PsP_s is the number of shRNA molecules produced per doubling time (equal to vs×tdv_s × t_d), and vsv_s represents the number of shRNA molecules produced per unit of time. Copies(t)\text{Copies}(t) represents the number of shRNA molecules in a single E. coli cell at time tt.

The doubling time of E. coli varies depending on the nitrogen concentration in the reactor[7] . Under medium conditions with sufficient nitrogen, biomass production (i.e., cell growth) is prioritized over PHB production. However, when the nitrogen level decreases, the carbon source is primarily used for PHB production, and the rate of biomass production in E. coli matches the rate of biomass loss due to m-MV release, resulting in an apparent specific growth rate of μ=0μ = 0. At this point, the intracellular amount of shRNA, considering the release of shRNA through m-MVs, is calculated as follows.

dCopies(t)dt=vsCopies(t)×Vm-MVVE. coli×vm(12)\frac{d \, \text{Copies}(t)}{dt} = v_s - \text{Copies}(t) \times \frac{\text{V}_{\text{m-MV}}}{V_{\text{E. coli}}} \times v_m \qquad (12)

VmV_m represents the number of m-MVs produced per unit of time, and Vm-MV\text{V}_{\text{m-MV}} and VE. coli\text{V}_{\text{E. coli}} represent the volumes of m-MV and E. coli, respectively.In the prior research conducted by Professor Koh, the cultivation method for m-MV production involved inoculating LB medium (100 mL) and culturing for 48 hours. Since m-MV production requires cultivation under nitrogen limitation, it is assumed that nitrogen was depleted at some point during the 48-hour period, causing a shift to nitrogen-limited metabolic flux. The consumption of glucose, nitrogen, biomass production, and PHB production can be simulated again using the differential equations from the MV production concentration prediction model. When nitrogen reaches zero, it is regarded as a nitrogen-limited medium condition, and the metabolic flux parameters were adjusted accordingly.

The change in PHB(wt%) concentration at this point can be determined as follows.

PHB(wt%)=PHBPHB+BIO(13)\text{PHB(wt%)}=\frac{\text{PHB}}{\text{PHB}+\text{BIO}} \qquad (13)

the previous study’s culture conditions in our wet lab are as follows.

Table4 previous study’s culture conditions
Glucose concentration20g/L
Initial BIO amount1.3g
Initial nitrogen amount in LB medium2.3g
Culture time48 hours

Simulation

We ran the time evolution of intracellular shRNA numbers for each doubling time until a pseudo-steady state was reached. This was performed at specific growth rates (μ) of 1.0, 0.50, 0.20, and 0.10. The longer the doubling time, the higher the intracellular shRNA levels reach in the pseudo-steady state.

Figure 7. The relationship between doubling time and shRNA concentration
Figure 7. The relationship between doubling time and shRNA concentration

Figure 8. shRNA concentration under m-MV production conditions
Figure 8. shRNA concentration under m-MV production conditions

Table5 Number of intracellular shRNAs at each doubling time.
μ\mu1.00.50.200.10m-MV production conditions
Doubling time (h)0.6931.393.476.93inf
Number of shRNAs at steady state704~14081408~28162440~48804877~9754about 37000

As shown in graph 2, after approximately three doubling times, the intracellular shRNA amount reaches nearly its maximum. Table 5 demonstrates the degree of this. The figure simulates the intracellular shRNA levels in a non-growing, m-MV-producing state. Under the R2 medium condition, the biomass production and the biomass loss due to m-MV production are almost equal, leading to a state where the number of cells remains nearly constant. In this condition, the intracellular shRNA amount decreases only due to the release of m-MVs, but since this reduction is minimal relative to the total shRNA amount, it converges at a very high level.

Figure 9. m-MV shRNA number distribution in each state
Figure 9. m-MV shRNA number distribution in each state

Figure 9 simulates the distribution of shRNA encapsulated into m-MVs based on the actual number of shRNA molecules in each cell. The figure below shows the results. It was found that the resulting MOVE would be in the following states. We examined the cases where the number of shRNA molecules per cell was 1,000, 3,000, 5,000, 10,000, 20,000, and 40,000. When the intracellular shRNA amount is 5,000 or less, more than 8% of the m-MVs produced will be empty.

The simulation results for the changes in NH3NH_3, PHB, biomass, and PHB(wt%) under the cultivation conditions of previous studies are as follows.

Figure 10a. Changes in NH_3, PHB, and biomass under the culture conditions of the previous study. Figure 10b.PHB (wt%) changes under the culture conditions of the previous study.
Figure 10a. Changes in NH_3, PHB, and biomass under the culture conditions of the previous study. Figure 10b.PHB (wt%) changes under the culture conditions of the previous study.

After 18 hours of inoculation, nitrogen is depleted, and the system transitions primarily to PHB production. At this point, biomass production is almost balanced with m-MV production, and the growth of transformants halts. Around 30 hours after inoculation, the PHB (wt%) reaches 70%, and the production of m-MVs begins in earnest. The graph also shows that the PHB concentration shifts as nitrogen becomes limited. After switching to the PHB production flux, approximately 8,900 shRNA molecules are produced in the cells over a period of about 10 hours, with an average of about four shRNA molecules being encapsulated in each m-MV, and hardly any empty m-MVs being produced. Furthermore, to ensure even more shRNA is encapsulated in m-MVs, it has been found that simply adding glucose under nitrogen-limited conditions and continuing incubation for one or two additional nights will be effective.These results successfully address the concern raised in Wet lab experiments about the possibility of shRNA not being encapsulated in m-MVs. Additionally, we propose a solution for cases where shRNA is not encapsulated during experiments and offer suggestions for creating a more effective MOVE.

Reslut

It was found that, using the cultivation method from previous studies, an average of four shRNA molecules can be encapsulated in each m-MV, with no empty m-MVs produced. MOVE can be recovered at a concentration detectable through experiments. Additionally, if sufficient encapsulation of shRNA is not confirmed in the experiments, it has been determined that simply adding glucose under nitrogen-limited conditions and continuing incubation for one or two additional nights will be effective. This guarantees the feasibility of MOVE production in the Wet lab and has motivated us to continue the experimental work.

There are two points to note in this model

(1) The occurrence of empty m-MVs

The number of shRNA molecules encapsulated in m-MVs must be at least one or more. To reduce the number of empty m-MVs to nearly zero, the intracellular shRNA amount should be increased to 10,000 or more.

(2) The number of m-MVs produced

The production of m-MVs in each E. coli cell depends on the PHB (wt%), which in turn depends on the nitrogen concentration in the medium. However, the PHB (wt%) reaches its maximum at around 80%, meaning that further increases in m-MV production at the cellular level are not possible. To solve this, the only option is to increase the number of transformants. However, E. coli with high PHB (wt%) have a low specific growth rate and are not suitable for growth. We have considered solving this issue using a continuous cascade system.

The bioreactor design model is explained on the page for the continuous bioreactor model.

MOVE durability

Purpose

We will simulate how long MOVE can maintain the protective state of shRNA in outdoor environments. This is crucial to demonstrate that MOVE protects shRNA and enhances the effectiveness of the pesticide. We will conduct supporting experiments in the Wet lab and fit the obtained results using polynomial fitting. These results will then be formulated into ordinary differential equations and connected to the m-MV encapsulation model, among others. The following are assumed for MOVE in the external environment and for internal shRNA loss.

  1. MOVE decays over time (UV, heat, pH, ions)

  2. MOVE is washed away from plant leaves and stems by rain runoff

  3. UV light destroys shRNAs inside MOVE

We aim to improve upon issue 2 by using Mgfp-5, a mussel adhesive protein, as a surface display protein.

Methods and Modeling

We will fit the experimental results using polynomial fitting.

Result

In this case, the experiments we had planned in the Wet lab were not completed in time for the wiki freeze.

MOVE uptake

Purpose

We will measure the uptake efficiency of MOVE by the target pathogens in relation to MOVE concentration through experiments and use the results to formulate differential equations. However, due to time constraints, we were unable to conduct this experiment in the Wet lab. Below, we outline the simulation method to be used once experimental data is obtained.

Methods and Modeling

Figure 11. Schematic diagram of the MOVE uptake model
Figure 11. Schematic diagram of the MOVE uptake model

Qin=μc×StargetCMOVEVtarget(15)Q_{in}= \mu \frac{c\times S_{target}C_{MOVE}}{V_{target}} \qquad (15) CMOVE=NMOVEα×Sland(16)C_{MOVE} = \frac{N_{MOVE}}{\alpha \times S_{land} } \qquad (16)

Let S_target represent the surface area of the target pathogen [m²], C_MOVE the density of MOVE [mol/m²], and c the number of shRNA molecules encapsulated in a single MOVE. V_target is the volume of the target pathogen [L]. N_MOVE represents the number of moles of MOVE dispersed, and S_land is the area of the farmland [m²]. Alpha (α) is a coefficient representing the surface area of the farmland, used to calculate the surface area including the leaves and stems of crops. This coefficient depends on the structure of the crops and farmland. Mu (μ) represents the uptake rate by the pathogen and ranges from 0 to 1.

RNAi effect

Purpose

This model mathematically describes the process of GFP (Green Fluorescent Protein) expression suppression through the RNAi (RNA interference) mechanism within yeast cells. The model comprehensively represents the entire process, starting from the introduction of shRNA (short hairpin RNA), the generation of siRNA (small interfering RNA), the formation of the RISC (RNA-induced silencing complex), and the degradation of GFP mRNA. Through this model, we aim to verify the RNAi effect of shRNA introduced into yeast cells via MOVE and demonstrate its therapeutic efficacy.

Methods and Modeling

Figure 12. Overview of RNAi
Figure 12. Overview of RNAi

To represent these processes, we constructed a system of coupled ordinary differential equations. Each equation describes the concentration changes of the respective molecular species as a function of time[8] .

Model equations

d[shRNA]/dt=kcut[Dicer][shRNA]ksh,deg[shRNA](17)d[\text{shRNA}]/dt = -k_{cut}[\text{Dicer}] [\text{shRNA}] - k_{sh,deg}[\text{shRNA}] \qquad (17) d[siRNA]/dt=kcut[Dicer][shRNA]ksi,deg[siRNA](18)d[\text{siRNA}]/dt = k_{cut}[\text{Dicer}] [\text{shRNA}] - k_{si,deg}[\text{siRNA}] \qquad (18) d[RISC]/dt=kRISC[siRNA](RISCtot[RISC])kr,deg[RISC](19)d[\text{RISC}]/dt = k_{\text{RISC}}[\text{siRNA}](\text{RISC}_{tot} - [\text{RISC}]) - k_{r,deg}[\text{RISC}] \qquad (19) d[mRNA]/dt=kmRNA[DNA]km,deg[mRNA]kcleav[RISC][mRNA](20)d[\text{mRNA}]/dt = k_{mRNA}[\text{DNA}] - k_{m,deg}[\text{mRNA}] - k_{cleav}[\text{RISC}][\text{mRNA}] \qquad (20) d[protein]/dt=kpro[mRNA]kp,deg[GFP](21)d[\text{protein}]/dt = k_{pro}[\text{mRNA}] - k_{p,deg}[\text{GFP}] \qquad (21)
Table6 Parameters of the RNAi effect
parameterDescriptionvalueunitsource
kcutk_{cut}Reaction rate constant for cleavage by Dicer4×10^-6nt*s^-1[9]
Dicerconc\text{Dicer}_{conc}Dicer concentration20M^-1*s^-1assumed value
kdegshrnak_{deg_{shrna}}Degradation rate constant for shRNA1/(32*3600)s^-1[10]
kdegsirnak_{deg_{sirna}}Degradation rate constant for siRNA1/(24*3600)s^-1[10]
kRISCk_{\text{RISC}}Rate of active RISC formation1.2×10^-4/3600s^-1[11]
krdegkr_{deg}RISC decomposition rate0.077/3600s^-1[11]
kmRNAk_{\text{mRNA}}Rate of mRNA production10/3600s^-1assumed value
km,degk_{m,deg}mRNA degradation rate0.69/3600nt[10]
kcleavk_{cleav}mRNA cleavage rate by RISC500/3600M[11]
kprok_{pro}protein production rate1/3600Massumed value
kdegk_{deg}Degradation rate of protein0.029/3600M[12]
RISCtot\text{RISC}_{tot}RISC concentration3Massumed value
DNA\text{DNA}Intracellular DNA concentration0.1Massumed value

Result

Figure 13. Model at shRNA uptake concentration of 0.5 nM
Figure 13. Model at shRNA uptake concentration of 0.5 nM
Figure 14. Protein inhibition rate by shRNA incorporation concentration
Figure 14. Protein inhibition rate by shRNA incorporation concentration

Figure 13 shows that the RNA interference mechanism is successfully modeled. Figure 14 demonstrates that, when the intracellular shRNA concentration reaches 100nM, up to 90% of protein expression is inhibited, indicating that with an appropriate gene target, it is possible to sufficiently eliminate the organism. In the case of Pyricularia oryzae (rice blast fungus), with alb1 as the target gene, alb1 is essential for the fungus to extend its hyphae into the plant. Since temporary inhibition of alb1 critically impacts the survival and reproduction of the fungus, it is expected that RNAi will have a significant fungicidal effect.

MOVE application volume calculation

We consider the number of MOVE particles to be spread per 1ha. We mainly use the results from the MOVE uptake model and the RNAi effect model. According to the RNAi effect model, when the shRNA concentration in the pathogen reaches 100nM, protein expression is inhibited by 90% through RNA interference, which is expected to be sufficient for pathogen control. Using the MOVE uptake model, we found that to achieve an shRNA concentration of 100nM inside the pathogen, it is appropriate to spread 4.0×1074.0 \times 10^{-7} MOVE per 1ha. However, this value is based on α=1\alpha = 1 and μ=1\mu = 1. The range of these values is still uncertain as we have not yet been able to experimentally calculate the uptake efficiency of MOVE, but we estimate that it would not be unreasonable to assume it falls within a range of 1 to 10 times this value. We are using this estimated value to calculate the economic effect in our Economic model and to link it to Entrepreneurship .

Summary

In this section, we were able to demonstrate the feasibility of MOVE production at a laboratory scale and prove the e ffectiveness of RNAi interference as a pesticide. Before the experiment, we showed that MOVE could be produced in the Wet lab and proposed improvements to the protocol. Furthermore, by simulating the effectiveness of MOVE at an outdoor scale, which would be difficult to achieve experimentally in the Wet lab, and by calculating the amount of MOVE to be dispersed, we provided data as a basis for feasibility in the economic model and entrepreneurship business model.

Modularization

This is the file of the model we used. This makes it easy to reproduce the simulation we have done here.Click here .

References

[1] Skinner, G. M. et al.,Promoter Binding, Initiation, and Elongation By Bacteriophage T7 RNA Polymerase: A SINGLE-MOLECULE VIEW OF THE TRANSCRIPTION CYCLE . Journal of Biological Chemistry 279(5) , 3239-3244 (2004).

[2] Ji Hoon Kim, Ronald G. Larson,Single-molecule analysis of 1D diffusion and transcription elongation of T7 RNA polymerase along individual stretched DNA molecules . Nucleic Acids Research 35(11) , 3848–3858 (2007).

[3] Team:Edinburgh UG/Modelling Collaboration - 2018.Igem.Org, n.d.

[4] Team:Estonia-TUIT/Modeling - 2023.igem.wiki

[5] Akira Ishihama,Functional Modulation of Escherichia Coli RNA Polymerase . ANNUAL REVIEW OF MICROBIOLOGY 54 , 499-518 (2000).

[6] Koh S, Noda S, Taguchi S,Population Dynamics in the Biogenesis of Single-/Multi-Layered Membrane Vesicles Revealed by Encapsulated GFP-Monitoring Analysis . Applied Microbiology 3(3) , 1027-1036 (2023).

[7] Horvat, P., Vrana Špoljarić, I., Lopar, M. et al.,Mathematical modelling and process optimization of a continuous 5-stage bioreactor cascade for production of poly[-(R)-3-hydroxybutyrate] by Cupriavidus necator . Bioprocess Biosyst Eng 36 , 1235–1250 (2013).

[8] Roh, E. H., Sullivan, M. O., & Epps, T. H.,A kinetic modeling platform for predicting the efficacy of siRNA formulations in vitro and in vivo . STAR protocols 3(4) , (2022)

[9]Deerberg et al. (2008). Journal of Molecular Biology, 379(3), 567-581

[10]Bartlett DW, Davis ME. Insights into the kinetics of siRNA-mediated gene silencing from live-cell and live-animal bioluminescent imaging. Nucleic Acids Res. 2006 Jan 12;34(1):322-33. doi: 10.1093/nar/gkj439. PMID: 16410612; PMCID: PMC1331994.

[11]Groenenboom et al. (2005). Journal of Theoretical Biology, 236(3), 327-341.

[12]Pete Corish, Chris Tyler-Smith, Attenuation of green fluorescent protein half-life in mammalian cells, Protein Engineering, Design and Selection, Volume 12, Issue 12, December 1999, Pages 1035–1040

Abstract

We are considering not only encapsulating shRNA in m-MV, but also displaying enzymes on its surface to give functionality to the membrane surface layer. Surface display is accomplished by joining enzymes with scaffold proteins, and we used Spycatcher/SpyTag for this joining. By utilizing this system, it becomes possible to display any enzyme on the surface of MOVE with proper design, allowing MOVE to have various characteristics. This is very important for the modularity of MOVE. In Dry Lab, we will confirm whether enzymes can function sufficiently on the MOVE surface through structure prediction using Alphafold, prediction of protein state during membrane presentation using OPM (orientation of protein), and molecular docking analysis using AutoDock Vina and HADDOCK 2.4 web server. Please see the figure below for details of the simulation.

Figure 1. Overview of Surface Presentation Model
Figure 1. Overview of Surface Presentation Model

Method

Below, we describe the evaluation procedure for surface display on MOVE. This is summarized very concisely and is available to anyone who wants to perform SDT Feasibility Analysis. This contributes to the modularization of the model.

Analysis Flowchart

  1. Structure acquisition/prediction: Obtain known structures from PDB
  2. Selection of E. coli outer membrane proteins:
    • Select appropriate outer membrane proteins such as OmpA and Lpp. This varies depending on the Da of the enzyme to be surface-displayed.
  3. Spycatcher/SpyTag sequence design:
    • The SpyTag sequence (AHIVMVDAYKPTK) can be attached to any position on the enzyme, not limited to the N-terminus or C-terminus, and still bind to Spycatcher. Incorporate the sequence into a part that minimally affects the active site.
  4. Confirm how many surface-displayed enzymes are presented per MOVE:
    • Using our expression level model for surface-displayed proteins, we can easily simulate this by inputting the base pair length.
  5. Scaffold protein and enzyme Spycatcher/Spytag fusion design:
    • Use Alphafold3 to model the fusion structure.
  6. Execute molecular dynamics simulation:
    • Use OPM to observe the state of the surface-displayed protein. This allows us to confirm whether the enzyme portion is successfully displayed on the membrane surface.
  7. Add linker or redesign SpyTag position (if necessary):
    • If interference is observed, add a linker or change the position of SpyTag.

Gene Design for Surface Display

We set the enzymes we use according to our purpose. Here, we will simulate and evaluate the enzymes we are actually dealing with: Mgfp-5, ChitinaseC (hereafter ChiC), and Endo-1, 3(4)-β-glucanase (glucanase).Here, we conducted an analysis according to the above flowchart for ChiC and glucanase, which are displayed to degrade chitin, a component of fungal cell walls, and allow shRNA to penetrate into the fungus, as well as for Mgfp-5, a mussel adhesive protein surface-displayed to prevent MOVE from falling off leaves and stems due to water flow from rain or pesticide spraying.

Figure 2. Design of the fusion sequence of Chic and Spytag
Figure 2. Design of the fusion sequence of Chic and Spytag

It is known that Spytag can be inserted at any position on the enzyme and still bind to Spycatcher. Therefore, we should aim to retain the enzyme’s activity during surface display by designing Spytag at the position that has the least impact on the enzyme and surface presentation.

Figure 3. Use the PDB data to identify regions that do not overlap with the motifs[1] .
Figure 3.  Use the PDB data to identify regions that do not overlap with the motifs[1].
Figure 3. Use the PDB data to identify regions that do not overlap with the motifs[1].

I will use PDB data to determine where the motifs are located in the enzyme and also identify the active site. By inserting the SpyCatcher sequence into the flexible linker region, the fusion protein’s activity is expected to be effectively maintained.

Figure 4. Design of the fusion sequence of Chic and Spytag
Figure 4. Design of the fusion sequence of Chic and Spytag
Figure 5. Design of the fusion sequence of OmpA and Spycatcher
Figure 5. Design of the fusion sequence of OmpA and Spycatcher

Here we show the ChiC_ST sequence and the OmpA_SC sequence. We needed detection using Histag for our experiments. By inserting the ST sequence at the C-terminus and N-terminus of the Histag, we enabled both detection using Histag and surface display therapy. For the OmpA_SC sequence, we use the same one as in the paper. This design is achieved by confirming the enzyme motif from the literature.

Models for expression levels of surface-presenting proteins

We use the Spycatcher/Spytag system to display multiple proteins on the surface of m-MVs. The number of these proteins displayed on m-MVs is crucial for confirming whether MOVE functions sufficiently. We have created a model to verify MOVE’s functionality by simulating DNA expression, RNA transcription, and Protein expression levels using ODEs.

Formula for RNA transcription

d[mRNA]dt=Tr~[Promoter][RNAP]Km+[RNAP]kdeg[mRNA]\frac{d[\text{mRNA}]}{dt} = \tilde{Tr}\frac{\left[\text{Promoter}\right]\left[\text{RNAP}\right]}{K_m + \left[\text{RNAP}\right]} - k_{deg}\left[\text{mRNA}\right]

Protein expression formula

d[Protein]dt=kprotmRNAkp,deg[Protein] \frac{d[\text{Protein}]}{dt}= kprot * mRNA - kp,deg * [\text{Protein}]

Spycatcher/Spytag assembly formula

d[sucf-Protein]dt=kaseembly[sucf][Protein]\frac{d[\text{sucf-Protein}]}{dt} = k_{aseembly} * [\text{sucf}] * [\text{Protein}]

Based on this equation, the approximate number of surface-displayed proteins per individual cell is predicted. From the steady-state expression level of the protein, the number of protein molecules is calculated and divided by the surface area of the cell membrane, and the number of enzymes per m-MOVE is determined. I have prepared a program to perform these calculations.

Structure prediction using Alphafold3

AlphaFold is a groundbreaking tool that uses deep learning to predict protein structures with high accuracy from amino acid sequences. Moreover, in May of this year, an improved version, AlphaFold3, was announced[2] , significantly enhancing its performance in complex predictions and interactions with non-protein molecules such as nucleic acids. By simply inputting an amino acid sequence, it can predict the structure of unknown proteins, making it incredibly useful for designing fusion proteins. For example, it serves as the first checkpoint to assess how the fusion affects the 3D structure, whether the function is retained, and what the optimal linker length should be. In this design process, AlphaFold3 is the most accessible and powerful tool.

Figure 6. Design of sequences for structure prediction of fusion proteins
Figure 6. Design of sequences for structure prediction of fusion proteins

When fusing these, since the reactive residues of Spycatcher/Spytag during binding are located in the middle of each sequence, they cannot be described as a single-chain amino acid. In this state, for structure prediction using Alphafold, it is basic to input a single-chain amino acid sequence. By defining the amino acid sequence of the fusion protein as described above, we can predict the approximate structure of the fusion protein. However, it should be noted that this is only an approximation and there are limitations to its accuracy.

Prediction of surface presentation at OPM

While AlphaFold allows for the confirmation of whether the 3D structure of protein blocks remains intact, using Orientations of Proteins in Membranes (OPM) is more reliable to confirm whether the enzyme portion is indeed displayed on the surface when surface-displayed on a membrane[3] . OPM is a web service that predicts how proteins are positioned within biological membranes and provides information about this arrangement. It is particularly useful for studying membrane-related proteins, as it clarifies the orientation in which a protein is inserted into the membrane, which parts are exposed to the membrane surface, and which parts are embedded within the membrane. This allows researchers to easily identify transmembrane regions and domains that bind to the membrane surface. The input for OPM is the PDB file of the fusion protein predicted by AlphaFold.

Activity Prediction via Docking Simulation or Molecular Dynamics (MD)

HADDOCK is molecular docking software used to predict interactions between molecules such as proteins, ligands, DNA, and RNA. Unlike traditional docking methods, HADDOCK utilizes experimental data provided by the user (e.g., NMR, mass spectrometry, cross-linking) to perform binding predictions. By using information on binding sites and contact residues, it can predict more realistic complex structures, taking into account molecular flexibility. It can be accessed via a web interface, making it relatively easy to perform analyses.On the other hand, molecular dynamics (MD) simulations simulate the movement of molecules over time and track the dynamic behavior of proteins and ligands. They can predict long-term interactions, structural stability, and changes in binding states that cannot be handled by docking simulations. Performing these analyses on fusion proteins is powerful for confirming their activity.In our case, since the structure of the fusion protein did not change at the motif level, we concluded that it was unnecessary to specifically use AutoDock Vina or HADDOCK for the fusion protein and instead relied on confirmations from existing literature.

Result

Output in the expression level model of surface-presenting proteins

Figure 7. Amount of surface presentation when expressing three types of proteins.
Figure 7. Amount of surface presentation when expressing three types of proteins.

Table1 Concentration of Pro-Sucf complex at 200 min
Concentration at 200 minM
Mgfp5-Suc complex2.0×10062.0 \times 10^{-06}
CHiC-Suc complex2.0×10072.0 \times 10^{-07}
Glucanase-Suc complex9.1×10089.1 \times 10^{-08}

When calculated as the number presented per single MOVE, we found that Mgfp5 is presented in hundreds, ChiC in tens, and glucanase is only presented in single-digit orders. This indicates that the surface presentation of these three types of patterns is unlikely to be successful.

From the results, we were able to determine the amount of protein presentation. When presenting three types of surface-displayed proteins, it became clear that suc was significantly deficient, indicating the need to increase the copy number of the plasmid encoding suc. The number of enzymes presented per MOVE is limited, and while experimental confirmation is necessary, improvements may be needed. ptf16 was insufficient due to its low copy number.

Figure 8. ChiC and scaffold protein production.
Figure 8. ChiC and scaffold protein production.

As a trial, we simulated the intracellular protein production levels of Sucf and ChiC. From this, we can see that the production of Sucf is overwhelmingly insufficient.

There are two ways to solve this problem. Reduce the types of proteins displayed on the surface. If we limit the types of surface-displayed enzymes to one, sucf will be used for only one enzyme, so more Mgfp5 should be displayed. Additionally, there is a technique to place the scaffold protein on a high-copy vector to increase the number of Sucf presentations.

Figure 9a.(left) Amount of surface presentation when expressing one types of proteins. Figure 9b.(right) Amount of surface presentation when expressing three types of proteins by using pBluescript II SK(-) .
Figure 9a.(left) Amount of surface presentation when expressing one types of proteins. Figure 9b.(right) Amount of surface presentation when expressing three types of proteins by using pBluescript II SK(-) .

Table2 Concentration of Pro-Sucf complex at 200 min of using pBluescript II SK(-)
Concentration at 200 minM
Mgfp5-Suc complex0.00020.0002
CHiC-Suc complex2.0×1052.0 \times 10^{-5}
Glucanase-Suc complex9.1×1069.1 \times 10^{-6}

When calculated as the number presented per MOVE, it appears that tens of thousands of Mgfp5 are presented, and thousands of ChiC and glucanase are presented, suggesting that the three surface presentations on MOVE will be successful.

The results when using pBluescript II SK(-) as a vector are as follows. This would increase the amount of enzymes presented on MOVE, making it better. The number of enzymes that should be surface-displayed on MOVE varies depending on the purpose, and wet lab experiments are ultimately essential to guarantee functionality, but this model can provide some assurance of success probability. Additionally, as the number of enzyme types increases, the number of enzymes that can be loaded onto individual MOVEs decreases. This is due to the limitation of suc concentration, and increasing scaffold protein expression levels is essential for presenting three types of enzymes. For future prospects, it is necessary to use high copy number vectors and high transcription promoters.

Output with Alphafold3

Figure 10. Bioreactor safely converges all concentrations
Figure 10. Bioreactor safely converges all concentrations
Figure 11. Structure Prediction of the Fusion Protein Using AlphaFold3
Figure 11. Structure Prediction of the Fusion Protein Using AlphaFold3
From the results of the 3D structure prediction of the fusion protein and the heatmap showing the Expected Position Error, we can see that the structure of the motifs themselves doesn’t change much, and the linker between them becomes a flexible linker. This indicates that the scaffold protein and the enzyme are likely to function without interfering with each other’s roles, which does not negate our protein design.

Output in OPM (orientetions of protein)

Figure 12. Bioreactor safely converges all concentrations
Figure 12. Bioreactor safely converges all concentrations
The purple color represents the scaffold protein OmpA, while the red color indicates the enzyme ChiC region. The blue represents the outer membrane of the lipid bilayer, and the red represents the inner membrane of the lipid bilayer. This shows the surface presentation for a 10nm MV. The average diameter of m-MVs is 100mM, so the curvature is smaller than this and closer to a plane. It has been confirmed that even in a planar state, the surface-presented protein can stably present its active site, indicating that the surface presentation is likely to be successful.

Activity Prediction via Docking Simulation or Molecular Dynamics (MD)

Figure 13. Interactions between ChBDChiC and tri-N-acetylchitotriose. The ligand molecule is well bound on the binding site by two stacking interactions (Trp59-NAG1 and Trp60-NAG3) and two hydrogen bonds (between Trp60-N and NAG2-O7 (carbonyl moiety of N-acetyl group of NAG2) and Trp59-NE1 and NAG2-O6). Hydrogen bonds are indicated by broken lines. Only Trp59 and Trp60 are responsible for the interactions [1] .
Figure 13. Interactions between ChBDChiC and tri-N-acetylchitotriose. The ligand molecule is well bound on the binding site by two stacking interactions (Trp59-NAG1 and Trp60-NAG3) and two hydrogen bonds (between Trp60-N and NAG2-O7 (carbonyl moiety of N-acetyl group of NAG2) and Trp59-NE1 and NAG2-O6). Hydrogen bonds are indicated by broken lines. Only Trp59 and Trp60 are responsible for the interactions [1].
Figure 13. Interactions between ChBDChiC and tri-N-acetylchitotriose. The ligand molecule is well bound on the binding site by two stacking interactions (Trp59-NAG1 and Trp60-NAG3) and two hydrogen bonds (between Trp60-N and NAG2-O7 (carbonyl moiety of N-acetyl group of NAG2) and Trp59-NE1 and NAG2-O6). Hydrogen bonds are indicated by broken lines. Only Trp59 and Trp60 are responsible for the interactions [1].

To verify whether the fusion protein can maintain its binding affinity with the substrate even when surface-displayed, a comparison was made with the OPM database. The results predicted that the activity of the binding site would be maintained even in the surface-displayed state, as the active site was not embedded within the membrane, and no significant changes were observed in the shape of the motif containing the active site. However, if there are concerns with the AlphaFold3 or OPM simulations, it may be necessary to consider redesigning the protein sequence. Furthermore, it is important to actually perform Activity Prediction as an additional verification to confirm how the protein’s structural changes affect its binding affinity to the substrate. This will allow for a more definitive judgment on whether the fusion protein can appropriately maintain its interaction with the substrate on the membrane surface.

Modularization

This is the file of the model we used. This makes it easy to reproduce the simulation we have done here.Click here .

References

[1]Berman, H. M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T. N., Weissig, H., … & Bourne, P. E. (2000). Protein Data Bank entry: 1WVU. RCSB Protein Data Bank.

[2]AlphaFold Server. (n.d.). About AlphaFold Server. AlphaFold Server.

[3]University of Michigan, Department of Pharmacology. (n.d.). Orientations of Proteins in Membranes (OPM) database. OPM Database.

Abstract

Models for single bioreactors and continuous cascade bioreactors were created for the large-scale production of MOVE. Each model is described by about 80 ordinary differential equations using COPASI. The created models were used to identify parameters to improve MOVE production efficiency using the MCA Tool, and were optimized through parameter scanning. As a result, the continuous cascade bioreactor enables approximately 100 times more efficient MOVE production compared to a single bioreactor. This result is utilized in the Economic model , contributing to the improvement of entrepreneurship’s business model by calculating drug prices and profits.

Introduction

In our project, we aim to create MOVE by encapsulating shRNA into m-MVs. Since MOVE can encapsulate various shRNAs, it has the potential for applications in many different fields. However, to expand its range of applications and establish it as a modular system, efficient production is essential. To achieve efficient MOVE production, we developed a bioreactor model and chose to use a continuous cascade system [1] .Based on the previous analysis, it is understood that the E. coli producing MOVE adjusts its metabolic flux depending on the nitrogen content in the medium (as shown in the Dry shRNA-MV model).The continuous cascade system offers the advantage of effectively controlling changes in metabolic flux compared to a single-batch process. This model aims to improve the production yield of MOVE through the use of a continuous cascade bioreactor, contributing to its application in various biotechnologies.

Feed back from Wet!

Wet lab experiments have shown that sufficient aerobic conditions are necessary for PHB production. Without adequate aerobic conditions, pyruvate undergoes fermentation into alcohol, and PHB is not produced.

Figure 1. Wet lab PHB production results. Under medium conditions with 2% glucose, the top image shows anaerobic
conditions, and the bottom image shows aerobic conditions. Under anaerobic conditions, the cell count is low,
and there is minimal PHB fluorescence detected by Nile Red. Under aerobic conditions,
an increase in both cell count and PHB fluorescence can be observed.
Figure 1. Wet lab PHB production results. Under medium conditions with 2% glucose, the top image shows anaerobic conditions, and the bottom image shows aerobic conditions. Under anaerobic conditions, the cell count is low, and there is minimal PHB fluorescence detected by Nile Red. Under aerobic conditions, an increase in both cell count and PHB fluorescence can be observed.

In reactors with a high glucose concentration in the culture medium, cell density increases, and the amount of PHB per unit volume also rises. However, this also increases the required amount of oxygen. Due to limitations in equipment and the physical properties of oxygen uptake, adjusting the balance between oxygen supply and the amount of glucose added is essential for MOVE production.

A system like a continuous cascade bioreactor inevitably involves a high level of complexity due to the interactions between different reactors. Therefore, careful analysis and design are required to achieve the intended engineering goals. Modeling is a valuable tool for accelerating the DBTL (Design, Build, Test, Learn) cycle. In our case, modeling was essential for addressing the following question:

How can we establish a stable continuous cascade system while maximizing the production of the target compound, MOVE?

This general question was expanded upon in various sections during the modeling work, allowing us to establish strong links with Entrepreneurship by addressing the following aspects of the project:

  • Demonstrating the feasibility of the project based on an efficient MOVE production bioreactor model.
  • Optimizing the process through model-driven identification of key control parameters.

These results are then integrated into an economic model to assess scalability and profitability at an industrial level, driving the feasibility of the project forward.

Method

To understand, predict, and ultimately control the behavior of the bioreactor, we developed a mechanical dynamic model of the bioreactor. The concentration balance of extracellular substrates and/or products in the bioreactor is modeled and described by a system of ordinary differential equations (ODEs). The final model integrates reactions that represent a wide range of processes, including microbial growth, metabolite production and utilization, transport between reactors, and diffusion. For the metabolic model of the PHB-producing E. coli, we used the same model as described in Markan Lopar’s paper[1] .

Representation of the model

Before constructing the mathematical model, we used standardized Systems Biology Graphical Notation (SBGN) to represent all the elements of the system and their interactions (Figure 2). For more details on the meaning of the various symbols, please refer to SBGM.

Figure 2. Representation of the model in Systems Biology Graphical Notation [3] .
Figure 2. Representation of the model in Systems Biology Graphical Notation [3].
Figure 2. Representation of the model in Systems Biology Graphical Notation [3].
Figure 3. Six-step bioreactor cascade for continuous MOVE production.
Figure 3. Six-step bioreactor cascade for continuous MOVE production.

Construction of the system of ODEs

In this section, we explain the structure of the model, the rate laws used to represent each reaction, and the balance equations for each component of the system. The units used in the model are liters (L) for volume, minutes (hour) for time, and millimoles (mmol) and grams of dry weight (gDW) for the amounts of metabolites and biomass, respectively.

Glucose , NH3 , O2 uptake

To ensure that E. coli responds to changes in extracellular nutrient concentrations (Glucose), the uptake of glucose is defined for both organisms using the Monod equation (Monod 1949), allowing the specific consumption rates to be calculated. The consumption rate q (mmol.gDCW⁻¹.h⁻¹) is expressed based on the maximum uptake rate of E. coli q_max (mmol.gDCW⁻¹.h⁻¹) and the half-saturation constant K_M (mM).

qGlu=qGlumax×(GluKGlu+Glu)(1)q_{\mathrm{Glu}} = q^{\mathrm{max}}_{\mathrm{Glu}} \times \left( \frac{\mathrm{Glu}}{K_{\mathrm{Glu}} + \mathrm{Glu}} \right) \qquad(1)

O2 gas transfer

The O2 balance can be represented by considering three processes: the input of O2 into the reactor through bubbling of O2-enriched air (I_O2), the transfer of O2 from the gas phase to the liquid phase (T_O2), and finally, the output of O2 from the reactor (O_O2).

dO2,gdt=IO2TO2OO2(2)\frac{d{\mathrm{O2,g}}}{d_{t}} = I_{\mathrm{O2}} - T_{\mathrm{O2}} - O_{\mathrm{O2}} \qquad(2) IO2=QgasO2input(3)I_{\mathrm{O2}} = Q_{\mathrm{gas}} \cdot \mathrm{O2_{input}} \qquad(3) TO2=κα(βO2gO2l)(4)T_{\mathrm{O2}} = \kappa \alpha \left( \beta \cdot \mathrm{O2_{g}} - \mathrm{O2_{l}} \right) \qquad(4) OO2=QgasO2g(5)O_{\mathrm{O2}} = Q_{\mathrm{gas}} \cdot \mathrm{O2_{g}} \qquad(5)

Glucose、 NH3 transfer

They are added by the flow rate v, which is a constant concentration. and can be expressed by the concentration in the medium.

dGludt=Glu_inputν(6)\frac{d \mathrm{Glu}}{dt} = \text{Glu\_input} \cdot \nu \qquad(6) dNH3dt=NH3_inputν(7)\frac{d \mathrm{NH}_3}{dt} = \text{NH}_3\_input \cdot \nu \qquad(7)

Analysis Method

We conducted the analysis using COPASI. We prepared a metabolic and reactor model by combining 82 equations, including the metabolic flux equations for PHB-producing E. coli and the reactor equations we developed. After measuring the sensitivity of each parameter using the MCA (Metabolic Control Analysis) tool built into COPASI, we optimized the controllable parameters that showed high sensitivity for MOVE production using a Parameter Scan.

Parameter

The parameters for the PHB-producing E. coli were based on Markan Lopar’s paper. A list of the equations and parameters used is attached in the appendix. The following parameters are specific to the reactor design and were newly added in our model.

  Table1 parameters used in the bioreactor model.
NameNotationvalueunitsource
DiMensioning
Bioreactor volume (R1,R2,R3,R4,R5,R6)V100Loperable
Medium flow rate to reactorvmediumv_{medium}1L h^-1operable
O2 gas transfer
Henry’s law coefficientβ\beta0.023[2]
Gas flow rate injected into the reactorqO2q_{O2}204L h^-1operable
Volume mass transfer coefficient of O2kLa30h^-1[2]
O2 concentration in influent gasO2gasinputO2_{gas_{input}}50mmol/Lcompressed oxygen
Glucose and NH3 transfer
Concentration of Glu added in the medium to (R1,R2)GLUliqinGLU_{liq_{in}}mMoperable
The ratio of glucose added between R3–R6 relative to R1–R2r1〜5operable
Concentration of NH3 added to the medium to R1NH3liqinNH3_{liq_{in}}mMoperable
Transformed E. coli
Maximum Glucose uptake rate of E. coliqGlumaxq_{Glu_{max}}12 ± 0.5mmol/gDCW/h[2]
Half-saturation constant of GlucoseKGluK_{Glu}2.7e-1mM[2]

Result

Simulation results of a single bioreactor

Figure 3. m-MV production concentration in a single bioreactor
Figure 3. m-MV production concentration in a single bioreactor

The model simulation of the single bioreactor confirmed that all species’ concentrations converge over time, indicating a stable model reactor. To produce MOVE, it is necessary to create a nitrogen-depleted medium condition, but this results in the cell concentration in the medium stabilizing at a low level, leading to lower MOVE production.

Simulation results of continuous cascade bioreactor

Figure 4. Bioreactor safely converges all concentrations
Figure 4. Bioreactor safely converges all concentrations
Figure 5a. Concentration of BIO in each bioreactor. Figure 5b. NH3 concentration in each bioreactor.
Figure 5a. Concentration of BIO in each bioreactor. Figure 5b. NH3 concentration in each bioreactor.
Figure 6a. Concentration of PHB in each bioreactor.  Figure 6b. Concentration of m-MV in each bioreactor.
Figure 6a. Concentration of PHB in each bioreactor. Figure 6b. Concentration of m-MV in each bioreactor.
The model simulation of the continuous cascade bioreactor confirmed that all species’ concentrations converge over time, indicating a stable model reactor. As expected, the results show that pre-culturing occurs in R1, followed by main culturing in R2, where cell concentration increases significantly and NH3 is consumed. In R3, PHB is produced, and in R4, R5, and R6, MOVE is produced on a large scale. The metabolic flux of the cells changes in each bioreactor, supporting efficient, specialized production of MOVE. Overall, these simulations are consistent and demonstrate the theoretical feasibility of the project.

Optimization of MOVE production

Through MCA, we were able to narrow down the key parameters we should focus on to optimize MOVE production efficiency. Among these, the easily adjustable parameters—glucose supply and the ratio of glucose added between R3–R6 relative to R1–R2—were found to have a significant impact. While increasing the glucose supply might seem like an obvious approach to increase MOVE production, the limitation imposed by oxygen supply makes it necessary to find the optimal value. Additionally, since the roles of the first and second halves of the continuous bioreactor differ, it is important to adjust the glucose supply between these sections accordingly.

Figure 7. Bioreactor safely converges all concentrations
Figure 7. Bioreactor safely converges all concentrations

The areas where the concentration of m-MV is not displayed are where the O2 concentration became negative. In these regions, anaerobic conditions prevail, making PHB production difficult, as indicated by the Wet lab results. From this analysis, it was determined that when r = 3.715 and glucose addition is 990.335 mmol/h , 0.00213 mmol of MOVE can be recovered in 1 hour.

In a single bioreactor, it was found that MOVE could be recovered at a rate of 2e-5 mmol/h. This allows for the production of pesticides for approximately 0.05 hectares in 1 hour. In contrast, the continuous cascade bioreactor was able to recover MOVE at a rate of 0.00213 mmol/h , which enables the production of pesticides for approximately 5 hectares in 1 hour. This represents a recovery rate about 100 times higher than that of the single bioreactor, demonstrating that the continuous cascade bioreactor allows for much more efficient MOVE production. This significantly enhances the feasibility of the business model in the economic analysis.

Process Scale and Profitability

As explained previously, our idea was to design an industrial biotech plant that would utilize our model to provide quantitative data and produce MOVE. Here is a brief description of some of the elements for which our model proved to be a useful predictive tool.

Cost Reduction Through Medium Optimization

As repeated before, a crucial factor for m-MV production is the nitrogen concentration in the medium. m-MVs are only produced under nitrogen-depleted conditions, and sufficient cell growth cannot be achieved without adequate nitrogen levels. Therefore, we can simulate the optimal nitrogen input for each reactor to ensure the ideal concentration in the medium. This prevents unnecessary nitrogen addition, reduces the risk of decreased MOVE production, and maintains cell density by avoiding nitrogen depletion.

Increase in MOVE Production Through Production Efficiency Optimization

Due to physical and equipment limitations on oxygen supply, simply increasing the glucose concentration in the medium does not lead to an increase in MOVE production. An excessive increase in glucose addition creates anaerobic conditions in the reactor, inhibiting the production of PHB and MOVE. We were able to identify the maximum glucose addition amount (R1, R2: 178.2 g/h; R3, R4, R5, R6: 662.0 g/h) that avoids oxygen depletion, optimizing MOVE production.

Cost savings with continuous bioreactors

In the continuous bioreactor, it was found that MOVE production costs could be reduced by approximately 27 times compared to the single bioreactor when considering NH3 and glucose as costs. This demonstrates the validity of using a continuous bioreactor for organisms with multiple fluxes depending on medium conditions. The continuous bioreactor model is effective for various transformants with multiple fluxes under different medium conditions.

Summary

In the continuous cascade bioreactor, significant improvements in both MOVE production per unit of medium and MOVE production per unit of time can be expected compared to the single bioreactor. This model has clarified how to design a bioreactor for efficient MOVE production. Additionally, it has demonstrated the potential for producing MOVE at the lowest possible cost, providing a foundation for proving the theoretical feasibility of the business in the economic model.

Modularization

This is the file of the model we used. This makes it easy to reproduce the simulation we have done here.Click here .

References

[1]Lopar, M., Špoljarić, I. V., Atlić, A., Koller, M., Braunegg, G., & Horvat, P. (2013). Five-step continuous production of PHB analyzed by elementary flux, modes, yield space analysis and high structured metabolic model. Biochemical Engineering Journal, 79, 57-70.

[2] BioNumbers. (n.d.). The database of key numbers in molecular and cell biology.

[3] Le Novère N, Hucka M, Mi H, Moodie S, Schreiber F, Sorokin A, Demir E, Wegner K, Aladjem MI, Wimalaratne SM, Bergman FT, Gauges R, Ghazal P, Kawaji H, Li L, Matsuoka Y, Villéger A, Boyd SE, Calzone L, Courtot M, Dogrusoz U, Freeman TC, Funahashi A, Ghosh S, Jouraku A, Kim S, Kolpakov F, Luna A, Sahle S, Schmidt E, Watterson S, Wu G, Goryanin I, Kell DB, Sander C, Sauro H, Snoep JL, Kohn K, Kitano H. The Systems Biology Graphical Notation. Nat Biotechnol. 2009 Aug;27(8):735-41. doi: 10.1038/nbt.1558. Epub 2009 Aug 7. Erratum in: Nat Biotechnol. 2009 Sep;27(9):864. PMID: 19668183.