1^st iteration: characterizing the expression strength of different components

Design

Our method for regulating ftsZ gene expression follows the paper by Su-Meng Wang et al. [1], in which the authors selected the terminator B1006 from the iGEM website (http://parts.igem.org/Main_Page) together with promoters of different strengths (J23100, J23110, J23116, J23109, J23113, and J23103), and constructed the characterization plasmids pCL-100RFP, pCL-110RFP, pCL-116RFP, pCL-109RFP, pCL-113RFP, and pCL-103RFP. Each plasmid was built by linking the reporter gene rfp to either J23100/J23110/J23116-B0034 or J23103/J23113/J23109-B0033 and seamlessly cloning the resulting cassette into plasmid pCL1920.

To construct a minimal polyploid E. coli strain, we first needed to re-characterize the strengths of these expression elements in DGF-298. From those results we could then decide which promoter to use in combination with which RBS.

Build

We first assembled the promoters listed above into plasmid pCL1920 by seamless cloning and then electrotransformed the assembled plasmids into DGF-298. The conditions under which we characterized the elements were as follows:

Table 1 Characterization conditions//caption

Plasmid backbone	pCL1920
Strain	DGF-298
Culture medium	1.5 mL liquid LB medium
Culture conditions	37 °C, 24 hours, vigorous shaking
Instrument	Multifunctional microplate reader (Synergy HT, BioTek, USA)
Detection method	Fluorescence intensity was measured with the microplate reader at an excitation wavelength of 590 nm and an emission wavelength of 645 nm; normalized fluorescence values (a.u.) were calculated as the ratio of fluorescence intensity to OD600.
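The normalization in the last row of Table 1 can be sketched as follows. This is a minimal illustration, not the team's analysis script: the function name, the optional blank subtraction, and the example readings are all assumptions introduced here.

```python
def normalized_fluorescence(fluorescence, od600,
                            blank_fluorescence=0.0, blank_od600=0.0):
    """Return fluorescence per unit OD600 (a.u.).

    Blank subtraction is an assumption; the source only states that the
    fluorescence intensity is divided by OD600.
    """
    corrected_od = od600 - blank_od600
    if corrected_od <= 0:
        raise ValueError("OD600 must be greater than the blank reading")
    return (fluorescence - blank_fluorescence) / corrected_od


# Hypothetical readings for a single well (not measured project data).
value = normalized_fluorescence(fluorescence=1000.0, od600=0.5)
print(f"{value:.1f} a.u.")
```

Dividing by OD600 makes wells with different cell densities comparable, which is why the strengths reported below are per-OD values rather than raw plate-reader counts.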

Test

The characterization results showed that the strengths of the J23103/J23113/J23109-RBS33 and J23116/J23110/J23100-RBS34 expression elements were 32.4 a.u., 40.7 a.u., 51.4 a.u., 652.8 a.u., 1014.9 a.u., and 2148.2 a.u., respectively (Fig. 1).

//img_wrap

Fig1

Fig. 1 Characterization of promoter strength in DGF-298

//img_wrap

Learn

We found that the expression strengths of these elements in DGF-298 span a wide range from weak to strong. Because we wanted to express the ftsZ gene weakly in order to construct polyploids, we chose J23103, the weakest element in DGF-298, as the promoter.
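The selection step can be reproduced from the six reported values. The sketch below only ranks the numbers given in the Test section; the dictionary layout and variable names are illustrative assumptions.

```python
# Normalized strengths (a.u.) reported for DGF-298 in the Test section.
strengths = {
    "J23103-RBS33": 32.4,
    "J23113-RBS33": 40.7,
    "J23109-RBS33": 51.4,
    "J23116-RBS34": 652.8,
    "J23110-RBS34": 1014.9,
    "J23100-RBS34": 2148.2,
}

# Sort from weakest to strongest expression element.
ranked = sorted(strengths.items(), key=lambda item: item[1])
weakest_name, weakest_value = ranked[0]
strongest_name, strongest_value = ranked[-1]

# Ratio between the strongest and weakest element.
dynamic_range = strongest_value / weakest_value

print("weakest:", weakest_name)
print("strongest:", strongest_name)
print("dynamic range:", round(dynamic_range, 1))
```

The ranking places J23103 at the bottom, which matches the choice made above for weak ftsZ expression.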

2^nd iteration: fusion of weakly expressed element and ftsZ gene to construct minimal polyploid E. coli DGF-298-103Z

Design

Cell division in Escherichia coli is tightly regulated both spatially and temporally. More than 20 proteins are recruited to the E. coli divisome using FtsZ as a scaffold, and these proteins are essential for septal wall synthesis.

ftsZ is the main gene controlling cell division; its product precisely regulates division by forming a “Z-ring” that recruits the other division proteins. When ftsZ expression falls below a threshold, cell division is inhibited, which prevents a new round of chromosome replication but allows ongoing replication to complete. Such cells grow as filaments and contain many unsegregated chromosomes distributed along the filament. However, prolonged inhibition of division eventually leads to lysis of the filamentous cells due to cell cycle arrest. In contrast, strong expression of ftsZ allows cells to divide normally and complete chromosome segregation. It is therefore possible to construct polyploid E. coli by tuning ftsZ expression to an appropriate level (Fig. 2).

//img_wrap

Fig2

Fig. 2 Relationship between FtsZ expression intensity and growth rate

//img_wrap

In the previous iteration we identified J23103 as a weakly expressed promoter in DGF-298, so we decided to insert it upstream of the start codon of the chromosomal ftsZ gene of DGF-298.

Build

We first transformed the PTK-Red plasmid into DGF-298 to establish a homologous recombination system, which we then used to insert J23103 upstream of the start codon of the chromosomal ftsZ gene of DGF-298. We named the resulting strain DGF-298-103Z.

Test

DAPI staining (Fig. 3) and flow cytometry (Fig. 4) first showed that the chromosome content of DGF-298-103Z was higher than that of DGF-298. We then verified the chromosome number and chromosome stability over successive passages by serial passaging followed by PCR amplification. Two amplified fragments appeared in DGF-298-103Z: the 724 bp fragment of the wild-type chromosome and a 1925 bp fragment of the engineered chromosome (Fig. 5).

//img_wrap

Fig3 (top)

Fig3 (bottom)

Fig. 3 DAPI staining results of DGF-298 (top) and DGF-298-103Z (bottom)

Fig4

Fig. 4 Flow cytometry results

Fig5

Fig. 5 Strain DGF-298-103Z shows two amplified fragments: a 724 bp fragment from the wild-type chromosome and a 1925 bp fragment from the engineered chromosome.

//img_wrap

Next, we were curious about how the chromosomal changes would affect the bacterial transcriptome and, in turn, the bacterial phenotype. We therefore sequenced the transcriptome of DGF-298-103Z and found numerous genes that are differentially expressed relative to DGF-298 (Fig. 6).

//img_wrap

Fig6

Fig. 6 Differentially expressed genes between the polyploid and the haploid, with 271 genes up-regulated and 337 genes down-regulated

//img_wrap

Learn

Using DAPI staining, flow cytometry, and PCR amplification, we confirmed that we had successfully constructed the polyploid E. coli strain DGF-298-103Z. We then found that numerous genes are differentially expressed between the polyploid and the haploid.

3^rd iteration: screening key genes by parsing transcriptomic data through GSSM modeling

Design

In the previous iteration, transcriptome analysis revealed numerous differentially expressed genes between polyploids and haploids. We wanted to know how the differential expression of these genes affects the metabolic network: which expression changes favor polyploids, and which are detrimental to them. In subsequent experiments, we aim to increase the rate of biomass accumulation in polyploids by changing the expression fold of these genes.

Build

The metabolic network model of DGF-298 was constructed in the modeling part (see Dry Lab/modeling/Part 1.1 for details), and we used it to analyze the effect of the differentially expressed genes on the distribution of metabolic flux. In genome-scale metabolic modeling (GSSM), the metabolic network at steady state is expressed as follows:

\[S\ast V=0\]

where S is the stoichiometric matrix of metabolites and V is the vector of reaction fluxes. To incorporate the transcriptome data we added a weight matrix W, which contains the expression fold changes converted to reaction-level weights and is applied to V element-wise. Detailed steps are given in Dry Lab/modeling/Part 1.3.

\[S\ast\left(V\circ W\right)=0\]
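To make the steady-state condition concrete, the sketch below evaluates S · (V ∘ W) for a three-reaction toy chain. The network, the flux values, and the reading of W as an element-wise multiplier on the fluxes are assumptions made for illustration; the project's actual procedure is the one described in Dry Lab/modeling/Part 1.3.

```python
def steady_state_residual(S, v, w):
    """Return S · (v ∘ w): the net production rate of each metabolite."""
    weighted = [vi * wi for vi, wi in zip(v, w)]
    return [sum(s_ij * x for s_ij, x in zip(row, weighted)) for row in S]


# Toy chain (hypothetical): uptake -> A, A -> B, B -> biomass.
# Rows are metabolites A and B; columns are the three reactions.
S = [
    [1, -1, 0],
    [0, 1, -1],
]
v = [2.0, 2.0, 2.0]

balanced = steady_state_residual(S, v, [1.0, 1.0, 1.0])
perturbed = steady_state_residual(S, v, [1.0, 0.5, 1.0])

print(balanced)   # every metabolite is balanced
print(perturbed)  # A accumulates and B is depleted
```

With all weights equal to 1 the residual is zero. Halving the weight of the middle reaction breaks the balance, so the remaining fluxes have to be re-solved; finding fluxes that restore the balance while maximizing the biomass reaction is what flux balance analysis does on the full model.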

Test

We performed flux balance analysis (FBA) on the modified metabolic network and calculated that the decreased rates of two reactions, NADH16 and PFK, severely reduced the biomass reaction rate of polyploid DGF-298-103Z (Fig. 7).

//img_wrap

Fig7

Fig. 7 Effect of different gene expression fold changes on the biomass reaction

//img_wrap

Learn

We found genes whose elevated expression favors polyploid growth and production; for example, up-regulation of the reaction named FUM favors the accumulation of carboxylic acid compounds in polyploids. We also found genes whose down-regulation is detrimental, such as pfkB and nuoF, whose down-regulation reduces the biomass reaction rate of polyploids.

4^th iteration: characterizing the strength of combined regulatory elements

Design

In the previous cycle, transcriptomic data combined with genome-scale metabolic modeling (GSSM) of DGF-298 showed that the down-regulation of pfkB and nuoF is detrimental to polyploid growth and production. We therefore wanted to increase the expression of these two genes through combinatorial regulation. The first step was to characterize combinations of different promoters and RBSs in DGF-298-103Z.

Build

We combined four promoters (J23100, J23109, J23110, and J23116) with four RBSs (RBS30, RBS33, RBS34, and RBS35) and constructed plasmids to characterize the expression strength of these combinations in DGF-298-103Z. The specific combinations are shown in Table 2.

Table 2 Combinations and designations of promoters and RBSs//caption

Name	Promoter (nuoF)	RBS (nuoF)	Promoter (pfkB)	RBS (pfkB)
UU	J23110	RBS34	J23100	RBS35
UM	J23100	RBS35	J23116	RBS34
UD	J23110	RBS34	J23109	RBS33
MU	J23116	RBS34	J23100	RBS35
MM	J23116	RBS34	J23100	RBS30
MD	J23116	RBS34	J23109	RBS33
DU	J23109	RBS33	J23110	RBS34
DM	J23109	RBS33	J23116	RBS34
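For downstream scripts, Table 2 can be held in a small lookup structure. The sketch below is one possible encoding; the `Element` type, the `COMBINATIONS` name, and the `describe` helper are assumptions introduced here, and the entries simply transcribe the table rows.

```python
from typing import NamedTuple


class Element(NamedTuple):
    promoter: str
    rbs: str


# Designation -> (element driving nuoF, element driving pfkB), as in Table 2.
COMBINATIONS = {
    "UU": (Element("J23110", "RBS34"), Element("J23100", "RBS35")),
    "UM": (Element("J23100", "RBS35"), Element("J23116", "RBS34")),
    "UD": (Element("J23110", "RBS34"), Element("J23109", "RBS33")),
    "MU": (Element("J23116", "RBS34"), Element("J23100", "RBS35")),
    "MM": (Element("J23116", "RBS34"), Element("J23100", "RBS30")),
    "MD": (Element("J23116", "RBS34"), Element("J23109", "RBS33")),
    "DU": (Element("J23109", "RBS33"), Element("J23110", "RBS34")),
    "DM": (Element("J23109", "RBS33"), Element("J23116", "RBS34")),
}


def describe(name):
    """Return a one-line summary of a designation from Table 2."""
    nuof, pfkb = COMBINATIONS[name]
    return (f"{name}: nuoF <- {nuof.promoter}-{nuof.rbs}, "
            f"pfkB <- {pfkb.promoter}-{pfkb.rbs}")


print(describe("DU"))
```

Keeping the two elements of each designation as a pair makes it easy to look up which promoter and RBS drive each gene when comparing strains later.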

Test

We electrotransformed the plasmids containing the above elements into DGF-298-103Z. For characterization, we measured fluorescence with a microplate reader at an excitation wavelength of 590 nm and an emission wavelength of 645 nm and obtained the fluorescence intensity values (a.u.) by calculating the ratio of fluorescence intensity to OD600. The results are shown in Fig. 8:

//img_wrap

Fig8

Fig. 8 Characterized strength of each combined regulatory element in DGF-298-103Z

//img_wrap

Learn

We obtained promoter and RBS combinations of different strengths in DGF-298-103Z, which lays the foundation for the next step, the combinatorial regulation of nuoF and pfkB.

5^th iteration: combined regulation of nuoF and pfkB enhances polyploid growth and production

Design

Based on the strengths of the combined elements obtained in the previous round, we selected elements with high and low strengths to construct plasmids containing nuoF and pfkB.

Build

We constructed four plasmids carrying the elements MU, UD, DU, and DM, respectively, each with the genes pfkB and nuoF downstream (Fig. 9). After electrotransformation, we characterized the growth rate of the resulting strains.

//img_wrap

Fig9

Fig. 9 Plasmid nouf-DU, which contains element DU and the genes nuoF and pfkB

Fig10

Fig. 10 Combined regulation of pfkB and nuoF

//img_wrap

Test

After transformation of the nuoF and pfkB genes at different loci, RNA was extracted and subjected to reverse transcription PCR and qPCR for transcript-level analysis.
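Fold changes from qPCR are commonly computed with the 2^-ΔΔCt method. The source does not state which method was used, so the sketch below is an assumption: it presumes a reference gene, roughly 100% amplification efficiency, and hypothetical Ct values.

```python
def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ΔΔCt method (assumed, not confirmed
    by the source). Each ΔCt normalizes the target gene to a reference gene."""
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: in the engineered strain the target amplifies
# two cycles earlier than in the control, with an unchanged reference gene.
print(fold_change(20.0, 15.0, 22.0, 15.0))
```

Each cycle of earlier amplification corresponds to a doubling of template, so a two-cycle shift corresponds to a four-fold increase under these assumptions.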

//img_wrap

Fig11 (left)

Fig. 11 Fold change in transcript expression levels of different genes

Fig12

Fig. 12 Variation of OD with time for different strains of bacteria

//img_wrap

We found that after modular up-regulation, only the UD and MU combinations advanced the logarithmic growth phase of the polyploid strain and partially restored the growth rate and biomass reaction rate, while other modes of regulation (e.g., DM and DU) showed no significant differences. We reasoned that this may be due to the extremely complex metabolic network in polyploid cells, in which the products of the nuoF and pfkB genes catalyze steps of electron transport and glycolysis, respectively, so the two genes likely need to be regulated in a suitable combination in order to restore cell growth.

Next, we verified PHB production capacity and found that DGF-298-103Z produced significantly more PHB than DGF-298 and W3110 (Fig. 13).
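Growth curves such as those in Fig. 12 are often compared through the specific growth rate during exponential phase. The sketch below shows the standard two-point calculation; it is not taken from the team's analysis, and the OD readings are hypothetical.

```python
import math


def specific_growth_rate(t1, od1, t2, od2):
    """Specific growth rate mu (per hour) between two OD readings taken
    in exponential phase: mu = ln(OD2 / OD1) / (t2 - t1)."""
    if t2 <= t1 or od1 <= 0 or od2 <= 0:
        raise ValueError("need t2 > t1 and positive OD values")
    return math.log(od2 / od1) / (t2 - t1)


def doubling_time(mu):
    """Doubling time in hours for a given specific growth rate."""
    return math.log(2) / mu


# Hypothetical readings: OD rises from 0.1 to 0.4 between hour 2 and hour 4.
mu = specific_growth_rate(2.0, 0.1, 4.0, 0.4)
print(round(mu, 3), round(doubling_time(mu), 2))
```

An earlier onset of logarithmic phase shows up as the same slope reached at an earlier time point, whereas a restored growth rate shows up as a larger mu.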

//img_wrap

Fig13

Fig. 13 Sugar depletion (left), OD change (center), and PHB yield (right) during fermentation of DGF-298-103Z, DGF-298, and W3110

//img_wrap

Learn

We verified that after combinatorial regulation of promoter and RBS, the expression levels of both genes were elevated and the logarithmic phase of some strains was advanced. In addition, the PHB output of DGF-298-103Z was significantly higher than that of DGF-298 and W3110.

6^th iteration: deep learning combined with multi-sensors to predict PHB yield

Design

During fermentation in the previous iteration we found that PHB content is difficult to measure, since it usually has to be determined by gas chromatography. We therefore wanted to train a deep learning model that predicts PHB concentration in real time from data that are easy to measure during fermentation. At the same time, we set out to build hardware that integrates multiple sensors to measure quantities such as CO2 concentration and temperature. Fig. 14 shows a schematic of the concept.

//img_wrap

Fig14

Fig. 14 Schematic diagram of real-time prediction of PHB concentration combining multiple sensors and a fermenter

//img_wrap

Build I

For the deep learning model we constructed, the update process for each layer of LSTM can be divided into the following steps:

Forget gate \(f_t\): determines how much of the previous memory to forget. It is computed from the current input \(x_t\) and the hidden state of the previous moment \(h_{t-1}\) through a sigmoid function:

\[f_t=\sigma\left(W_f\left[h_{t-1},x_t\right]+b_f\right)\]

where://no-indent

\(W_f\) is the weight matrix of the forget gate.
\(b_f\) is the bias of the forget gate.

Input gate \(i_t\): determines how strongly the current input \(x_t\) updates the cell state. The candidate cell state \(\widetilde{C_t}\) is the new candidate information obtained by transforming the current input, and it determines the new information to be added to the cell state:

\[i_t=\sigma\left(W_i\left[h_{t-1},x_t\right]+b_i\right)\]

\[\widetilde{C_t}=\tanh{\left(W_C\left[h_{t-1},x_t\right]+b_C\right)}\]

\(W_i\) and \(W_C\) are the weight matrices of the input gate and the candidate cell state, respectively.
\(b_i\) and \(b_C\) are the corresponding bias terms.

The cell state \(C_t\) is jointly determined by the output of the forget gate and the information controlled by the input gate. First, the forget gate \(f_t\) multiplies the cell state of the previous moment \(C_{t-1}\) to determine how much past information to retain; then, the input gate \(i_t\) scales the candidate cell state of the current moment \(\widetilde{C_t}\), which is added to form the new cell state:

\[C_t=f_t\odot C_{t-1}+i_t\odot\widetilde{C_t}\]

Output gate \(o_t\): determines the hidden state \(h_t\) of the current moment. Like the other gates, it is computed from the current input \(x_t\) and the hidden state of the previous moment \(h_{t-1}\) through a sigmoid function:

\[o_t=\sigma\left(W_o\left[h_{t-1},x_t\right]+b_o\right)\]

The final hidden state \(h_t\) combines the output gate \(o_t\) with the updated cell state: \(C_t\) is passed through the nonlinear tanh function and then gated by \(o_t\):

\[h_t=o_t\odot\tanh{\left(C_t\right)}\]
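The gate equations above can be traced with a single LSTM step in plain Python. This is a minimal sketch for reading the formulas, not the trained model: the layer sizes, the all-zero parameters, and the input values are hypothetical, and a real implementation would use a deep learning framework.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(W, b, z):
    """Compute W z + b for a list-of-lists matrix W and list vectors b, z."""
    return [sum(w * x for w, x in zip(row, z)) + bias
            for row, bias in zip(W, b)]


def lstm_step(x_t, h_prev, c_prev, params):
    """One LSTM update following the forget, input, and output gate equations."""
    z = h_prev + x_t  # list concatenation: [h_{t-1}, x_t]
    f_t = [sigmoid(a) for a in affine(params["W_f"], params["b_f"], z)]
    i_t = [sigmoid(a) for a in affine(params["W_i"], params["b_i"], z)]
    c_tilde = [math.tanh(a) for a in affine(params["W_C"], params["b_C"], z)]
    o_t = [sigmoid(a) for a in affine(params["W_o"], params["b_o"], z)]
    # C_t = f_t ⊙ C_{t-1} + i_t ⊙ C~_t
    c_t = [f * c + i * g for f, c, i, g in zip(f_t, c_prev, i_t, c_tilde)]
    # h_t = o_t ⊙ tanh(C_t)
    h_t = [o * math.tanh(c) for o, c in zip(o_t, c_t)]
    return h_t, c_t


# Hypothetical sizes: 4 input features, 2 hidden units, all-zero parameters.
n_in, n_hidden = 4, 2
zeros = [[0.0] * (n_hidden + n_in) for _ in range(n_hidden)]
params = {name: zeros for name in ("W_f", "W_i", "W_C", "W_o")}
params.update({name: [0.0] * n_hidden for name in ("b_f", "b_i", "b_C", "b_o")})

h_t, c_t = lstm_step([37.0, 7.0, 1.2, 30.0], [0.0, 0.0], [1.0, 1.0], params)
print(h_t, c_t)
```

With all parameters set to zero every gate evaluates to 0.5 and the candidate state to 0, so the new cell state is half the previous one. This makes the roles of the forget and input gates easy to check by hand.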

We used minute-by-minute data on temperature, pH, glucose consumption, and dissolved oxygen as features. Because the LSTM uses the features from the first x time points to predict the data at time point x+1, we can predict the entire fermentation process in real time.
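The windowing described above can be sketched as follows. The function name, the window length, and the example rows are hypothetical; how the PHB measurement enters the target is not specified in the text, so the sketch simply pairs each window with the row that follows it.

```python
def make_windows(series, window):
    """Pair each run of `window` consecutive rows with the row that follows."""
    samples = []
    for start in range(len(series) - window):
        inputs = series[start:start + window]
        target = series[start + window]
        samples.append((inputs, target))
    return samples


# Hypothetical per-minute rows:
# (temperature, pH, glucose consumption, dissolved oxygen)
rows = [
    (37.0, 7.0, 0.0, 40.0),
    (37.0, 6.9, 0.5, 38.0),
    (37.1, 6.9, 1.1, 35.0),
    (37.0, 6.8, 1.8, 33.0),
]

samples = make_windows(rows, window=2)
for inputs, target in samples:
    print(len(inputs), "rows ->", target)
```

Sliding the window forward one minute at a time is what allows a prediction to be refreshed continuously while the fermentation is running.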

Build II

We designed a hardware device that integrates multiple sensors, but it was still in the testing phase at the project deadline (we will continue to refine the hardware after iGEM is finished). The schematic is shown in Fig. 15 and the physical device in Fig. 16.

//img_wrap

Fig15

Fig. 15 Hardware Blueprint

Fig16

Fig. 16 Hardware Diagram

//img_wrap

Test

This part focuses on validating the model's performance and generalization ability. Since the hardware was still in the testing stage as the project submission deadline approached, we used data from Prof. Xia Wang to train the model. The first two fermentations were used to train the model, and the third fermentation was used to test it. The training loss is shown in Fig. 17 and reaches 0.000174 after 100 epochs. On the held-out third fermentation, the R² score reaches 0.9871 and the mean square error is 0.19 (Fig. 18, Table 3).

//img_wrap

Fig17

Fig. 17 Relationship between Loss and Epoch

Fig18

Fig. 18 Predicted results

//img_wrap

Table 3: Validation indicators for the model//caption

Metric	Value
Mean Square Error (MSE)	0.1914
Root Mean Square Error (RMSE)	0.4375
Mean Absolute Error (MAE)	0.3833
R² Score	0.9871
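The four indicators in Table 3 follow standard definitions, which the sketch below implements in plain Python. The function name and the example values are hypothetical; only the final line uses a number from Table 3, to show that the reported RMSE is the square root of the reported MSE.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE, MAE, and R² for paired true and predicted values."""
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R² is undefined when y_true is constant")
    r2 = 1.0 - (mse * n) / ss_tot  # mse * n is the residual sum of squares
    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2}


# Hypothetical PHB concentrations (not project data).
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(metrics)

# Consistency check on Table 3: RMSE should equal sqrt(MSE).
table_rmse = round(math.sqrt(0.1914), 4)
print(table_rmse)
```

R² compares the residual error with the variance of the measured values, so a value close to 1 means the predictions track most of the variation in PHB concentration.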

Learn

We found that the deep learning approach fits the PHB concentration well, generalizes to a fermentation it was not trained on, and handles nonlinear relationships well. We successfully constructed a model that can predict PHB concentration in real time, and we will further improve the hardware.

Reference

//referrence

[1] Wang, S., Chen, X., Jin, X., Gu, F., Jiang, W., Qi, Q., & Liang, Q. (2023). Creating polyploid Escherichia coli and its application in efficient L-threonine production. Advanced Science (Weinheim), e2302417–e2302417. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC10625114

[2] Matteau D, Champie A, Grenier F, et al. Complete sequence of the genome-reduced Escherichia coli DGF-298[J]. Microbiol Resour Announc, 2023, 12(11): e0066523. DOI: 10.1128/MRA.00665-23.

[3] Couto JM, McGarrity A, Russell J, et al. The effect of metabolic stress on genome stability of a synthetic biology chassis Escherichia coli K12 strain[J]. Microb Cell Fact, 2018, 17(1): 8. DOI: 10.1186/s12934-018-0858-2.

[4] Tack ILMM, Nimmegeers P, Akkermans S, et al. A low-complexity metabolic network model for the respiratory and fermentative metabolism of Escherichia coli[J]. PLoS One, 2018, 13(8): e0202565. DOI: 10.1371/journal.pone.0202565.

[5] Wang J, Huang J, Liu S. The production, recovery, and valorization of polyhydroxybutyrate (PHB) based on circular bioeconomy[J]. Biotechnol Adv, 2024, 72: 108340. DOI: 10.1016/j.biotechadv.2024.108340.

[6] Ebrahim A, Lerman JA, Palsson BO, Hyduke DR. COBRApy: COnstraints-Based Reconstruction and Analysis for Python. BMC Syst Biol. 2013 Aug 8;7:74. doi: 10.1186/1752-0509-7-74. PMID: 23927696; PMCID: PMC3751080.

[7] Auriol C, Bestel-Corre G, Claude JB, Soucaille P, Meynial-Salles I. Stress-induced evolution of Escherichia coli points to original concepts in respiratory cofactor selectivity. Proc Natl Acad Sci U S A. 2011 Jan 25;108(4):1278-83. doi: 10.1073/pnas.1010431108. Epub 2011 Jan 4. PMID: 21205901; PMCID: PMC3029715.

[8] Jiang W, Yang X, Gu F, Li X, Wang S, Luo Y, Qi Q, Liang Q. Construction of Synthetic Microbial Ecosystems and the Regulation of Population Proportion. ACS Synth Biol. 2022 Feb 18;11(2):538-546. doi: 10.1021/acssynbio.1c00354. Epub 2022 Jan 19. PMID: 35044170.

//referrence
