Overview

Our project is dedicated to the efficient synthesis of 5-aminolevulinic acid (5-ALA) by metabolic engineering of E. coli. First, we studied the stability and activity of 5-aminolevulinate synthase (ALAS) in depth, and improved both its activity and its genetic stability by screening ALAS from different sources, modifying the enzyme, and integrating ALAS into the genome with the CRISPR-associated transposons (CASTs) system. Meanwhile, we used an enzyme-constrained model to predict relevant targets in the 5-ALA synthesis pathway and up-regulated the expression of selected genes to enhance 5-ALA synthesis. In addition, we established an efficient screening method for high-yielding strains based on ARTP mutagenesis and droplet microfluidics.

Fig.1 Project introduction
Enzyme Modification and CRISPR-Associated Transposons (CASTs) System

Through literature research, we selected genes that can effectively express ALAS (RChemA-ALAS, AFhemA-ALAS, RPhemA-ALAS) from Rhodobacter capsulatus, Agrobacterium, and Rhodopseudomonas palustris, respectively. Each gene was cloned into the pET-24a plasmid vector and introduced into E. coli BL21(DE3) for expression (the resulting strains are hereinafter referred to as Rc, Af, and Rp, respectively). E. coli BL21(DE3) was used as a control, and the fermentation results are shown in Fig.2.

Fig.2 Fermentation levels corresponding to ALAS genes from different sources

The growth rate of the strains slowed after 24 h, and the 5-ALA production of each group leveled off at the same time. At 24 h, the average 5-ALA yield of E. coli BL21(DE3) was only 0.05 g/L, whereas the average yields of Af, Rc, and Rp, carrying ALAS genes from Agrobacterium, Rhodobacter capsulatus, and Rhodopseudomonas palustris, reached 1.45, 1.67, and 1.55 g/L, which is 28.0-, 32.4-, and 30.0-fold higher than the control. E. coli carrying RChemA-ALAS showed the highest 5-ALA production, so we adopted this gene for the subsequent experiments.
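The comparison with the control can be checked directly from the yields quoted above. The short sketch below computes the relative increase over the control, (y - y_control) / y_control, for each strain; the function and variable names are illustrative and not part of the original work.

```python
# Average 5-ALA yields at 24 h quoted in the text (g/L).
control_yield = 0.05  # E. coli BL21(DE3) without an introduced ALAS gene
strain_yields = {"Af": 1.45, "Rc": 1.67, "Rp": 1.55}


def fold_increase(yield_g_per_l: float, control_g_per_l: float) -> float:
    """Relative increase over the control: (y - y_control) / y_control."""
    return (yield_g_per_l - control_g_per_l) / control_g_per_l


fold_increases = {
    name: round(fold_increase(y, control_yield), 1)
    for name, y in strain_yields.items()
}

for name, value in fold_increases.items():
    print(f"{name}: {value}-fold higher than the control")
```

Expressed this way, the three strains produce roughly 28 to 32 times more 5-ALA than the control.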

To further improve the activity of ALAS and promote the production of 5-ALA, we carried out enzyme modification. We predicted a series of mutations with artificial intelligence and constructed the corresponding plasmids for fermentation validation, using E. coli carrying unmutated RChemA-ALAS (hereafter RC) as a control.

Table 1 Predicted 12 mutation sites
RChemA-ALASG58A RChemA-ALASG60A RChemA-ALASS139E RChemA-ALASA218I RChemA-ALASG244A RChemA-ALAST245Q
RChemA-ALASV252W RChemA-ALASI292A RChemA-ALASS378M RChemA-ALASM393L RChemA-ALASK13R RChemA-ALASI203L

The fermentation results are shown in Fig.3 and Fig.4. As can be seen from Fig.3, the OD600 value increased steadily in each group without any major difference.

Fig.3 OD600 of the mutants and RC after fermentation

Fermentation yield measurements showed that the 5-ALA yield stabilized after 24 h. The results of the 24 h fermentation are shown in Fig.4. The 5-ALA yields of mutants RChemA-ALASG58A, RChemA-ALASG60A, RChemA-ALASS139E, RChemA-ALASG244A, RChemA-ALASI292A, RChemA-ALASS378M, and RChemA-ALASM393L were significantly higher than that of RC. Among them, the yields of RChemA-ALASG244A and RChemA-ALASS378M were 264% and 238% of that of RC, respectively. These results indicate that we identified a series of beneficial mutation sites, such as RChemA-ALASG244A and RChemA-ALASS378M, and that enzyme modification effectively improved the activity of ALAS.

Fig.4 5-ALA yield of the mutants compared with RC after fermentation

During fermentation validation, we found that 5-ALA production plateaued or even decreased after 24 h of fermentation. We suspected that expression from the plasmid declined after a certain fermentation time, most likely because the free plasmids were gradually lost. To test this hypothesis, we performed plasmid loss experiments and found that the plasmid loss rate was as high as 55% after 36 h of incubation (Fig.5). Improving the stability of plasmid-borne expression is therefore crucial for improving the efficiency of enzyme expression.

Fig.5 Plasmid loss rate versus time

To improve the genetic stability of ALAS expression in E. coli, we used the CRISPR-associated transposons (CASTs) system to integrate the ALAS gene fragment into the E. coli genome. CRISPR-associated transposons are a special class of mobile genetic elements that combine the RNA-guided targeting of the CRISPR system with the DNA mobility of transposons. CASTs enable precise insertion of large DNA segments at specific sites in the genome without creating DNA double-strand breaks. The system consists of the pQCasTns plasmid, which provides the targeting and transposition functions, and the ptDonor plasmid, which carries the ALAS fragment. After the CRISPR effector complex recognizes and binds the array target sequence, the N20 guide sequence directs the transposase Tns to insert ALAS at the targeted site. The origin, characteristics, and maps of the two plasmids are shown below:

pQCasTns(Ptr)-array8 plasmid: carries the sequences that target the insertion site. The synthesized sequence and its restriction-enzyme ligation sites can be expressed as [NcoI-“Repeat Sequence 1 + Guide Sequence + Repeat Sequence 2”+ “”+ BamHI], and the selection marker is kanamycin resistance (kan).

ptDonor plasmid: constructed by Gibson assembly; the selection marker is chloramphenicol resistance (cm).

Fig.6 Plasmid map of pQCasTns: the TetR (tetracycline repressor) protein binds tightly to the operator sequences and prevents formation of the transcription initiation complex, thus repressing expression of the tetracycline-regulated gene; the spacer is the target array sequence, which can be specifically recognized by the N20 sequence.
Fig.7 Plasmid map of ptDonor

We introduced the two plasmids, pQCasTns and ptDonor, into E. coli by electroporation for in vivo expression (the medium was supplemented with kan and cm). To improve the efficiency of transposition, we carried out multiple rounds of induction with tetracycline. After successful transposition, we used the pFree plasmid to cure the two free plasmids, both to ensure that the ALAS gene integrated into the E. coli genome functions stably and to prevent the introduced plasmids from competing with other functional genes for cellular resources.

The strains that had been successfully transposed were subjected to fermentation. The results for 40 transposed strains are shown in Fig.8, with RC, the strain that did not undergo transposition, serving as the control. Yield stabilized after 36 h. Several strains, such as 2, 3, and 4, had significantly higher yields than RC. Among the 40 transposed strains, strain 2 had the highest yield, 74% higher than that of the control. The yields of the other transposed strains varied widely, indicating that the insertion site has a strong influence on the yield of 5-ALA. These results suggest that ALAS integrated into the E. coli genome is expressed more stably than ALAS expressed from a free plasmid.

Fig.8 Fermentation results of the 40 transposed strains (RC as a control)
Enzyme-Constrained Model

Intracellular metabolism in microbes is highly complex, and no single research tool can systematically reveal its regulatory mechanisms or efficiently deliver a desired phenotype. A genome-scale metabolic model (GSMM) is a mathematical model that characterizes gene-protein-reaction (GPR) relationships and has been widely used to analyze network properties, predict cellular phenotypes, guide strain design, and analyze interactions. An enzyme-constrained model, built by integrating large-scale enzyme kinetics and proteomics data into a GSMM, predicts phenotypes more accurately.

In our experiments, we modified ALAS to improve the yield of 5-ALA. Using the enzyme-constrained model together with different genetic modification methods, we aimed to simulate, at a global level, how strategies such as 1) elimination of competing pathways, 2) reinforcement of key genes in the synthetic pathway, 3) elimination of feedback inhibition, and 4) cofactor optimization affect the target product, and thereby to screen and combine metabolic engineering targets to guide our experiments. This approach supports rational strain design, improves the efficiency of metabolic engineering, and accounts for the effect of genetic perturbations on the whole intracellular metabolism, which helps to achieve precise regulation of metabolic flux.

Table.2 Enzyme-constrained models versus traditional genome-scale metabolic models
Characteristics | Enzyme-constrained models | Genome-scale metabolic models
Enzyme kinetics considerations | Consider enzyme concentration and catalytic efficiency to provide more realistic reaction dynamics | Usually ignore enzyme kinetics and rely on the metabolic steady-state assumption
Predictive accuracy | Enable more precise prediction of metabolic flux and product generation | Lower precision, limited by assumptions and static modeling
Response to environmental change | Flexibility to model the effects of environmental changes on metabolism | Weak response to environmental change
Model complexity | More sophisticated, capable of capturing subtle biochemical processes | Simpler, easier to build and understand
Scope of application | Ideal for studies requiring detailed metabolic regulation | Suitable for large-scale genome analysis
Data requirements | Detailed enzyme kinetic data required | Primary reliance on genomic and metabolic pathway information
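To make the difference concrete, the following is a minimal toy sketch, not the model used in this project. It assumes two hypothetical routes to a product, a substrate uptake bound, and a single total enzyme budget in which each route costs enzyme mass in proportion to its flux. All numbers and names are invented for illustration, and the tiny linear program is solved by brute-force vertex enumeration rather than by a real constraint-based modeling package.

```python
from itertools import combinations


def solve_2d_lp(objective, constraints, tol=1e-9):
    """Maximise objective . v subject to a1*v1 + a2*v2 <= b for each (a1, a2, b).

    Brute-force vertex enumeration; adequate only for this two-flux toy problem.
    """
    best_value, best_point = None, None
    for (a1, a2, b), (c1, c2, d) in combinations(constraints, 2):
        det = a1 * c2 - a2 * c1
        if abs(det) < tol:  # parallel constraint lines never intersect
            continue
        # Cramer's rule for the intersection of the two constraint lines.
        v1 = (b * c2 - a2 * d) / det
        v2 = (a1 * d - b * c1) / det
        if any(p * v1 + q * v2 > r + tol for p, q, r in constraints):
            continue  # intersection violates another constraint
        value = objective[0] * v1 + objective[1] * v2
        if best_value is None or value > best_value:
            best_value, best_point = value, (v1, v2)
    return best_value, best_point


# All numbers below are hypothetical, chosen only to make the effect visible.
product_yield = (1.0, 0.6)   # product formed per unit flux through route 1 / route 2
substrate_limit = 10.0       # v1 + v2 <= substrate uptake bound
enzyme_cost = (0.05, 0.01)   # enzyme mass needed per unit flux (roughly MW / kcat)
enzyme_budget = 0.3          # total enzyme mass available

stoichiometric = [
    (-1.0, 0.0, 0.0),             # v1 >= 0
    (0.0, -1.0, 0.0),             # v2 >= 0
    (1.0, 1.0, substrate_limit),  # substrate uptake bound
]
enzyme_constrained = stoichiometric + [
    (enzyme_cost[0], enzyme_cost[1], enzyme_budget),
]

gsmm_value, gsmm_fluxes = solve_2d_lp(product_yield, stoichiometric)
ec_value, ec_fluxes = solve_2d_lp(product_yield, enzyme_constrained)
print("stoichiometry only:", gsmm_value, gsmm_fluxes)
print("with enzyme budget:", ec_value, ec_fluxes)
```

In this toy case the stoichiometry-only problem sends all flux through the high-yield route, while the enzyme budget forces a split between the two routes and lowers the predicted maximum. This is the kind of trade-off an enzyme-constrained model can expose and a purely stoichiometric model cannot.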

To evaluate the effectiveness of the target genes, we selected ten of the predicted targets for up-regulation. The experiment used the ZZ-2 strain (transposed strain 2) as the chassis. For the genes to be up-regulated, we used the strong promoter pJ23119 for overexpression, together with an efficient ribosome binding site (RBS) sequence.

Fig.9 Map of targets to be up-regulated

Subsequently, we conducted fermentation experiments with these 10 engineered strains using ZZ-2 strain as a control group. We verified the effect of these gene regulation strategies by OD600 measurement and 5-ALA yield measurement during the fermentation process. The yield stabilized after 36 h and the fermentation results are shown in Fig.10.

Fig.10 5-ALA production at 36 h in engineered strains of ZZ-2 with target up-regulation

Table.3 Up-regulated genes and their corresponding expression products
Abbreviation | Expression product
adk | Adenylate kinase
tsE | Threonine synthase
baE | Bifunctional aspartate kinase/homoserine dehydrogenase
sucC | Succinyl-CoA synthetase beta subunit
thrB | Homoserine kinase
ppc | Phosphoenolpyruvate carboxylase
apE | Aspartate kinase
thrS | Threonine-tRNA synthetase
sucD | Succinyl-CoA synthetase alpha subunit
asd | Aspartate-semialdehyde dehydrogenase

The fermentation results showed that, among the 10 up-regulated targets, baE, ppc, apE, and asd gave the best results. The yields of the corresponding strains were 32%, 36%, 30%, and 39% higher than the control at 36 h, respectively, indicating that our model was reliable and that these targets were effective.

These targets are closely linked to aspartate metabolism and the TCA cycle. Among them, baE, apE, and asd encode bifunctional aspartate kinase/homoserine dehydrogenase, aspartate kinase, and aspartate-semialdehyde dehydrogenase, respectively, which are key enzymes of amino acid metabolism in microorganisms and can facilitate glycine synthesis. ppc encodes phosphoenolpyruvate carboxylase, which catalyzes the reaction of phosphoenolpyruvate with carbon dioxide to produce oxaloacetate, replenishing the TCA cycle. Up-regulation of these genes may further drive the TCA cycle, facilitating the production of 5-ALA precursors such as succinyl-CoA, which in turn drives the synthesis of 5-ALA.

Droplet Microfluidic High-Throughput Screening

After completing the modification of the E. coli strain, we designed a high-throughput screening technique based on a microfluidic chip. By screening for a suitable water-in-oil system, we encapsulated the modified strain in aqueous droplets dispersed in the oil phase. We then tried both fluorescence-signal screening and biomass screening to identify high-yielding strains.

1. Exploration of the optimal ARTP mutagenesis time in E. coli

To obtain more mutations, we performed ARTP mutagenesis on the engineered E. coli strains. We set up a gradient of exposure times, plated the treated cells, and compared growth after incubation (Fig.11). Taking a lethality rate of 95% or more as the criterion for successful mutagenesis, we set 70 s as the optimal ARTP mutagenesis time for E. coli.
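The selection rule described above can be sketched as follows. The sketch assumes the usual definition of lethality from plate counts, (1 - treated / control) × 100, which the text does not state explicitly. The colony counts are hypothetical placeholders rather than measured data, and the function names are illustrative.

```python
def lethality_percent(control_colonies: int, treated_colonies: int) -> float:
    """Lethality (%) = (1 - treated / control) * 100 from plate colony counts."""
    if control_colonies <= 0:
        raise ValueError("control plate must have at least one colony")
    return (1.0 - treated_colonies / control_colonies) * 100.0


def shortest_effective_exposure(counts_by_time, control_colonies, threshold=95.0):
    """Return the shortest exposure time (s) whose lethality reaches the threshold."""
    for seconds in sorted(counts_by_time):
        if lethality_percent(control_colonies, counts_by_time[seconds]) >= threshold:
            return seconds
    return None  # no tested exposure time reached the threshold


# Hypothetical colony counts, for illustration only (not measured data).
example_control = 400
example_counts = {10: 310, 30: 150, 50: 48, 70: 14, 90: 3}

chosen_time = shortest_effective_exposure(example_counts, example_control)
print("shortest exposure reaching 95% lethality (s):", chosen_time)
```

Choosing the shortest exposure that meets the threshold is one reasonable reading of "optimal" here; a longer exposure would also satisfy the criterion but leave fewer survivors to screen.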

Fig.11 Growth of E. coli at different mutagenic time

2. Exploration of the droplet encapsulation system

Droplets are formed by the interfacial tension and shear force between the oil and water phases. A surfactant is added to stabilize the droplets and prevent coalescence, and the flow rate affects the size of the droplets generated. The oil phase, surfactant, and flow rate are therefore critical to droplet formation.

To maintain droplet stability, prevent droplet fusion, and support prolonged incubation, we explored the effects of different oil phases and surfactants on the droplets (Table 4). We tried a variety of systems as the external phase and observed droplet generation. The droplets formed by fluorinated oil with 1% 008-FluoroSurfactant in the PDMS chip were the most stable and homogeneous and did not emulsify easily during incubation, making this system the best match for our droplet generation conditions.

Table.4 Droplet systems tested and corresponding results
External phase | Surfactant | Results
Soybean Oil | / | Droplet fusion
 | 8% F108 | Droplet fusion
 | 2% Span80 | Droplets emulsify after 19 h
 | 3% Span80 | Droplets poorly homogenized, emulsified after 4 h
 | 2% EM90 | Droplets fused easily
 | 3% EM90 | Droplet homogeneity is good, long time without fusion and emulsification, difficult for bacteria to grow
ETPTA | / |
Soybean Oil | / | Droplet fusion
 | 1% Span80 | Droplets fused easily
 | 2% Span80 | Droplets do not fuse easily, emulsified after 19 h
 | 3% Span80 | Droplets do not fuse easily, emulsified after 10 h
 | 4% Span80 | Droplets do not fuse easily, emulsified after 4 h
 | 5% Span80 | Droplets do not fuse easily, emulsified after 2 h
2-Methyl Silicone Oil | 2% EM90 | Droplet fusion
 | 3% EM90 | Droplet fusion
Paraffin Oil | 3% EM90 | Droplet homogeneity is good, no fusion and emulsification for a long time, difficult for bacteria to grow
Fluorinated oil | 1% 008-FluoroSurfactant | Good droplet homogeneity, no fusion and emulsification for a long time, good growth of bacteria

The size of the droplets is closely related to the initial fluid velocity, the channel diameter of the droplet generation chip, and the properties of the inner and outer phase solutions, with the fluid velocity having the most significant impact. Guided by the literature and the droplet sizes we obtained, we tested the effect of different flow rates on droplet diameter and found that the diameter decreases as the outer phase flow rate increases and increases as the inner phase flow rate increases. Through experimentation, we determined that droplets with a diameter of approximately 90 μm are most suitable for subsequent screening of E. coli. We therefore selected an inner phase flow rate of 1 mL/h and an outer phase flow rate of 7 mL/h as the conditions for subsequent experiments.

The initial concentration of E. coli added to the internal phase solution and the single-cell encapsulation rate were calculated according to the Poisson distribution formula:

$$P(X = K) = \frac{\lambda^{K} e^{-\lambda}}{K!}$$

(K denotes the number of cells contained in a single droplet; λ denotes the average number of cells contained in each droplet.)

The average diameter of the droplet is 90 μm, so the droplet volume is V = 4/3 π × (90)^3 = 3053628 μm^3, from which it can be calculated that there are 3.27×10^5 droplets in 1 mL;

Let λ = 0.4 and λ = C/(3.27×10^5), then the initial concentration of E. coli is C = 130992 cells/mL;

Let K = 1, then the single-cell encapsulation rate is P = 0.268.
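The calculation above can be reproduced with a few lines of Python. This is a sketch with illustrative function names rather than the script used in the project. One point is worth checking against the measured droplet size: the sphere formula takes the radius, and the quoted figure of 3.27×10^5 droplets per mL is obtained when 90 μm is entered as the radius. If 90 μm is the diameter (radius 45 μm), the same steps give roughly eight times as many droplets per mL and therefore a correspondingly higher cell concentration for λ = 0.4. The sketch prints both cases.

```python
import math


def droplets_per_ml(radius_um: float) -> float:
    """Number of spherical droplets of the given radius in 1 mL (1e12 um^3)."""
    volume_um3 = 4.0 / 3.0 * math.pi * radius_um ** 3
    return 1e12 / volume_um3


def poisson_pmf(k: int, lam: float) -> float:
    """P(X = k) for a Poisson distribution with mean lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 0.4  # target mean number of cells per droplet, from the text

# Case 1: 90 um used directly in the r^3 term, as in the calculation above.
n_as_radius = droplets_per_ml(90.0)
cells_as_radius = lam * n_as_radius

# Case 2: 90 um treated as the diameter, i.e. a radius of 45 um.
n_as_diameter = droplets_per_ml(45.0)
cells_as_diameter = lam * n_as_diameter

single_rate = poisson_pmf(1, lam)

print(f"r = 90 um: {n_as_radius:.3g} droplets/mL, {cells_as_radius:.3g} cells/mL")
print(f"r = 45 um: {n_as_diameter:.3g} droplets/mL, {cells_as_diameter:.3g} cells/mL")
print(f"P(K = 1) at lambda = {lam}: {single_rate:.3f}")
```

The single-cell encapsulation rate depends only on λ, so it is the same in both cases.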

3. Droplet generation and incubation

We generated droplets with a diameter of about 90 μm (Fig.12) using a fluorinated oil system containing 1% 008-FluoroSurfactant at an internal-phase flow rate of 1 mL/h and an external-phase flow rate of 7 mL/h, and collected the droplets into the culture medium.

Fig.12 Real-time view of droplet generation

The generated droplets were incubated in a 37°C incubator and sampled regularly and placed under a microscope. A significant increase in the number of bacteria within the droplets was observed, indicating that E. coli was growing well within the droplets (Fig.13). Subsequent screening was carried out when cultured to the logarithmic growth phase of E. coli.

Fig.13 Droplet morphology at generation (left) and after 24 h of incubation (right)

4. Droplet screening and product determination

(1) Fluorescence screening

5-ALA is a precursor of protoporphyrin, and protoporphyrin exhibits a specific fluorescence signal under an excitation wavelength of 408 nm. The fluorescence signal detected from a strain should therefore be positively correlated with its ability to produce 5-ALA. We aimed to measure the fluorescence intensity of the protoporphyrin produced by E. coli at a 408 nm excitation wavelength and use it for screening. Initially, we observed the products in droplets containing E. coli with a fluorescence microscope (Fig.14). However, because we lacked a fluorescence screening device with a 408 nm wavelength, the fluorescence signal was unstable, and other excitation wavelengths gave weaker signals. In addition, the proportional relationship between protoporphyrin and the target product 5-ALA could not be verified experimentally. These factors led to less than ideal fluorescence screening results.

Fig.14 Fluorescence signal observation diagram of a droplet

(2) Biomass screening

Because the fluorescence screening scheme was unsatisfactory, we turned to the literature and found that 5-ALA is a raw material for the synthesis of heme, which is necessary for the growth of E. coli. The yield of 5-ALA is therefore related to the growth status of E. coli.

Since the gltX gene in E. coli is the key gene for the synthesis of 5-ALA in the C5 pathway, we first constructed a plasmid and introduced it into E. coli to inhibit the expression of gltX, and found that the growth of the strain was significantly inhibited. When we then introduced ALAS variants with different activities, the OD600 data showed that growth was partly restored and that the growth of E. coli was positively correlated with the production of 5-ALA (Fig.15). We therefore used this relationship, together with a growth-inhibited chassis carrying a low-activity ALAS, and chose absorbance as the readout of E. coli growth for screening.

Fig.15 Validation of the relationship between E. coli biomass (OD) and 5-ALA production

The first prerequisite for screening high-yielding 5-ALA strains using absorbance-activated droplet sorting (AADS) is to ensure that each droplet encapsulates a single cell. However, cell encapsulation is not uniform but follows a Poisson distribution, meaning that many droplets are empty while others may encapsulate multiple cells. Therefore, before the experiment we controlled the concentration of E. coli at 1.3×10^5 cells/mL (λ=0.4) to achieve as many single encapsulations as possible.
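The trade-off behind the choice of λ can be made explicit with a short sketch. It assumes ideal Poisson loading and uses illustrative names; it is not taken from the project's own analysis.

```python
import math


def droplet_occupancy(lam: float) -> dict:
    """Fractions of empty, single-cell and multi-cell droplets under Poisson loading."""
    p_empty = math.exp(-lam)
    p_single = lam * math.exp(-lam)
    p_multi = 1.0 - p_empty - p_single
    return {
        "empty": p_empty,
        "single": p_single,
        "multiple": p_multi,
        # Share of cell-containing droplets that hold exactly one cell.
        "single_among_occupied": p_single / (1.0 - p_empty),
    }


occupancy = droplet_occupancy(0.4)
for label, fraction in occupancy.items():
    print(f"{label}: {fraction:.3f}")
```

At λ = 0.4, about two thirds of the droplets are empty, but roughly four out of five occupied droplets contain a single cell. Raising λ would reduce the number of empty droplets at the cost of more multi-cell droplets.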

The AADS technique combines absorption spectroscopy with microfluidics: it distinguishes droplets by biomass (OD) by measuring their absorbance of 600 nm light, and thus sorts out the droplets in which E. coli grew faster (Fig.16).

Using a dielectrophoresis chip, we successfully sorted over 1000 positive droplets. We randomly selected 20 droplets from the sorted batch, demulsified them, and spread their contents on agar plates. Once colonies had grown on the plates, we conducted shake flask fermentation and measured 5-ALA production. The results showed that four strains (A3, A8, A15, A17) exhibited increased enzyme activity to varying degrees, with the highest enzyme activity increasing to 126.30% (Fig.17).

Fig.16 Droplet Sorting Video
Fig.17 Screening strain 5-ALA yield validation plot

In summary, we ultimately decided to screen for high-yield strains by leveraging differences in bacterial biomass. Using fluorinated oil with 1% fluorinated surfactant as the outer phase system and a PDMS chip as the base, we successfully generated stable droplets with a diameter of approximately 90 μm, achieving a screening throughput of millions per hour. We then performed online biomass detection using OD600 as the absorbance indicator and used a dielectrophoresis chip for droplet sorting.

From the screened positive droplets, we randomly selected 20 for fermentation validation. The final results indicated that we successfully screened four strains capable of high 5-ALA production, with the highest enzyme activity increased to 126.30%.

Discussion

5-ALA is a sophisticated and sustainable biopesticide, especially when compared with the various highly toxic and non-degradable chemical pesticides currently in use. Recent progress in the biological manufacturing of 5-ALA motivated our research. However, issues such as the stability of ALAS gene expression, the obsolescence of targets, and the low efficiency of screening for target strains still persist. This project addresses these three issues by using the CRISPR-associated transposons system, an enzyme-constrained model, and droplet microfluidic high-throughput screening.

Through enzyme modification, we obtained several beneficial mutants (RChemA-ALASG58A, RChemA-ALASG60A, RChemA-ALASG244A, etc.) that enhance the activity of the ALAS enzyme and significantly increase the production of 5-ALA. Compared with traditional plasmid expression, the CRISPR-associated transposons system is an advanced strain engineering technology. It introduces the ALAS gene directly into the genome of the engineered strain, greatly improving the stability of gene expression and reducing the impact of plasmid loss. We hope that our approaches and achievements will inspire further exploration in this research field and provide additional guidance.

With the high-throughput screening method based on droplet microfluidics, we screened target strains more efficiently and accurately, reducing the time cost of laboratory production of the biopesticide 5-ALA, which benefits future research and promotion. As a general and efficient screening method, it also offers other teams a new approach to efficient strain screening. We hope that our experimental results can assist other teams in their experimental design.