Our goal is cost-effective and sustainable production of 5-ALA. We aim to raise production efficiency by improving the stability and activity of ALAS in E. coli. We also aim to develop a high-throughput screening method for high-yielding 5-ALA strains, because current screening of such strains is time-consuming and laborious; faster screening would further reduce costs and improve economic returns.
To do this, we employ the Design-Build-Test-Learn (DBTL) loop, an important engineering framework in synthetic biology. Each cycle consists of multiple modules that build and optimize the final system step by step, and can be iterated over and over again to achieve full optimization.
First, we reviewed the relevant literature and consulted our project leader and mentor to draft a preliminary plan. Then, as shown on our Human Practice page, we interviewed experts and stakeholders to gather feedback on the context of the project and incorporated their recommendations into the project design. We applied DBTL loops in all aspects of the project. On this page, we explain in detail how we implemented each cycle.
Microbial synthesis of 5-ALA proceeds mainly through the C4 pathway and the C5 pathway. In the C4 pathway, 5-aminolevulinic acid synthase (ALAS) condenses glycine and succinyl-CoA in a 1:1 ratio to generate 5-ALA, with pyridoxal 5'-phosphate (PLP) as the coenzyme. The C5 pathway uses glutamate as the precursor and generates 5-ALA through three enzymes: glutamyl-tRNA synthetase (GltX), NADPH-dependent glutamyl-tRNA reductase (HemA), and glutamate-1-semialdehyde aminotransferase (HemL). The C5 pathway is therefore more complex, whereas the C4 pathway requires only a single enzyme-catalyzed step and its exogenous substrate, glycine, is inexpensive. For these reasons, we consider production of 5-ALA through the C4 route the more economical and sustainable option.
In the C4 pathway, a key bottleneck in 5-ALA synthesis is the low activity of the ALAS enzymes tested so far, so we set out to find an ALAS with higher activity.
Through literature research, we selected ALAS genes reported to have high activity, from Rhodobacter capsulatus, Agrobacterium, and Rhodopseudomonas palustris. We retrieved the gene sequences from NCBI (National Center for Biotechnology Information, https://www.ncbi.nlm.nih.gov/) by accession number, had them synthesized, and cloned them into the pET24a plasmid vector for expression testing.
The ALAS genes from Agrobacterium, Rhodobacter capsulatus, and Rhodopseudomonas palustris (AFhemA-ALAS, RChemA-ALAS, RPhemA-ALAS) were introduced into E. coli BL21(DE3); the resulting strains were named Af, Rc, and Rp. We monitored the growth (OD600) and 5-ALA yield of the recombinant strains over time, and then selected the ALAS gene with the highest activity for the next experiments.
At the same time, we also optimized the induction time and dose of IPTG to obtain the optimal expression of ALAS.
The results showed that after 24 h, the 5-ALA yield of the Af, Rc, and Rp strains was significantly higher than that of the control group, and the yield in each group stabilized after 24 h. At this point, the average 5-ALA yield of the control group was only 0.05 g/L, while the average yields of the Af, Rc, and Rp groups reached 1.45, 1.67, and 1.55 g/L, respectively, which was 280%, 324% and 300% of the control group. E. coli carrying RChemA-ALAS showed the highest 5-ALA yield, so we decided to use the ALAS gene from Rhodobacter capsulatus for the next study.
Next, we explored the IPTG dose and the timing of its addition to find the optimal fermentation conditions. We induced ALAS expression with different IPTG concentrations (0.05, 0.1, 0.15, 0.2 mmol/L) added at different times (2, 4, 6, 8 h). The 5-ALA yield at 0.05 mmol/L IPTG was significantly higher than at the other concentrations, reaching a maximum of 1.60 g/L. The best result was obtained by adding IPTG after 2 h of fermentation.
We therefore chose to add 0.05 mmol/L IPTG after 2 h of fermentation as our fermentation condition.
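The screening above covers four IPTG doses and four addition times. A minimal sketch of how such conditions could be enumerated and the best one picked is shown here, assuming a full factorial design (every dose combined with every time), which the text does not confirm. The helper names and any yields passed to `best_condition` are hypothetical placeholders, not our measured data.

```python
from itertools import product

# Doses (mmol/L) and addition times (h) taken from the text above.
IPTG_DOSES = [0.05, 0.1, 0.15, 0.2]
ADDITION_TIMES = [2, 4, 6, 8]


def design_grid(doses=IPTG_DOSES, times=ADDITION_TIMES):
    """Return every (dose, time) pair of an assumed full factorial design."""
    return list(product(doses, times))


def best_condition(yields):
    """Return the (dose, time) pair with the highest 5-ALA yield.

    `yields` maps (dose, time) -> yield in g/L (hypothetical input format).
    """
    return max(yields, key=yields.get)
```

With real measurements, `best_condition` would be called on a dictionary holding one yield per tested condition.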
To improve the activity of ALAS, we engineered the enzyme itself. Under the guidance of Prof. Yang Gu, we predicted a series of mutation sites by artificial-intelligence-based computation (Table. 1). We cloned the ALAS gene from Rhodobacter capsulatus into the pET24a plasmid to construct the recombinant plasmid pET24a-RCALAS, and introduced it into E. coli for fermentation testing.
Table. 1 Predicted 12 mutation sites
RChemA-ALASG58A | RChemA-ALASG60A | RChemA-ALASS139E | RChemA-ALASA218I | RChemA-ALASG244A | RChemA-ALAST245Q |
RChemA-ALASV252W | RChemA-ALASI292A | RChemA-ALASS378M | RChemA-ALASM393L | RChemA-ALASK13R | RChemA-ALASI203L |
Mutants were constructed for the 12 predicted mutation sites. The constructed plasmids were transformed into E. coli BL21(DE3) as the recipient strain. After transformation, the cells were transferred to test tubes containing LB medium and incubated in a 37°C shaking incubator for culture and fermentation experiments.
The OD600 (as a measure of growth) of each test tube was measured by spectrophotometer after 12 h, 24 h, 36 h, and 48 h of fermentation, and the 5-ALA yield was measured by microplate reader after sample treatment (see Experiment).
Fig. 5 shows the OD600 of RC (the non-mutated group) and the 12 mutants at 12 h, 24 h, 36 h, and 48 h. The OD600 values of all groups increased steadily, with no major differences between groups.
Fig. 6 shows the 5-ALA yield of the 12 mutants. Because the yield tends to stabilize after 24 h, the figure shows the results of 24 h fermentation. We found that the 5-ALA yield of mutants RChemA-ALASG58A, RChemA-ALASG60A, RChemA-ALASS139E, RChemA-ALASG244A, RChemA-ALASI292A, RChemA-ALASS378M, and RChemA-ALASM393L was significantly higher than that of RC. Among them, the yields of RChemA-ALASG244A and RChemA-ALASS378M were 264% and 238% of the RC yield, respectively. These results show that we identified a series of beneficial mutation sites, such as RChemA-ALASG244A and RChemA-ALASS378M, and effectively improved the activity of ALAS through enzyme engineering.
Enzyme modification is one of the core technologies of metabolic engineering, which can significantly increase the yield of target products, improve the production process or develop new biotechnology applications by changing the structure and activity of enzymes, providing important support for the biomanufacturing industry. Through this engineering cycle, we realized that enzyme modification is not a simple mutation and screening, but needs to consider a variety of factors, including the strength of the promoter, etc. This experience gave us a deeper understanding of the complexity and challenges of enzyme engineering, and the need for more detailed experimental design and parameter optimization. We believe that in the future, more beneficial mutation sites can be screened through enzymatic modification, so as to further improve the yield of 5-ALA.
After the first cycle of screening and condition optimization, we expressed the 5-aminolevulinate synthase gene from Rhodobacter capsulatus in E. coli to catalyze 5-ALA synthesis through the C4 pathway. However, during fermentation the 5-ALA yield leveled off or even decreased after 24 h, and we suspected that the plasmid was being lost. We therefore measured plasmid retention and found that the plasmid loss rate was high, reaching 55% at 36 h (Fig. 7), which indicates that the free plasmid is not genetically stable.
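As an illustration of how a plasmid loss rate like the one above can be estimated, here is a minimal sketch that assumes a plate-count approach: the same dilution is plated with and without antibiotic, and cells that grow only without antibiotic are taken to have lost the plasmid. The plating method and the function name are assumptions for illustration, not a description of the exact assay we ran.

```python
def plasmid_loss_rate(colonies_nonselective, colonies_selective):
    """Estimate the fraction of cells that have lost the plasmid.

    colonies_nonselective: colonies on antibiotic-free plates (all viable cells)
    colonies_selective:    colonies on antibiotic plates (plasmid-bearing cells)
    """
    if colonies_nonselective <= 0:
        raise ValueError("need at least one colony on the non-selective plate")
    if colonies_selective < 0:
        raise ValueError("colony counts cannot be negative")
    retained = colonies_selective / colonies_nonselective
    # Plating noise can push the ratio slightly above 1, so clamp at zero loss.
    return max(0.0, 1.0 - retained)
```

For example, 45 colonies on the selective plate against 100 on the non-selective plate would correspond to a loss rate of about 0.55.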
To improve genetic stability, we decided to use the CRISPR-associated transposon (CAST) system [1] to insert the ALAS gene directly into specific loci in the E. coli genome and then eliminate the free plasmids. Under tetracycline induction, this system can randomly integrate the ALAS gene into one or more of eight targets in competing (bypass) pathways, improving the genetic stability of the ALAS gene while maximizing the carbon flux toward the 5-ALA synthesis pathway.
We constructed a pQCasTns(Ptr)- array 8 plasmid carrying the Tns transposase gene and array sequence (the plasmid map is shown in Fig. 8) and a PtDonor plasmid carrying the ribosomal binding site and ALAS, and transferred the two plasmids into the E. coli BL21 (DE3) strain to achieve site-directed genomic integration. After the successful integration was verified by experiments, the above two plasmids were eliminated by the pFree plasmid, and then the pFree plasmid was eliminated by sucrose stress [2].
Under tetracycline-induced conditions, multiple rounds of integration experiments were performed. Every 24 h of induction, 10 strains were randomly selected, and a total of 40 strains were screened after 48 h. After the plasmid elimination of these 40 integrated strains with pFree plasmid, the successful transposition of the strains was verified by fermentation, and the fermentation results are shown in Fig. 9.
Among the 40 integrated strains, strains 2, 3, 8, and 27 showed significantly improved yield. Strain 2 in particular yielded 74% more than the control group. This indicates that ALAS expressed from a genomic insertion in E. coli is more stable than ALAS expressed from a free plasmid, and that the integration site also affects the yield.
The intracellular metabolic activities of microorganisms are complex, and it is difficult to understand their regulatory mechanisms systematically, or to obtain a desired phenotype efficiently, with a single research method. A genome-scale metabolic model (GSMM) is a mathematical model that describes the relationships between genes, proteins, and reactions, and has been widely used to analyze network properties, predict cell phenotypes, guide strain design, and analyze interactions. An enzyme-constrained model, built on a GSMM by integrating large-scale enzyme kinetics and proteomics data, predicts phenotypes more accurately.
In our experiments, to improve the yield of 5-ALA while engineering ALAS, we used an enzyme-constrained model to simulate, from a global perspective, the effects of the following genetic modification strategies on the target product: 1) eliminating competing pathways; 2) strengthening the expression of key pathway genes; 3) eliminating feedback inhibition; 4) optimizing cofactors. This approach helps to screen and combine metabolic engineering targets and to guide experimental design. It enables rational strain design, improves the efficiency of metabolic engineering, and accounts for the effect of genetic perturbations on intracellular metabolism, so that metabolic flux can be regulated precisely. We used the enzyme-constrained model ec_iECBD_1354 [3], previously constructed for E. coli BL21(DE3), as the initial model. Because the initial model did not include the reactions for 5-ALA synthesis by the C4 pathway, we added this pathway to the model:
Glycine[c] + Succinyl-CoA[c] => 5-Amino-4-oxopentanoate[c] + Coenzyme A[c] + CO2[c]
In addition, the transport and exchange reactions of 5-ALA are described:
Transport reaction:
5-Amino-4-oxopentanoate[c] => 5-Amino-4-oxopentanoate[e]
Exchange reaction:
Because the transport and exchange reactions of 5-ALA are reversible, and an enzyme that catalyzes a reversible reaction has a different kcat value in each direction, the reverse reactions of the 5-ALA transport and exchange reactions were also added to the model.
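The idea of representing one reversible step as two irreversible reactions, each with its own kcat, can be sketched in a few lines. This is a minimal, library-free illustration and not the code used to edit ec_iECBD_1354; the `Reaction` class, the reaction identifier, and the kcat numbers are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class Reaction:
    """One irreversible reaction with its own turnover number."""
    rid: str      # reaction identifier (hypothetical naming)
    stoich: dict  # metabolite -> coefficient, negative = consumed
    kcat: float   # turnover number in 1/s


def split_reversible(rid, stoich, kcat_fwd, kcat_rev):
    """Split one reversible reaction into forward and reverse irreversible ones."""
    forward = Reaction(rid + "_fwd", dict(stoich), kcat_fwd)
    # The reverse reaction swaps substrates and products, so every sign flips.
    reverse = Reaction(rid + "_rev", {m: -c for m, c in stoich.items()}, kcat_rev)
    return forward, reverse


ALA_C = "5-Amino-4-oxopentanoate[c]"
ALA_E = "5-Amino-4-oxopentanoate[e]"

# Placeholder kcat values, NOT measured constants.
transport_fwd, transport_rev = split_reversible(
    "ALA_transport", {ALA_C: -1, ALA_E: 1}, kcat_fwd=10.0, kcat_rev=5.0
)
```

Keeping the two directions as separate objects lets each one carry its own enzyme cost, which is the reason for adding the reverse reactions explicitly.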
We then used the model to identify targets whose modification could enhance 5-ALA synthesis. By comparing protein requirements between the cell growth stage and the product synthesis stage, we identified 30 proteins with the largest increase in demand, which need to be upregulated, and 10 proteins with reduced demand, which need to be downregulated.
Table. 2 Increased demand for protein
Gene | Function | Reaction | Changes(%) |
---|---|---|---|
ECBD_0309 | Aspartate-semialdehyde dehydrogenase | 40.0176 prot_pool[c] => prot_A0A140N4Y5[c] | -870.26 |
ECBD_0142 | Aldehyde dehydrogenase | 56.3358 prot_pool[c] => prot_A0A140N649[c] | -1474.25 |
ECBD_0950 | Enolase | 45.6544 prot_pool[c] => prot_A0A140N6G0[c] | -131.97 |
ECBD_1865 | Glyceraldehyde-3-phosphate dehydrogenase | 35.532 prot_pool[c] => prot_A0A140N783[c] | -131.96 |
ECBD_2786 | Glucose sorbosone dehydrogenase | 41.1042 prot_pool[c] => prot_A0A140N928[c] | -141.22 |
ECBD_1284 | Glucokinase | 34.7227 prot_pool[c] => prot_A0A140N9C3[c] | -90.52 |
ECBD_2932 | Succinate-CoA ligase | 29.7771 prot_pool[c] => prot_A0A140N9G2[c] | -4971.47 |
ECBD_3227 | Cytochrome bo(3) ubiquinol oxidase subunit 1 | 74.3671 prot_pool[c] => prot_A0A140NA88[c] | -111.89 |
ECBD_3182 | Adenylate kinase | 23.5858 prot_pool[c] => prot_A0A140NAW9[c] | -1505.18 |
ECBD_2798 | Fructose-6-phosphate aldolase | 22.9867 prot_pool[c] => prot_A0A140NB08[c] | -118.68 |
ECBD_2933 | Succinate--CoA ligase [ADP-forming] subunit beta | 41.3922 prot_pool[c] => prot_A0A140NBF4[c] | -4971.47 |
ECBD_3770 | Short-chain dehydrogenase/reductase SDR | 27.5624 prot_pool[c] => prot_A0A140NBP9[c] | -565.15 |
ECBD_3228 | Cytochrome bo(3) ubiquinol oxidase subunit 3 | 22.6224 prot_pool[c] => prot_A0A140NC92[c] | -111.89 |
ECBD_3226 | Ubiquinol oxidase subunit 2 | 34.955 prot_pool[c] => prot_A0A140NCS6[c] | -111.89 |
ECBD_2667 | Aminotransferase | 43.5729 prot_pool[c] => prot_A0A140ND68[c] | -477.41 |
ECBD_4298 | ATP synthase subunit alpha | 55.2216 prot_pool[c] => prot_A0A140ND72[c] | -108.75 |
ECBD_4013 | Aspartokinase | 48.5313 prot_pool[c] => prot_A0A140NEC0[c] | -870.23 |
ECBD_4105 | Triosephosphate isomerase | 26.9716 prot_pool[c] => prot_A0A140NEL5[c] | -125.01 |
ECBD_3229 | Cytochrome bo(3) ubiquinol oxidase subunit 4 | 12.0294 prot_pool[c] => prot_A0A140NER3[c] | -111.89 |
ECBD_4299 | ATP synthase gamma chain | 31.5772 prot_pool[c] => prot_A0A140NF41[c] | -108.75 |
ECBD_4083 | Bifunctional aspartokinase/homoserine dehydrogenase | 88.9453 prot_pool[c] => prot_A0A140NF74[c] | -1021.32 |
ECBD_4295 | ATP synthase subunit c | 8.2561 prot_pool[c] => prot_A0A140NFR9[c] | -108.75 |
ECBD_4297 | ATP synthase subunit delta | 19.3321 prot_pool[c] => prot_A0A140NFT6[c] | -108.75 |
ECBD_3615 | Homoserine kinase | 33.6094 prot_pool[c] => prot_A0A140NFW3[c] | -1097.26 |
ECBD_3614 | Threonine synthase | 47.0833 prot_pool[c] => prot_A0A140NG09[c] | -1097.27 |
ECBD_4294 | ATP synthase subunit a | 30.303 prot_pool[c] => prot_A0A140NGC2[c] | -108.75 |
ECBD_4300 | ATP synthase subunit beta | 50.3249 prot_pool[c] => prot_A0A140NHS0[c] | -108.75 |
ECBD_4296 | ATP synthase subunit b | 17.2638 prot_pool[c] => prot_A0A140NI98[c] | -108.75 |
ECBD_4068 | Phosphoenolpyruvate carboxylase | 99.0759 prot_pool[c] => prot_A0A140SS67[c] | -1052.38 |
ECBD_4301 | ATP synthase epsilon chain | 15.0681 prot_pool[c] => prot_A0A140SSC5[c] | -108.75 |
Table. 3 Reduced demand for protein
Gene | Function | Reaction | Changes(%) |
---|---|---|---|
ECBD_0084 | Orotate phosphoribosyltransferase | 23.5667 prot_pool[c] => prot_A0A140N4N0[c] | 98.78 |
ECBD_0115 | Glutaredoxin | 9.1374 prot_pool[c] => prot_A0A140N4R0[c] | 98.78 |
ECBD_0907 | Amino-acid acetyltransferase | 49.1948 prot_pool[c] => prot_A0A140N6K1[c] | 98.78 |
ECBD_0967 | Phosphoadenosine 5'-phosphosulfate reductase | 27.9905 prot_pool[c] => prot_A0A140N725[c] | 98.78 |
ECBD_0569 | Argininosuccinate synthase | 49.921 prot_pool[c] => prot_A0A140N7C3[c] | 98.78 |
ECBD_1133 | Serine hydroxymethyltransferase | 45.3161 prot_pool[c] => prot_A0A140N8X9[c] | 66.21 |
ECBD_2468 | Adenylosuccinate lyase | 51.5693 prot_pool[c] => prot_A0A140NCQ6[c] | 98.78 |
ECBD_4180 | 3-ketoacyl-CoA thiolase | 40.846 prot_pool[c] => prot_A0A140NDQ6[c] | 98.78 |
ECBD_3501 | Aconitate hydratase B | 93.4973 prot_pool[c] => prot_A0A140NFP9[c] | 98.78 |
ECBD_4265 | Ketol-acid reductoisomerase | 54.0685 prot_pool[c] => prot_A0A140NI72[c] | 98.78 |
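The percentage change columns in the tables above compare protein demand between the growth stage and the product synthesis stage. A minimal sketch of how such a percentage could be computed is given here. It assumes the change is the growth-stage demand minus the production-stage demand, divided by the growth-stage demand, so that a negative value means more protein is needed during production. This sign convention and the function names are assumptions inferred from the tables, not the documented output of the model.

```python
def demand_change_percent(usage_growth, usage_production):
    """Relative change in protein demand, in percent (assumed convention).

    Negative: the production stage needs MORE of the protein than growth.
    Positive: the production stage needs LESS of the protein than growth.
    """
    if usage_growth <= 0:
        raise ValueError("growth-stage usage must be positive")
    return (usage_growth - usage_production) / usage_growth * 100.0


def classify(change_percent):
    """Map a percentage change to a suggested regulation direction."""
    if change_percent < 0:
        return "upregulate"
    if change_percent > 0:
        return "downregulate"
    return "unchanged"
```

Under this convention, a protein whose usage doubles in the production stage gets a change of -100% and is flagged for upregulation.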
To test whether the predicted targets are really related to 5-ALA synthesis, we randomly selected 10 targets for upregulation and verified them by fermentation.
Table. 4 Upregulated targets
adk | tsE | baE | sucC | thrB | ppc | apE | sucD | asd | adE |
The ZZ-2 strain (strain 2 of the transposon group, in which the ALAS gene was inserted into the third target, pta) was used as the chassis cell. For each gene to be upregulated, we replaced its native promoter and ribosome binding site (RBS) with the strong promoter pJ23119 and a more efficient RBS. The fermentation results are shown in the figure:
The fermentation results showed that, among the 10 upregulated targets, upregulation of baE, ppc, apE, and asd increased the 5-ALA yield most clearly. At 36 h, the yields of these four strains were 32%, 36%, 30%, and 39% higher than that of the control, respectively, indicating that the model's predictions were reliable and that these targets are effective.
In the study of microbial metabolic network regulation, enzyme-constrained models gave us new ideas, allowing us to predict metabolic flux and product synthesis more accurately and significantly shortening the time needed to find effective targets. However, we are also aware that modifying certain targets may negatively affect strain growth or 5-ALA yield, which suggests that the current model still needs refinement. Through in-depth literature research, we found that genome-scale metabolic models (GSMMs) can be effectively integrated with enzyme kinetics, proteomics, transcriptomics, thermodynamics, and multi-omics data. We plan to iteratively upgrade the existing model to enhance its target prediction capability.
In addition, there is an urgent need for innovation in the genetic modification of targets. Future experiments should focus on precise regulation of targets, such as applying CRISPR/Cas9 technology for detailed gene editing, or using metabolic regulatory elements to achieve dynamic regulation, rather than relying solely on traditional knockout or overexpression methods. Through this series of improvements, we believe that we can explore more effective targets more efficiently, thereby further increasing the yield of 5-ALA and opening up new possibilities for its synthesis.
To obtain higher-performing strains, we performed ARTP (atmospheric and room-temperature plasma) mutagenesis on the engineered strains to generate a library of mutants.
Because the efficiency and accuracy of the screening method strongly affect the strain screening process, we designed a high-throughput screening technology based on droplet microfluidic chips. By exploring a suitable water-in-oil system, we encapsulated the engineered strains in droplets.
In this study, we built a droplet generation chip in-house from glass capillaries and used the T-channel method to test droplet formation with mineral oil plus the surfactant Span80 as the outer phase.
When we tested the self-made chip with the mineral oil system, the generated droplets emulsified and had a large diameter, which was unfavorable for bacterial growth and screening.
Through literature review, we learned that the type of oil phase and the amount of surfactant in it strongly influence the state of the droplets after formation, and that the droplet diameter depends closely on the channel diameter of the chip and on the flow rates of the inner and outer phases. We decided to redesign this part of the project accordingly.
We decided to study the state of droplets formed with different types and amounts of surfactant, both across different oil phases and within the same phase, to find the optimal generation system. At the same time, since 5-ALA is a precursor of protoporphyrin, we hoped to exploit the fluorescence that protoporphyrin emits at an excitation wavelength of 408 nm: by screening droplets for fluorescence signal intensity, we could select strains with high 5-ALA production.
We designed a PDMS chip using the coaxial flow-focusing method, and planned to control the droplet size by reducing the channel diameter of the chip and adjusting the flow rates of the inner and outer phases. We prepared several outer-phase systems for droplet generation: mineral oil + Span80; paraffin oil + EM90; and fluorinated oil + fluorinated surfactant, among others.
We tested the above systems on the PDMS chip and found that the droplets formed with the fluorinated oil + 1% 008-FluoroSurfactant system were the most stable and of suitable size, meeting the screening requirements.
We intended to use an excitation wavelength of 408 nm to detect the fluorescence signal of protoporphyrin produced by E. coli, and thereby sort strains with strong fluorescence and high 5-ALA yield. However, we lacked 408 nm fluorescence screening equipment, the fluorescence signal at alternative excitation wavelengths was weak, and the proportional relationship between protoporphyrin and the target product 5-ALA could not be verified experimentally. The fluorescence-based screening design was therefore difficult to carry out.
Through further literature review, we found that 5-ALA is a precursor for heme synthesis, and that heme is necessary for the growth of E. coli, so the yield of 5-ALA is related to the growth state and biomass of E. coli (Fig. 15). Based on this, we decided to redesign the droplet screening method.
We plan to use the fluorinated oil + 1% 008-FluoroSurfactant system, calculate from the Poisson distribution the cell concentration needed for single-cell encapsulation in droplets, and use absorbance to measure the biomass of E. coli, so as to select strains with better growth.
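The Poisson calculation for single-cell encapsulation can be sketched as follows. If cells are distributed randomly, the number of cells per droplet follows a Poisson distribution with mean λ equal to the cell concentration times the droplet volume, so P(k) = λ^k e^(-λ) / k!. The sketch assumes spherical droplets; the droplet diameter and the target λ in the usage note are illustrative assumptions, not values measured on our chip.

```python
import math


def poisson_pmf(k, lam):
    """Probability that a droplet holds exactly k cells when the mean is lam."""
    return lam ** k * math.exp(-lam) / math.factorial(k)


def droplet_volume_ml(diameter_um):
    """Volume of a spherical droplet in mL (1 cubic micrometre = 1e-12 mL)."""
    radius_um = diameter_um / 2
    return 4 / 3 * math.pi * radius_um ** 3 * 1e-12


def cell_concentration_for(lam, diameter_um):
    """Cell concentration (cells/mL) that gives a mean of lam cells per droplet."""
    return lam / droplet_volume_ml(diameter_um)


def single_cell_fraction_of_occupied(lam):
    """Among non-empty droplets, the fraction holding exactly one cell."""
    return poisson_pmf(1, lam) / (1 - poisson_pmf(0, lam))
```

For instance, calling `cell_concentration_for(0.1, 50)` gives the loading concentration for a hypothetical 50 µm droplet at λ = 0.1. A low λ leaves most droplets empty but keeps nearly all occupied droplets at a single cell, which is the usual trade-off in droplet screening.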
We used the absorbance (optical density) of E. coli at 600 nm to screen the droplets by biomass. By comparing the OD values of the droplets, we sorted the strains that grew faster under the same conditions. The selected droplets were then broken and spread on plates for subsequent fermentation.
Finally, we decided to use fluorinated oil + 1% 008-FluoroSurfactant as the outer-phase system on the PDMS chip to achieve single-cell encapsulation of the strains, and to use absorbance detection together with a dielectrophoresis chip to complete high-throughput, biomass-based screening of high-yielding strains.
1. Strecker, J., et al., RNA-guided DNA insertion with CRISPR-associated transposases. Science, 2019. 365(6448): p. 48-53.
2. Lauritsen, I., et al., A versatile one-step CRISPR-Cas9 based approach to plasmid-curing. Microb Cell Fact, 2017. 16(1): p. 135.
3. Zhang, Z.X., et al., Developing a dynamic equilibrium system in Escherichia coli to improve the production of recombinant proteins. Appl Microbiol Biotechnol, 2022. 106(18): p. 6125-6137.