Overview

Our goal is cost-effective and sustainable production of 5-ALA. We aim to raise production efficiency by improving the stability and activity of ALAS in E. coli, and to develop a high-throughput screening method for high-yielding 5-ALA strains. Such a method would replace the time-consuming and laborious screening currently in use, further reducing costs and improving economic returns.

To do this, we employ the Design-Build-Test-Learn (DBTL) cycle, an important engineering framework in synthetic biology. Each cycle consists of multiple modules that build and optimize the final system step by step, and the cycle can be repeated until the system is fully optimized.

First, we reviewed the relevant literature and consulted our project leader and mentor to draft a preliminary plan. Then, as shown on our Human Practice page, we interviewed experts and stakeholders to gather feedback on the context of the project and incorporated their recommendations into the project design. We applied DBTL cycles to every aspect of the project. On this page, we explain in detail how we implemented the cycle in our project design.

Cycle 1: Screening of ALAS genes from different species

Design

Microbial synthesis of 5-ALA proceeds mainly through the C4 pathway and the C5 pathway. In the C4 pathway, 5-aminolevulinic acid synthase (ALAS) condenses glycine and succinyl-CoA in a 1:1 ratio to form 5-ALA, with pyridoxal 5'-phosphate (PLP) as the coenzyme. The C5 pathway uses glutamate as the precursor and generates 5-ALA through three enzymes: glutamyl-tRNA synthetase (GltX), NADPH-dependent glutamyl-tRNA reductase (HemA), and glutamate-1-semialdehyde aminotransferase (HemL). The C5 pathway is therefore more complex, whereas the C4 pathway requires only a single enzyme-catalyzed step and its exogenous substrate, glycine, is inexpensive. We therefore consider production of 5-ALA through the C4 pathway the more economical and sustainable option.

Fig. 1 5-ALA synthesis pathway

In the C4 pathway, a key obstacle to 5-ALA synthesis is the low activity of the ALAS enzymes tested so far, so we set out to find the ALAS with the highest activity.

Build

Through literature research, we selected ALAS genes reported to have high activity, from Rhodobacter capsulatus, Agrobacterium, and Rhodopseudomonas palustris. We retrieved the gene sequences from NCBI (National Center for Biotechnology Information, https://www.ncbi.nlm.nih.gov/) by accession number, had them synthesized, and cloned them into the pET24a plasmid vector for expression testing.

Fig. 2 Plasmids constructed with ALAS genes from different sources

Test

The ALAS genes from Agrobacterium, Rhodobacter capsulatus, and Rhodopseudomonas palustris (AFhemA-ALAS, RChemA-ALAS, and RPhemA-ALAS) were introduced into E. coli BL21(DE3), and the resulting strains were named Af, Rc, and Rp. We monitored the growth (OD600) and 5-ALA yield of each recombinant strain over time, then selected the ALAS gene with the highest activity for further experiments.

We also optimized the IPTG dose and induction time to obtain optimal expression of ALAS.

Learn

The results showed that after 24 h, the 5-ALA yield of the Af, Rc, and Rp strains was significantly higher than that of the control group, and the yield of every group stabilized after 24 h. At this point, the average 5-ALA yield of the control group was only 0.05 g/L, while the Af, Rc, and Rp groups reached 1.45, 1.67, and 1.55 g/L, respectively, which was 280%, 324%, and 300% of the control group. E. coli carrying RChemA-ALAS showed the highest 5-ALA yield, so we decided to use the ALAS gene from Rhodobacter capsulatus in the following work.

Fig. 3 Fermentation results of E. coli BL21(DE3) expressing ALAS from different sources

Next, we explored the IPTG dose and the timing of its addition to find the optimal fermentation conditions. We induced ALAS expression with different IPTG concentrations (0.05, 0.1, 0.15, and 0.2 mmol/L) added at different times (2, 4, 6, and 8 h). The 5-ALA yield at 0.05 mmol/L IPTG was significantly higher than at the other concentrations, reaching a maximum of 1.60 g/L. The best result was obtained when IPTG was added after 2 h of fermentation.

In the end, we chose to add 0.05 mmol/L IPTG after 2 h of fermentation as our induction condition.

Fig. 4 Optimization of IPTG dose and addition time

Cycle 2: Modify ALAS and verify mutation sites

Design

To improve the activity of ALAS, we engineered the enzyme itself. Under the guidance of Prof. Yang Gu, we predicted a series of mutation sites through artificial intelligence calculations (Table 1). We cloned the ALAS gene from Rhodobacter capsulatus into the pET24a plasmid to construct the recombinant plasmid pET24a-RCALAS, and introduced it into E. coli for fermentation testing.

Table 1 The 12 predicted mutation sites
RChemA-ALASG58A RChemA-ALASG60A RChemA-ALASS139E RChemA-ALASA218I RChemA-ALASG244A RChemA-ALAST245Q
RChemA-ALASV252W RChemA-ALASI292A RChemA-ALASS378M RChemA-ALASM393L RChemA-ALASK13R RChemA-ALASI203L

Build

Mutants were constructed for the 12 predicted mutation sites. E. coli BL21(DE3) was used as the recipient cell and transformed with the constructed plasmids. After transformation, the cells were transferred to test tubes containing LB medium and incubated in a shaking incubator at 37°C for culture and fermentation experiments.

Test

At 12 h, 24 h, 36 h, and 48 h of fermentation, the OD600 of each test tube was measured with a spectrophotometer to follow growth, and the 5-ALA yield was measured with a microplate reader after sample treatment (see Experiment).

Learn

Fig. 5 shows the OD600 of RC (the non-mutated group) and the 12 mutants at 12 h, 24 h, 36 h, and 48 h. The OD600 of every group increased steadily, with no large differences between groups.

Fig. 6 shows the 5-ALA yield of the 12 mutants. Because the yield tends to stabilize after 24 h, the figure reports the results of 24 h fermentation. The 5-ALA yield of mutants RChemA-ALASG58A, RChemA-ALASG60A, RChemA-ALASS139E, RChemA-ALASG244A, RChemA-ALASI292A, RChemA-ALASS378M, and RChemA-ALASM393L was significantly higher than that of RC. Among them, RChemA-ALASG244A and RChemA-ALASS378M reached 264% and 238% of the RC yield, respectively. These results show that we identified a series of beneficial mutation sites, such as RChemA-ALASG244A and RChemA-ALASS378M, and effectively improved the activity of ALAS through enzyme engineering.

Fig. 5 OD600 of RC and the 12 mutants at different times
Fig. 6 5-ALA yield of the 12 mutants as a percentage of RC

Enzyme engineering is one of the core technologies of metabolic engineering. By changing the structure and activity of enzymes, it can significantly increase the yield of target products, improve production processes, and enable new biotechnology applications, providing important support for the biomanufacturing industry. Through this engineering cycle, we realized that enzyme engineering is not simply a matter of mutation and screening; it also has to account for other factors, such as promoter strength. This experience gave us a deeper understanding of the complexity and challenges of enzyme engineering and of the need for more detailed experimental design and parameter optimization. We believe that more beneficial mutation sites can be identified in the future, further improving the yield of 5-ALA.

Cycle 3: Insert ALAS into genome using a CRISPR-associated transposon system to stabilize expression

Design

After the screening and condition optimization of the first cycle, we expressed the 5-aminolevulinic acid synthase gene from Rhodobacter capsulatus in E. coli to catalyze 5-ALA synthesis through the C4 pathway. During fermentation, however, the 5-ALA yield leveled off or even decreased after 24 h, and we suspected that the plasmid was being lost. We tested this and found that the plasmid loss rate was high, reaching 55% at 36 h (Fig. 7), which indicates that the free plasmid is not genetically stable.

Fig. 7 Plasmid loss verification
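
The loss rate reported above can be illustrated with a short calculation. This is a minimal sketch that assumes the common plate-count approach (colonies counted on non-selective plates versus antibiotic plates); the page does not state how the measurement was made, and the function name and the colony counts below are hypothetical, chosen only to reproduce the 55% figure.

```python
def plasmid_loss_rate(colonies_nonselective: int, colonies_selective: int) -> float:
    """Return the fraction of cells that have lost the plasmid.

    Assumption: the same dilution is plated on non-selective medium
    (all viable cells grow) and on antibiotic medium (only
    plasmid-bearing cells grow).
    """
    if colonies_nonselective <= 0:
        raise ValueError("need at least one colony on the non-selective plate")
    if not 0 <= colonies_selective <= colonies_nonselective:
        raise ValueError("selective count must lie between 0 and the non-selective count")
    return 1.0 - colonies_selective / colonies_nonselective


# Hypothetical counts: 100 colonies without antibiotic, 45 with antibiotic.
rate = plasmid_loss_rate(100, 45)
print(f"plasmid loss rate: {rate:.0%}")
```

Tracking this value at each sampling time gives a loss curve like the one in Fig. 7.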

To improve genetic stability, we decided to use a CRISPR-associated transposon (CAST) system [1] to insert the ALAS gene directly into defined loci in the E. coli genome and then remove the free plasmid. Under tetracycline induction, this system integrates the ALAS gene at random into one or more of eight bypass-pathway targets, which improves the genetic stability of the ALAS gene while directing as much carbon flux as possible towards the 5-ALA synthesis pathway.

Build

We constructed a pQCasTns(Ptr)-array 8 plasmid carrying the Tns transposase genes and the array sequence (the plasmid map is shown in Fig. 8) and a ptDonor plasmid carrying the ribosomal binding site and ALAS, and transformed both plasmids into E. coli BL21(DE3) to achieve site-directed genomic integration. After successful integration was verified experimentally, the two plasmids were cured with the pFree plasmid, and the pFree plasmid was then eliminated by sucrose stress [2].

Fig. 8 Maps of the pQCasTns(Ptr)-array 8 plasmid (left) and the ptDonor plasmid (right)

Test

Multiple rounds of integration experiments were performed under tetracycline induction. Every 24 h of induction, 10 strains were randomly selected, and a total of 40 strains were screened after 48 h. After the plasmids in these 40 integrated strains were cured with the pFree plasmid, successful transposition was verified by fermentation. The fermentation results are shown in Fig. 9.

Learn

pflB, pck, and sdhA encode formate acetyltransferase 1, phosphoenolpyruvate carboxykinase, and the succinate dehydrogenase flavoprotein subunit, respectively. They play important roles in central metabolism, including gluconeogenesis and the TCA cycle, and are essential for normal cell physiology, so these enzymes can be considered indispensable in their respective metabolic pathways. Integrating the ALAS gene into their loci can therefore disrupt or even inactivate these key genes, which made successful integration at these sites difficult.

Among the fermentation results of the 40 transposed strains, the yields of strains 2, 3, 8, and 27 were significantly improved. Strain 2 in particular yielded 74% more than the control group. This indicates that ALAS integrated into the E. coli genome is expressed more stably than ALAS on a free plasmid, and that the integration site also affects the yield of the strain.

Fig. 9 Fermentation results of the 40 transposed strains

Cycle 4: Enzyme-constrained model predicts novel targets

Design

The intracellular metabolism of microorganisms is complex, and it is difficult to understand its regulation systematically, or to obtain a desired phenotype efficiently, with a single research method. A genome-scale metabolic model (GSMM) is a mathematical model that describes the relationships among genes, proteins, and reactions, and it has been widely used to analyze network properties, predict cell phenotypes, guide strain design, and analyze interactions. An enzyme-constrained model, built on a GSMM by integrating large-scale enzyme kinetics and proteomics data, predicts phenotypes more accurately.

In our experiments, to improve the yield of 5-ALA while enzymatically modifying ALAS, we used an enzyme-constrained model to simulate, from a global perspective, how the following strategies affect the target product when combined with different genetic modification methods: 1) eliminate competing pathways; 2) strengthen key genes of the synthesis pathway; 3) eliminate feedback inhibition; 4) optimize cofactors. This approach helps to screen and combine metabolic modification targets and to guide experimental design. It enables rational design of strain modifications, improves the efficiency of metabolic engineering, and accounts for the effect of genetic perturbations on intracellular metabolism, so that metabolic flux can be regulated precisely. We used ec_iECBD_1354, an enzyme-constrained model previously constructed for E. coli BL21(DE3), as the initial model. Because it did not include the reactions for 5-ALA synthesis by the C4 pathway, we introduced that synthesis pathway into the model:

Glycine[c] + Succinyl-CoA[c] => 5-Amino-4-oxopentanoate[c] + Coenzyme A[c] + CO2[c]

In addition, we added the transport and exchange reactions of 5-ALA:

Transport reaction:

5-Amino-4-oxopentanoate[c] => 5-Amino-4-oxopentanoate[e]

Exchange reaction:

5-Amino-4-oxopentanoate[e] =>

Because the transport and exchange reactions of 5-ALA are reversible, and an enzyme catalyzing a reversible reaction has a different kcat value in each direction, we also introduced the reverse reactions of the transport and exchange reactions into the model.
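
Splitting a reversible reaction into two irreversible ones, each with its own kcat, can be sketched as follows. This is an illustration only, not the code used to edit ec_iECBD_1354: the `Reaction` class, the `split_reversible` helper, the reaction ID, and both kcat values are hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reaction:
    rxn_id: str
    stoich: dict   # metabolite -> coefficient (negative = consumed)
    kcat: float    # turnover number in 1/s (placeholder values below)


def split_reversible(rxn_id, stoich, kcat_fwd, kcat_rev):
    """Represent one reversible reaction as two irreversible reactions.

    The reverse reaction uses the negated stoichiometry and its own kcat,
    so each direction can carry a separate enzyme cost in the model.
    """
    forward = Reaction(f"{rxn_id}_fwd", dict(stoich), kcat_fwd)
    reverse = Reaction(f"{rxn_id}_rev", {m: -c for m, c in stoich.items()}, kcat_rev)
    return forward, reverse


# 5-ALA transport from the cytosol [c] to the extracellular space [e].
transport = {
    "5-Amino-4-oxopentanoate[c]": -1,
    "5-Amino-4-oxopentanoate[e]": 1,
}
# kcat values are hypothetical, for illustration only.
fwd, rev = split_reversible("ALA_transport", transport, kcat_fwd=10.0, kcat_rev=5.0)
print(fwd.rxn_id, fwd.stoich)
print(rev.rxn_id, rev.stoich)
```

Keeping both directions as separate irreversible reactions lets every flux stay non-negative, which is the usual reason for this representation in enzyme-constrained models.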

Test

We then used the model to identify targets that can enhance 5-ALA synthesis. Comparing protein requirements between the cell growth stage and the product synthesis stage, we identified 30 proteins with the largest increase in demand, which need to be upregulated, and 10 proteins with reduced demand, which need to be downregulated.

Table 2 Proteins with increased demand
Gene | Function | MW (kDa) | Reaction | Change (%)
ECBD_0309 | Aspartate-semialdehyde dehydrogenase | 40.0176 | prot_pool[c] => prot_A0A140N4Y5[c] | -870.26
ECBD_0142 | Aldehyde dehydrogenase | 56.3358 | prot_pool[c] => prot_A0A140N649[c] | -1474.25
ECBD_0950 | Enolase | 45.6544 | prot_pool[c] => prot_A0A140N6G0[c] | -131.97
ECBD_1865 | Glyceraldehyde-3-phosphate dehydrogenase | 35.532 | prot_pool[c] => prot_A0A140N783[c] | -131.96
ECBD_2786 | Glucose sorbosone dehydrogenase | 41.1042 | prot_pool[c] => prot_A0A140N928[c] | -141.22
ECBD_1284 | Glucokinase | 34.7227 | prot_pool[c] => prot_A0A140N9C3[c] | -90.52
ECBD_2932 | Succinate-CoA ligase | 29.7771 | prot_pool[c] => prot_A0A140N9G2[c] | -4971.47
ECBD_3227 | Cytochrome bo(3) ubiquinol oxidase subunit 1 | 74.3671 | prot_pool[c] => prot_A0A140NA88[c] | -111.89
ECBD_3182 | Adenylate kinase | 23.5858 | prot_pool[c] => prot_A0A140NAW9[c] | -1505.18
ECBD_2798 | Fructose-6-phosphate aldolase | 22.9867 | prot_pool[c] => prot_A0A140NB08[c] | -118.68
ECBD_2933 | Succinate--CoA ligase [ADP-forming] subunit beta | 41.3922 | prot_pool[c] => prot_A0A140NBF4[c] | -4971.47
ECBD_3770 | Short-chain dehydrogenase/reductase SDR | 27.5624 | prot_pool[c] => prot_A0A140NBP9[c] | -565.15
ECBD_3228 | Cytochrome bo(3) ubiquinol oxidase subunit 3 | 22.6224 | prot_pool[c] => prot_A0A140NC92[c] | -111.89
ECBD_3226 | Ubiquinol oxidase subunit 2 | 34.955 | prot_pool[c] => prot_A0A140NCS6[c] | -111.89
ECBD_2667 | Aminotransferase | 43.5729 | prot_pool[c] => prot_A0A140ND68[c] | -477.41
ECBD_4298 | ATP synthase subunit alpha | 55.2216 | prot_pool[c] => prot_A0A140ND72[c] | -108.75
ECBD_4013 | Aspartokinase | 48.5313 | prot_pool[c] => prot_A0A140NEC0[c] | -870.23
ECBD_4105 | Triosephosphate isomerase | 26.9716 | prot_pool[c] => prot_A0A140NEL5[c] | -125.01
ECBD_3229 | Cytochrome bo(3) ubiquinol oxidase subunit 4 | 12.0294 | prot_pool[c] => prot_A0A140NER3[c] | -111.89
ECBD_4299 | ATP synthase gamma chain | 31.5772 | prot_pool[c] => prot_A0A140NF41[c] | -108.75
ECBD_4083 | Bifunctional aspartokinase/homoserine dehydrogenase | 88.9453 | prot_pool[c] => prot_A0A140NF74[c] | -1021.32
ECBD_4295 | ATP synthase subunit c | 8.2561 | prot_pool[c] => prot_A0A140NFR9[c] | -108.75
ECBD_4297 | ATP synthase subunit delta | 19.3321 | prot_pool[c] => prot_A0A140NFT6[c] | -108.75
ECBD_3615 | Homoserine kinase | 33.6094 | prot_pool[c] => prot_A0A140NFW3[c] | -1097.26
ECBD_3614 | Threonine synthase | 47.0833 | prot_pool[c] => prot_A0A140NG09[c] | -1097.27
ECBD_4294 | ATP synthase subunit a | 30.303 | prot_pool[c] => prot_A0A140NGC2[c] | -108.75
ECBD_4300 | ATP synthase subunit beta | 50.3249 | prot_pool[c] => prot_A0A140NHS0[c] | -108.75
ECBD_4296 | ATP synthase subunit b | 17.2638 | prot_pool[c] => prot_A0A140NI98[c] | -108.75
ECBD_4068 | Phosphoenolpyruvate carboxylase | 99.0759 | prot_pool[c] => prot_A0A140SS67[c] | -1052.38
ECBD_4301 | ATP synthase epsilon chain | 15.0681 | prot_pool[c] => prot_A0A140SSC5[c] | -108.75

Table 3 Proteins with decreased demand
Gene | Function | MW (kDa) | Reaction | Change (%)
ECBD_0084 | Orotate phosphoribosyltransferase | 23.5667 | prot_pool[c] => prot_A0A140N4N0[c] | 98.78
ECBD_0115 | Glutaredoxin | 9.1374 | prot_pool[c] => prot_A0A140N4R0[c] | 98.78
ECBD_0907 | Amino-acid acetyltransferase | 49.1948 | prot_pool[c] => prot_A0A140N6K1[c] | 98.78
ECBD_0967 | Phosphoadenosine 5'-phosphosulfate reductase | 27.9905 | prot_pool[c] => prot_A0A140N725[c] | 98.78
ECBD_0569 | Argininosuccinate synthase | 49.921 | prot_pool[c] => prot_A0A140N7C3[c] | 98.78
ECBD_1133 | Serine hydroxymethyltransferase | 45.3161 | prot_pool[c] => prot_A0A140N8X9[c] | 66.21
ECBD_2468 | Adenylosuccinate lyase | 51.5693 | prot_pool[c] => prot_A0A140NCQ6[c] | 98.78
ECBD_4180 | 3-ketoacyl-CoA thiolase | 40.846 | prot_pool[c] => prot_A0A140NDQ6[c] | 98.78
ECBD_3501 | Aconitate hydratase B | 93.4973 | prot_pool[c] => prot_A0A140NFP9[c] | 98.78
ECBD_4265 | Ketol-acid reductoisomerase | 54.0685 | prot_pool[c] => prot_A0A140NI72[c] | 98.78
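
The comparison behind the two tables can be sketched in a few lines. The sign convention here is an assumption inferred from the tables, where proteins with increased demand carry negative changes and proteins with decreased demand carry positive ones: change = (usage in growth stage - usage in production stage) / usage in growth stage x 100. The function names and the usage values are hypothetical.

```python
def demand_change_percent(usage_growth: float, usage_production: float) -> float:
    """Relative change in enzyme usage between the two simulated stages.

    Assumed convention (inferred from the sign of the table values):
    negative = more protein needed during product synthesis,
    positive = less protein needed during product synthesis.
    """
    if usage_growth <= 0:
        raise ValueError("growth-stage usage must be positive")
    return (usage_growth - usage_production) / usage_growth * 100.0


def classify(change_percent: float) -> str:
    """Map the sign of the change to a suggested regulation direction."""
    if change_percent < 0:
        return "upregulate"
    if change_percent > 0:
        return "downregulate"
    return "unchanged"


# Hypothetical usages in arbitrary units, for illustration only.
for name, growth, production in [
    ("protein_A", 1.0, 2.0),      # demand doubles during production
    ("protein_B", 1.0, 0.0122),   # demand almost vanishes
]:
    change = demand_change_percent(growth, production)
    print(name, round(change, 2), classify(change))
```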

To test whether the predicted target genes are really related to 5-ALA synthesis, we randomly selected 10 targets for upregulation and verified them by fermentation.

Table 4 Upregulated targets
adk tsE baE sucC thrB ppc apE sucD asd adE

The ZZ-2 strain (strain 2 of the transposon group, in which the ALAS gene was inserted at the third target, pta) was used as the chassis cell. Genes that needed to be upregulated were overexpressed under the strong promoter pJ23119 with a highly efficient ribosomal binding site (RBS) sequence. The fermentation results are shown in Fig. 10.

Fig. 10 5-ALA yield of ZZ-2 and the target-upregulated engineered strains during fermentation

Among the 10 upregulated targets, upregulation of baE, ppc, apE, and asd increased the 5-ALA yield most clearly. At 36 h, the yields of these four strains were 32%, 36%, 30%, and 39% higher than that of the control, respectively, indicating that our model is reliable and that these targets are all effective.

Learn

In the study of microbial metabolic network regulation, enzyme-constrained models offer a new approach: they allow us to predict metabolic flux and product synthesis more accurately and significantly shorten the time needed to find effective targets. However, we are also aware that modifying certain targets may negatively affect strain growth or 5-ALA yield, which suggests that the current model still needs refinement. Through in-depth literature research, we found that genome-scale metabolic models (GSMMs) can be effectively integrated with enzyme kinetics, proteomics, transcriptomics, thermodynamics, and multi-omics data. We plan to upgrade the existing model iteratively to enhance its target prediction capability.

In addition, there is an urgent need for innovation in the genetic modification of targets. Future experiments should focus on precise regulation of targets, such as applying CRISPR/Cas9 technology for detailed gene editing, or using metabolic regulatory elements to achieve dynamic regulation, rather than relying solely on traditional knockout or overexpression methods. Through this series of improvements, we believe that we can explore more effective targets more efficiently, thereby further increasing the yield of 5-ALA and opening up new possibilities for its synthesis.

Cycle 5: High-throughput screening of high-yielding 5-ALA strains

Design

To obtain higher-performing strains, we applied ARTP (atmospheric and room temperature plasma) mutagenesis to the engineered strains to generate a larger pool of mutants.

Because the efficiency and accuracy of the screening method have a large impact on the strain screening process, we designed a high-throughput screening technology based on droplet microfluidic chips. By exploring a suitable water-in-oil system, we encapsulated the modified strains in droplets.

Build

In this study, we built our own droplet generation chip from glass capillaries and used the T-junction method to test droplets formed with mineral oil + the surfactant Span80 as the outer phase.

Fig. 11 Glass capillary droplet generation chip

Test

When we tested the self-made chip with the mineral oil system, the generated droplets emulsified and had a large diameter, which was not conducive to bacterial growth or screening.

Fig. 12 Droplets generated with mineral oil

Learn

Through literature review, we learned that the type of outer phase and the amount of surfactant in it strongly influence the state of the droplets after formation, and that the droplet diameter is closely related to the channel diameter of the chip and the flow rates of the inner and outer phases. We decided to redesign this part of the project accordingly.

Design

We decided to study the state of droplets formed with different outer phases, and with different types and amounts of surfactant in the same phase, to find the optimal generation system. At the same time, since 5-ALA is a precursor of protoporphyrin, we hoped to exploit the fluorescence that protoporphyrin emits under 408 nm excitation: by sorting droplets on fluorescence signal intensity, we could screen out the strains with high 5-ALA production.

Build

We designed a PDMS chip based on coaxial flow focusing and planned to control the droplet size by reducing the channel diameter and adjusting the flow rates of the inner and outer phases. We also prepared several outer-phase systems for droplet generation: mineral oil + Span80; paraffin oil + EM90; and fluorinated oil + fluorinated surfactant, among others.

Fig. 13 Design drawing and photograph of the PDMS droplet generation chip

Test

We tested the systems above on the PDMS chip and found that the droplets formed with fluorinated oil + 1% 008-FluoroSurfactant were the most stable and of suitable size, which met the screening requirements.

Fig. 14 Droplets generated with fluorinated oil

Test

We intended to use an excitation wavelength of 408 nm to detect the fluorescence signal intensity of protoporphyrin produced by E. coli, so as to sort the strains with a strong fluorescence signal and high 5-ALA yield. However, we lacked fluorescence screening equipment with 408 nm excitation, the fluorescence signal was weak at the alternative excitation wavelengths available, and the proportional relationship between protoporphyrin and the target product 5-ALA could not be verified experimentally, so the fluorescence screening design was difficult to carry out.

Learn

Through further literature review, we found that 5-ALA is a raw material for the synthesis of heme, and heme is necessary for the growth of E. coli, so the yield of 5-ALA is related to the growth state and biomass of E. coli (Fig. 15). Based on this information, we decided to redesign the droplet screening method.

Fig. 15 Validation of the relationship between E. coli biomass (OD value) and 5-ALA yield

Design

We plan to use the fluorinated oil + 1% 008-FluoroSurfactant system, calculate from the Poisson distribution the concentration of bacterial suspension required for single-cell encapsulation in droplets, and use absorbance to measure the E. coli biomass in each droplet, so as to screen out the strains with the best growth.
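
The Poisson calculation mentioned above can be sketched as follows. The number of cells per droplet is modeled as Poisson-distributed with mean λ, so P(k) = e^(-λ) λ^k / k!. The target λ of 0.1 and the 50 µm droplet diameter are assumptions chosen for illustration; the page does not report the values actually used, and the function names are hypothetical.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability that a droplet contains exactly k cells."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def single_cell_fraction(lam: float) -> float:
    """Among droplets that contain cells, the fraction holding exactly one."""
    return poisson_pmf(1, lam) / (1.0 - poisson_pmf(0, lam))


def droplet_volume_ml(diameter_um: float) -> float:
    """Volume of a spherical droplet in mL (1 cm^3 = 1 mL)."""
    radius_cm = diameter_um * 1e-4 / 2.0
    return 4.0 / 3.0 * math.pi * radius_cm**3


def cells_per_ml(lam: float, diameter_um: float) -> float:
    """Cell concentration that gives an average of lam cells per droplet."""
    return lam / droplet_volume_ml(diameter_um)


lam = 0.1           # assumed mean number of cells per droplet
diameter_um = 50.0  # assumed droplet diameter

print(f"empty droplets:        {poisson_pmf(0, lam):.3f}")
print(f"single-cell droplets:  {poisson_pmf(1, lam):.3f}")
print(f"single among occupied: {single_cell_fraction(lam):.3f}")
print(f"required cells/mL:     {cells_per_ml(lam, diameter_um):.2e}")
```

A small λ leaves most droplets empty but keeps nearly all occupied droplets at a single cell, which is the trade-off behind choosing the loading concentration.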

Test

We used the optical density of E. coli at 600 nm to screen the formed droplets by biomass. By comparing the OD of the droplets, we sorted the strains that grew faster under the same conditions, then broke the sorted droplets and spread their contents on plates for subsequent fermentation.

Fig. 16 Droplet sorting diagram

Finally, we decided to use fluorinated oil + 1% 008-FluoroSurfactant as the external phase on the PDMS chip to achieve single-cell encapsulation of strains, and to use absorbance detection equipment and a dielectrophoresis chip to complete high-throughput, biomass-based screening of high-yielding strains.

Reference

1. Strecker, J., et al., RNA-guided DNA insertion with CRISPR-associated transposases. Science, 2019. 365(6448): p. 48-53.

2. Lauritsen, I., et al., A versatile one-step CRISPR-Cas9 based approach to plasmid-curing. Microb Cell Fact, 2017. 16(1):135.

3. Cheng, Z., et al., Progress in metabolic engineering of microorganisms for the utilization of formate. Synth. Biol, 2023. 4(4): p. 756-778.