ENGINEERING
Overview

This project integrates metagenomic samples from deep-sea ecosystems to construct a set of deep-sea microbial genomes and a set of biosynthetic gene clusters (BGCs), systematically catalog the genetic resources for secondary metabolites, and search specifically for new terpene secondary metabolites with high-value biological activity, together with their BGCs. Two terpene BGCs were obtained by optimization and direct cloning. Using a high-throughput biosynthesis system, we expressed both BGCs, synthesized two terpene secondary metabolites, and attempted to analyze the structure of the products. Specifically, we established two metabolic pathways in E. coli BL21 and produced new terpene secondary metabolites by constructing recombinant plasmids, transforming them into BL21, and fermenting the resulting strains.

Our team performed metagenomic sequencing on deep-sea ecosystem samples from 800 sampling sites around the world and combined them with deep-sea metagenomic samples from public databases, giving a total of 2,138 deep-sea metagenomic sequencing datasets. After quality control, the metagenomic data were assembled with MEGAHIT and binned with metaWRAP. The genome-wide average nucleotide identity (gANI) between genomes was calculated with fastANI, and the genomes were grouped at a 95% threshold. GTDB-Tk was then used to assign each genome an objective phylogenetic position, yielding a genome set for the deep-sea metagenomic data.

Based on this genome set, antiSMASH was used to identify and analyze the biosynthetic gene clusters and to extract their Pfam domain features. The distribution of terpene BGCs from deep-sea microorganisms was analyzed in combination with the sampling location and the phylogenetic position of each genome. The terpene BGC sequences and the protein sequences of their core genes were extracted, and the core proteins were aligned against the NCBI non-redundant protein database with DIAMOND. Gene clusters whose core-gene protein sequences showed less than 50% similarity were retained. These clusters were then compared with the antiSMASH database and the MIBiG standard database, and clusters with less than 30% similarity were kept as potential new terpene BGCs. Finally, BGCI and BGCII were selected for the wet-lab experiments.
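The two-stage screening described above can be sketched as a simple filter. This is only an illustration: the record fields, the example clusters, and the choice to require that both the antiSMASH-database and MIBiG similarities fall below 30% are assumptions, not details taken from our pipeline.

```python
from dataclasses import dataclass

# Thresholds from the text above; everything else here is hypothetical.
MAX_CORE_SIMILARITY = 50.0  # % similarity of the core protein vs. NCBI nr (DIAMOND)
MAX_BGC_SIMILARITY = 30.0   # % cluster similarity vs. antiSMASH database and MIBiG


@dataclass
class TerpeneBGC:
    name: str
    core_similarity_nr: float    # best-hit % similarity of the core protein
    similarity_antismash: float  # best % similarity in the antiSMASH database
    similarity_mibig: float      # best % similarity in MIBiG


def is_candidate(bgc: TerpeneBGC) -> bool:
    """A cluster is kept only if it passes both screening stages."""
    novel_core = bgc.core_similarity_nr < MAX_CORE_SIMILARITY
    # Assumption: the cluster must be dissimilar to BOTH reference databases.
    novel_cluster = max(bgc.similarity_antismash, bgc.similarity_mibig) < MAX_BGC_SIMILARITY
    return novel_core and novel_cluster


def screen(bgcs):
    """Return the names of clusters that pass the screen, in input order."""
    return [b.name for b in bgcs if is_candidate(b)]


# Made-up values, for illustration only.
clusters = [
    TerpeneBGC("cluster_A", 42.0, 12.0, 8.0),   # passes both stages
    TerpeneBGC("cluster_B", 71.5, 10.0, 5.0),   # core protein too similar to nr
    TerpeneBGC("cluster_C", 38.0, 45.0, 20.0),  # cluster too similar to a known BGC
]
print(screen(clusters))
```

Only `cluster_A` survives in this toy input, because it is the only one below both thresholds.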

Part 1. pETDuet-BGCI-gene123, BBa_K5071016

Design

First, we amplified four target fragments by PCR. To improve the success rate of plasmid construction, we joined the four fragments pairwise by overlap PCR, yielding two fragments. We then constructed the new plasmid by restriction digestion of the vector and ligation with the two fragments.

Figure 1. The plasmid map of pETDuet-BGCI-gene123

Build

First, we amplified the three target genes BGCI-1, BGCI-2, and BGCI-3 (synthesized by a biotech company) by PCR, giving bands of 150 bp, 1200 bp, and 500 bp, respectively, for insertion into the plasmid. We then amplified from plasmid pETD the region spanning the terminator of the first reading frame, the intervening sequence, and the promoter of the second reading frame (named 123-mid), giving a 200 bp band. Figure 2 shows bands of the expected sizes, confirming that all four fragments were obtained. The bands were then excised and gel-purified for use in subsequent experiments.

Figure 2. The target fragments of plasmid pETDuet-BGCI-gene123

Next, we used overlap PCR to join BGCI-1 with BGCI-2 and 123-mid with BGCI-3, giving bands of 1300 bp and 700 bp, respectively. Figure 3A shows bands of the expected sizes, confirming successful joining. We then linearized the plasmid by double digestion with BamHI and XhoI, giving a 4000 bp band. Figure 3B shows a band of the expected size, confirming successful linearization. We gel-purified the products of both steps, ligated them, and transformed the ligation product into E. coli DH5α.
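As a rough plausibility check on the overlap PCR bands, the expected size of a fused product is the sum of the two fragment lengths minus the shared overlap. The sketch below applies this to the Part 1 band sizes. The overlap length is not stated in our notes, so it is left at 0 here, and the 10% tolerance for reading sizes off a gel is an assumed value.

```python
def fused_length(len_a: int, len_b: int, overlap: int = 0) -> int:
    """Expected length of an overlap-PCR product; the shared overlap is counted once."""
    if overlap < 0 or overlap > min(len_a, len_b):
        raise ValueError("overlap must be between 0 and the shorter fragment length")
    return len_a + len_b - overlap


def within_tolerance(expected: int, observed: int, rel_tol: float = 0.10) -> bool:
    """True if the observed band is within rel_tol (assumed 10%) of the expected size."""
    return abs(expected - observed) <= rel_tol * expected


# Band sizes (bp) as read from the gels in Part 1.
bgci_1, bgci_2, bgci_3, mid_123 = 150, 1200, 500, 200

checks = {
    "BGCI-1 + BGCI-2": (fused_length(bgci_1, bgci_2), 1300),
    "123-mid + BGCI-3": (fused_length(mid_123, bgci_3), 700),
}
for name, (expected, observed) in checks.items():
    print(name, expected, observed, within_tolerance(expected, observed))
```

Both Part 1 products fall within the assumed tolerance, which is consistent with the gel in Figure 3A; the same check can be repeated for the other plasmids with their own fragment sizes.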

Figure 3. The results of the overlap connection and plasmid linearization

We selected multiple colonies for PCR verification, and the bands matched the expected length (1200 bp). We sent the validated bacterial strains to a biotech company for sequencing (Figure 4), selected plasmids without mutations, and successfully obtained the constructed plasmid pETDuet-BGCI-gene123.

Figure 4. Single clone verification of pETDuet-BGCI-gene123 transformed E. coli DH5α. A. The results of colony PCR; B: The clones

Part 2. pRSFDuet-BGCI-gene456, BBa_K5071017

Design

The design concept is consistent with the above description.

Figure 5. The plasmid map of pRSFDuet-BGCI-gene456

Build

First, we amplified the three target genes BGCI-4, BGCI-5, and BGCI-6 (synthesized by a biotech company) by PCR, giving bands of 1300 bp, 800 bp, and 800 bp, respectively, for insertion into the plasmid. We then amplified from plasmid pETD the region spanning the terminator of the first reading frame, the intervening sequence, and the promoter of the second reading frame (named 456-mid), giving a 150 bp band. Figure 6 shows bands of the expected sizes, confirming that all four fragments were obtained. The bands were then excised and gel-purified for use in subsequent experiments.

Figure 6. The target fragments of plasmid pRSFDuet-BGCI-gene456

Next, we used overlap PCR to join BGCI-4 with BGCI-5 and 456-mid with BGCI-6, giving bands of 2000 bp and 1000 bp, respectively. Figure 7A shows bands of the expected sizes, confirming successful joining. We then linearized the plasmid by double digestion with EcoRI and XhoI, giving a 3500 bp band. Figure 7B shows a band of the expected size, confirming successful linearization. We gel-purified the products of both steps, ligated them, and transformed the ligation product into E. coli DH5α.

Figure 7. The results of the overlap connection and plasmid linearization

We selected multiple colonies for PCR verification, and the bands matched the expected length (800 bp). We sent the validated bacterial strains to a biotech company for sequencing (Figure 8), selected plasmids without mutations, and successfully obtained the constructed plasmid pRSFDuet-BGCI-gene456.

Figure 8. Single clone verification of pRSFDuet-BGCI-gene456 transformed E. coli DH5α. A. The results of colony PCR; B: The clones on the plate; C: Sequencing results

Part 3. pACYCDuet-BGCII-gene143, BBa_K5071018

Design

The design concept is consistent with the above description.

Figure 9. The plasmid map of pACYCDuet-BGCII-gene143

Build

First, we amplified the three target genes BGCII-1, BGCII-4, and BGCII-3 (synthesized by a biotech company) by PCR, giving bands of 180 bp, 150 bp, and 1200 bp, respectively, for insertion into the plasmid. We then amplified from plasmid pETD the region spanning the terminator of the first reading frame, the intervening sequence, and the promoter of the second reading frame (named pACY), giving a 160 bp band. Figure 10 (red marking) shows bands of the expected sizes, confirming that all four fragments were obtained. The bands were then excised and gel-purified for use in subsequent experiments.

Figure 10. The target fragments of plasmid pACYCDuet-BGCII-gene143

Next, we used overlap PCR to join BGCII-1 with BGCII-4 and pACY with BGCII-3, giving bands of 2000 bp and 1000 bp, respectively. Figure 11A (red marking) shows bands of the expected sizes, confirming successful joining. We then linearized the plasmid by double digestion with BamHI and XhoI, giving a 3766 bp band. Figure 11B (red marking) shows a band of the expected size, confirming successful linearization. We gel-purified the products of both steps, ligated them, and transformed the ligation product into E. coli DH5α.

Figure 11. The results of the overlap connection and plasmid linearization

We selected multiple colonies for PCR verification, and the bands matched the expected length (1800 bp). We sent the validated bacterial strains to a biotech company for sequencing (Figure 12), selected plasmids without mutations, and successfully obtained the constructed plasmid pACYCDuet-BGCII-gene143.

Figure 12. Single clone verification of pACYCDuet-BGCII-gene143 transformed E. coli DH5α. A. The results of colony PCR; B: The clones on the plate; C: Sequencing results

Part 4. pETDuet-BGCII-gene792, BBa_K5071019

Design

The design concept is consistent with the above description.

Figure 13. The plasmid map of pETDuet-BGCII-gene792

Build

First, we amplified the three target genes BGCII-7, BGCII-9, and BGCII-2 (synthesized by a biotech company) by PCR, giving bands of 1000 bp, 750 bp, and 2000 bp, respectively, for insertion into the plasmid. We then amplified from plasmid pETD the region spanning the terminator of the first reading frame, the intervening sequence, and the promoter of the second reading frame (named pETD), giving a 200 bp band. Figure 14 (red marking) shows bands of the expected sizes, confirming that all four fragments were obtained. The bands were then excised and gel-purified for use in subsequent experiments.

Figure 14. The target fragments of plasmid pETDuet-BGCII-gene792

Next, we used overlap PCR to join BGCII-7 with BGCII-9 and pETD with BGCII-2, giving bands of 1700 bp and 2200 bp, respectively. Figure 15A (red marking) shows bands of the expected sizes, confirming successful joining. We then linearized the plasmid by double digestion with BamHI and XhoI, giving a 5178 bp band. Figure 15B (red marking) shows a band of the expected size, confirming successful linearization. We gel-purified the products of both steps, ligated them, and transformed the ligation product into E. coli DH5α.

Figure 15. The results of the overlap connection and plasmid linearization

We selected multiple colonies for PCR verification, and the bands matched the expected length (1800 bp). We sent the validated bacterial strains to a biotech company for sequencing (Figure 16), selected plasmids without mutations, and successfully obtained the constructed plasmid pETDuet-BGCII-gene792.

Figure 16. Single clone verification of pETDuet-BGCII-gene792 transformed E. coli DH5α. A. The results of colony PCR; B: The clones on the plate; C: Sequencing results

Part 5. pRSFDuet-BGCII-gene685, BBa_K5071020

Design

The design concept is consistent with the above description.

Figure 17. The plasmid map of pRSFDuet-BGCII-gene685

Build

First, we amplified the three target genes BGCII-6, BGCII-8, and BGCII-5 (synthesized by a biotech company) by PCR, giving bands of 500 bp, 350 bp, and 1500 bp, respectively, for insertion into the plasmid. We then amplified from plasmid pETD the region spanning the terminator of the first reading frame, the intervening sequence, and the promoter of the second reading frame (named pRSF), giving a 200 bp band. Figure 18 (red marking) shows bands of the expected sizes, confirming that all four fragments were obtained. The bands were then excised and gel-purified for use in subsequent experiments.

Figure 18. The target fragments of plasmid pRSFDuet-BGCII-gene685

Next, we used overlap PCR to join BGCII-6 with BGCII-8 and pRSF with BGCII-5, giving bands of 850 bp and 1700 bp, respectively. Figure 19A (red marking) shows bands of the expected sizes, confirming successful joining. We then linearized the plasmid by double digestion with BamHI and XhoI, giving a 3587 bp band. Figure 19B (red marking) shows a band of the expected size, confirming successful linearization. We gel-purified the products of both steps, ligated them, and transformed the ligation product into E. coli DH5α.

Figure 19. The results of the overlap connection and plasmid linearization

We selected multiple colonies for PCR verification, and the bands matched the expected length (1800 bp). We sent the validated bacterial strains to a biotech company for sequencing (Figure 20), selected plasmids without mutations, and successfully obtained the constructed plasmid pRSFDuet-BGCII-gene685.

Figure 20. Single clone verification of pRSFDuet-BGCII-gene685 transformed E. coli DH5α. A. The results of colony PCR; B: The clones on the plate; C: Sequencing results

Test:

1: Transformation of E. coli BL21

1.1 Strain-BGCI

The six genes of BGCI constitute metabolic pathway 1; they are carried on plasmids pETDuet-BGCI-gene123 and pRSFDuet-BGCI-gene456. We co-transformed these two plasmids into E. coli BL21 for the production of terpenoid compounds. Colony PCR on single colonies of the transformed E. coli BL21 confirmed the presence of both plasmids (Figure 21). The strain that correctly harbored both plasmids was named BGCI.

Figure 21. Colony PCR results of strain BGCI

1.2 Strain-BGCII

The nine genes of BGCII constitute metabolic pathway 2; they are carried on plasmids pACYCDuet-BGCII-gene143, pETDuet-BGCII-gene792, and pRSFDuet-BGCII-gene685. We co-transformed these three plasmids into E. coli BL21 for the production of terpenoid compounds. Colony PCR on single colonies of the transformed E. coli BL21 confirmed the presence of all three plasmids (Figure 22). The strain that correctly harbored all three plasmids was named BGCII.

Figure 22. Colony PCR results of strain BGCII

2: Protein expression

2.1 Strain-BGCI

To assess gene expression in the strains, we lysed the cells and analyzed the lysates by protein gel electrophoresis. Protein samples were mixed with loading buffer and loaded into the gel wells, the gel was run at constant voltage to separate the proteins by size, and it was then stained to visualize the bands. We ran gels at different time points after IPTG induction, as shown in Figure 23, to detect the proteins expressed from our target genes (BGCI-2, 43.3 kDa; BGCI-1, 3.1 kDa; BGCI-3, 18.4 kDa; BGCI-5, 26.1 kDa; BGCI-4, 48.4 kDa; BGCI-6, 29.2 kDa).

Figure 23. Protein gel results of strain BGCI

2.2 Strain-BGCII

Strain BGCII was treated in the same way as strain BGCI. The results, shown in Figure 24, reveal the proteins expressed from our target genes (BGCII-1, 4.4 kDa; BGCII-4, 3.9 kDa; BGCII-3, 42.9 kDa; BGCII-7, 35.8 kDa; BGCII-9, 16.5 kDa; BGCII-2, 72.9 kDa; BGCII-6, 17.4 kDa; BGCII-8, 11.2 kDa; BGCII-5, 52.8 kDa).

Figure 24. Protein gel results of strain BGCII

3: The test results for Total Antioxidant Capacity (T-AOC)

Various antioxidants and antioxidant enzymes in the fermentation broth contribute to its total antioxidant level. We measured it with a Total Antioxidant Capacity assay kit (colorimetric method). The principle is that DPPH is a stable free radical with maximum absorption at 515 nm, and adding antioxidants to a DPPH solution decolorizes it. The change in absorbance can therefore be used to quantify antioxidant capacity, with Trolox as the reference. After 48 hours of fermentation, the broth was disrupted by ultrasonication (200 W, 3 s on, 10 s off, 30 cycles) and centrifuged at 10,000 rpm for 10 minutes at 4℃ before detection. As shown in Figure 25 and Table 1, the DPPH scavenging rate rose markedly in our genetically modified strains, from 4.58% in the control to 40.80% in BGCI and 49.45% in BGCII. This supports the success of our modification.

Table 1: DPPH scavenging rates of the genetically modified strains

Strain     Absorbance    SD        DPPH free radical scavenging rate (%)
Control    0.146         0.0191    4.58
BGCI       0.089         0.0039    40.80
BGCII      0.076         0.0088    49.45

Figure 25. DPPH scavenging rates of the genetically modified strains
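For readers who want to see how a scavenging rate is derived from absorbance readings, here is a minimal sketch using the commonly used definition (A_blank - A_sample) / A_blank × 100. The exact formula of the assay kit and the absorbance of the DPPH blank are not given in our notes, so the blank value below is hypothetical and the printed numbers are not expected to reproduce Table 1 exactly.

```python
def dpph_scavenging_percent(a_blank: float, a_sample: float) -> float:
    """Percent of DPPH radical scavenged, relative to a DPPH-only blank."""
    if a_blank <= 0:
        raise ValueError("blank absorbance must be positive")
    return (a_blank - a_sample) / a_blank * 100.0


# Hypothetical blank absorbance at 515 nm (not reported in the source).
A_BLANK = 0.150

# Sample absorbances from Table 1.
samples = {"Control": 0.146, "BGCI": 0.089, "BGCII": 0.076}

for strain, absorbance in samples.items():
    rate = dpph_scavenging_percent(A_BLANK, absorbance)
    print(f"{strain}: {rate:.1f}% scavenged")
```

A lower sample absorbance means more DPPH was decolorized, so the scavenging rate increases as the absorbance falls, which matches the ordering of the three strains in Table 1.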

4: Antibacterial test of the fermentation product

We tested the antibacterial activity of the fermentation broth with the double-layer agar plate method. The bottom layer was LB solid medium with 1.5% agar. After it had cooled, the top layer of LB medium with 0.8% agar was brought to a suitable temperature, mixed with the cultured E. coli K-12 strain, and poured into the Petri dishes. As shown in Figure 26, 4 μL of the respective liquid was pipetted onto each position. Each column holds three replicates of the same experimental group: 1. positive control, ciprofloxacin at 1 g/L; 2. positive control, ciprofloxacin at 0.5 g/L; 3. lysate supernatant after cell disruption, concentrated 5-fold; 4. original lysate supernatant after cell disruption; 5. squalene at 200 mg/L. The results indicate that the 5-fold concentrated fermentation broth of strain BGCI has some antibacterial effect, but we could not identify the substance responsible.

Figure 26. Results of the antibacterial experiment on the bacterial strains

5: Determination of squalene in the fermentation broth by HPLC

To determine whether our target terpenoid compound is squalene, we tested the fermentation broth of the strains. Fermentation was carried out in a biphasic system, with 10% (v/v) n-heptane layered on top of the LBG medium. After fermentation, 1 mL of the 24-hour whole-cell catalytic liquid was centrifuged at 13,000×g for 10 minutes and the supernatant was discarded. The cells were washed with 400 μL of saline solution and centrifuged again at 13,000×g for 10 minutes, and the supernatant was discarded. ddH2O was then added, mixed thoroughly, and brought to a volume of 400 μL. The cells were disrupted by ultrasonication at 20% working power for 2 minutes (3 seconds on, 5 seconds off). Next, 600 μL of ethyl acetate was added and mixed well, and the sample was sonicated twice in an ultrasonic cleaner for 15 minutes each. After centrifugation, 400 μL of the extract phase was collected, dried in a vacuum centrifuge, re-dissolved in 200 μL of methanol, and filtered through a 0.22 μm membrane for analysis.

Squalene was measured by high-performance liquid chromatography (HPLC) under the following conditions: column, Waters XBridge™ C18 (3.5 μm, 4.6 mm × 150 mm); column temperature, 35 ℃; mobile phase, 100% acetonitrile; flow rate, 1 mL/min; detector, photodiode array detector at 196 nm.

Our target genes came from a terpenoid biosynthetic gene cluster in a deep-sea metagenome, and the genetic information alone did not tell us which terpenoid the pathway produces, so we presumed the product to be squalene. However, when we compared the retention times of the peaks in the fermentation broth with that of the squalene standard (Figure 27), no squalene was detected in our strains by HPLC.
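The retention-time comparison used above can be sketched as follows. The retention times and the 0.1 minute matching window are hypothetical values chosen for illustration; they are not measurements from Figure 27.

```python
def peaks_matching_standard(sample_peaks_min, standard_rt_min, tol_min=0.1):
    """Return the sample peaks (retention time, minutes) that lie within
    tol_min of the standard's retention time."""
    return [rt for rt in sample_peaks_min if abs(rt - standard_rt_min) <= tol_min]


# Hypothetical retention times in minutes.
squalene_standard = 12.40
broth_peaks = [3.15, 7.82, 10.05]

matches = peaks_matching_standard(broth_peaks, squalene_standard)
if matches:
    print("Peak(s) consistent with squalene at:", matches)
else:
    print("No peak at the squalene retention time")
```

An empty result only shows that nothing eluted at the standard's retention time above the detection limit of the method; it does not rule out squalene at lower concentrations.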

Figure 27. Detection results of squalene in the fermentation broth of the bacterial strains

6: Determination of squalene in the fermentation broth by GC-MS

For squalene extraction, 50 mg of freeze-dried bacterial cells was placed in a grinding tube with 2 grinding beads and 500 µL of methanol and ground in a grinder for 4 minutes. Then 1 mL of chloroform was added to each tube, and the samples were extracted in a constant-temperature shaker at 30°C and 200 rpm for 12 hours. After centrifugation at 12,000 rpm for 10 minutes, the supernatant was collected, dried in a nitrogen evaporator, re-dissolved in 1 mL of n-hexane, vortexed for 5 minutes, and centrifuged again at 12,000 rpm for 10 minutes. The supernatant was filtered through a 0.22 µm organic membrane and transferred to brown gas chromatography vials.

Squalene was measured by gas chromatography under the following conditions: column, Rtx-5 capillary column (30 m × 0.32 mm × 0.25 µm); injector temperature, 300°C; detector temperature, 330°C; carrier gas, nitrogen at 2 mL/min; injection volume, 1 µL with a split ratio of 10:1; detector, flame ionization detector (FID); oven program, initial temperature 200°C held for 1 minute, then raised at 20°C/min to 280°C and held for 5 minutes.

As shown in Figure 28 and Table 2, the squalene yield of the bacterial strain measured by GC-MS was 6.60 mg/L. Because GC-MS offers higher detection accuracy than HPLC, it is better suited to testing substances at low concentrations.

Table 2: Detection results of squalene in the fermentation broth of the bacterial strains

Figure 28. Detection results of squalene in the fermentation broth of the bacterial strains
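The length of one GC run follows directly from the oven program described above: the initial hold, plus the ramp time, plus the final hold. A small sketch of that arithmetic is given below; the function name and the tuple layout for ramp steps are our own illustrative choices.

```python
def oven_program_minutes(initial_temp_c, initial_hold_min, ramps):
    """Total run time of a GC oven program.

    ramps is a list of (rate_c_per_min, target_temp_c, hold_min) steps.
    """
    total = float(initial_hold_min)
    current = initial_temp_c
    for rate, target, hold in ramps:
        if rate <= 0:
            raise ValueError("ramp rate must be positive")
        total += abs(target - current) / rate + hold  # ramp time plus hold
        current = target
    return total


# Program from the text: 200°C for 1 min, then 20°C/min to 280°C, hold 5 min.
run_time = oven_program_minutes(200, 1, [(20, 280, 5)])
print(run_time)  # 1 + (280 - 200) / 20 + 5 = 10.0
```

With this program, each injection takes 10 minutes of oven time, so the squalene peak must elute within that window to be detected.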

Learn

For the production of terpenoid compounds, we should consider knocking out some competing metabolic pathways in the host strain to redirect more carbon flux toward the desired compounds and reduce the formation of by-products. The metagenome should also be explored in greater depth to increase the chance of obtaining effective production components, which can better guide subsequent production processes.