Results



1. The schematic design of semi-random mutagenesis using cytidine base editor-T7 RNA polymerase (CBE-RNAP)

For our second-generation black box, we first predicted the structures of the fusion proteins rAPOBEC1-n-Mag-N-terminal T7 RNA polymerase (RNAP, 1-179) and p-Mag-C-terminal T7 RNA polymerase (RNAP, 180+). As shown in Figure 1A, we predicted that under blue light the two fusion proteins would bind together to form a complex, which would then bind to the T7 promoter and induce random C-to-T mutations in the region between the T7 promoter and the T7 terminator. When blue light illumination stops, the two fusion proteins would separate again in the dark, causing mutagenesis to cease. We therefore further predicted that, compared to the unmutated sample, the Green Fluorescent Protein (GFP) gene in the second-generation black box plasmid would show two outcomes after random mutation: negative mutations that reduce GFP fluorescence and positive mutations that enhance GFP fluorescence (Figure 1B).


Figure 1. Schematics of the blue-light-inducible base editor with split RNA polymerase and the predicted GFP fluorescence results. A, Structure prediction of the fusion proteins rAPOBEC1-n-Mag-N-terminal T7 RNA polymerase (RNAP, 1-179) and p-Mag-C-terminal T7 RNA polymerase (RNAP, 180+). B, Two predicted outcomes for GFP fluorescence in mutated samples.
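
The targeting logic described above, C-to-T changes confined to the window between the T7 promoter and the T7 terminator, can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not a model of the real editor: the sequence, the window coordinates, and the per-base mutation rate are all hypothetical, and only cytosines on the listed strand are considered (deamination on the opposite strand, which would appear as G-to-A, is ignored).

```python
import random


def mutate_window(seq, start, end, rate, rng):
    """Return a copy of seq in which each C at index start <= i < end
    is changed to T with probability `rate`; all other bases are kept."""
    out = []
    for i, base in enumerate(seq):
        if start <= i < end and base == "C" and rng.random() < rate:
            out.append("T")
        else:
            out.append(base)
    return "".join(out)


# Hypothetical toy sequence: 4-base flank + 18-base target window + 4-base flank.
plasmid = "GGCC" + "ATGCCACCGTCCAAGCTC" + "CCGG"
window_start, window_end = 4, 22   # assumed promoter-to-terminator window
rng = random.Random(0)             # fixed seed so the run is repeatable

mutant = mutate_window(plasmid, window_start, window_end, rate=0.3, rng=rng)
print(plasmid)
print(mutant)
```

Setting `rate=0.0` corresponds to the dark state (no editing), while any positive rate only ever alters cytosines inside the window, leaving the flanking sequence untouched.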

2. Validation of semi-random mutagenesis of the GFP gene by CBE-RNAP

To verify these predictions, we constructed the second-generation black box plasmid (CBE-Mag-RNAP; Figure 2A) and checked its construction by agarose gel electrophoresis. As shown in Figure 2B, the band positions matched those predicted from the restriction enzyme sites in the plasmid, indicating that the second-generation plasmid was constructed successfully. We then transformed the plasmid into Escherichia coli (E. coli) BL21 and set up a control group (water added, which does not activate the second-generation black box) and an experimental group (IPTG added, which activates the second-generation black box), and placed both under blue light. Fluorescence microscopy (Figure 2C) revealed a significant increase in fluorescence intensity in the experimental group compared to the control group, indicating enhanced GFP expression. We further quantified the fluorescence intensity of the two groups (Figure 2D). In the control group, fluorescence intensity was generally concentrated around 30, whereas the experimental group contained points both significantly higher and significantly lower than 30, indicating that the second-generation black box induced both positive and negative mutations in GFP. Finally, we sequenced the plasmids from the control and experimental groups after mutagenesis. The results (Figure 2E) showed that cytosines (C) present in the control sequence had been randomly mutated to thymines (T) in the experimental group, confirming the successful construction of the second-generation black box.


Figure 2. Verification of the second-generation black box. A, Plasmid of the second-generation black box (CBE-Mag-RNAP). B, Agarose gel electrophoresis of the second-generation black box plasmid (CBE-Mag-RNAP). C, Fluorescence microscopy results for the experimental group and the control group. D, GFP fluorescence intensity of the experimental group and the control group. E, Sequencing results for the experimental group and the control group.

3. High-throughput screen for "gain-of-function" mutations in GFP by CBE-RNAP

To compare the mutation frequencies produced by the second-generation black box (CBE-Mag-RNAP) and the first-generation black box (CBE-RNAP), we transformed the two plasmids separately into E. coli BL21 and added constitutively expressed mCherry as an internal control (Figure 3A). We set up a control group (water added, so the black box did not activate mutagenesis), a second-generation black box group (IPTG added and cells placed under blue light to activate mutagenesis), and a first-generation black box group (IPTG added to activate mutagenesis). We then performed fluorescence-activated cell sorting (FACS) on the three groups based on the fluorescence intensities of GFP and mCherry, and sorted the cell populations with increased and decreased fluorescence intensity (Figure 3B). The proportion of cells with increased fluorescence intensity (Figure 3C) was significantly higher in the second-generation black box group (14.78%) than in the first-generation black box group (3.42%) and the control group (11.95%). This indicates that the second-generation black box significantly enhanced the mutation frequency compared to the first-generation black box, and the majority of the mutations were C-to-T substitutions (Figure 3D). We inferred that the mutations caused by the second-generation black box may have substantially altered the aggregation ability of GFP, leading to the changes in fluorescence intensity (Figure 3E). We concluded that the top-ranked GFP mutations, T10I, S29F, and A155V, may accelerate the association and assembly of GFP proteins.


Figure 3. Differences in mutation frequencies caused by the second-generation black box (CBE-Mag-RNAP) and the first-generation black box (CBE-RNAP). A, Plasmids of the second-generation black box and the first-generation black box. B, Fluorescence-activated cell sorting (FACS) based on the fluorescence intensities of GFP and mCherry. C, Proportion of cells with increased and decreased fluorescence intensity in the three groups. D, Statistical analysis of the proportions of different types of random mutations. E, Schematics of the mutated GFP (T10I, S29F, A155V). The gray part shows the wild type and the green part shows the mutant.
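
The kind of tally summarized in Figure 3D, the proportion of each substitution type among all observed mutations, can be sketched as follows. This is an illustrative example only: the two sequences are made up, they are assumed to be already aligned and of equal length, and a real analysis would work from aligned sequencing reads rather than a single pair of strings.

```python
from collections import Counter


def substitution_spectrum(reference, mutant):
    """Return the fraction of each substitution type (e.g. 'C>T')
    among all mismatches between two aligned, equal-length sequences."""
    if len(reference) != len(mutant):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter(
        f"{ref_base}>{mut_base}"
        for ref_base, mut_base in zip(reference, mutant)
        if ref_base != mut_base
    )
    total = sum(counts.values())
    if total == 0:
        return {}
    return {kind: n / total for kind, n in counts.items()}


# Hypothetical toy sequences, not data from this study.
reference = "ATGCCACCGTCCAAGCTC"
mutant = "ATGTCATCGTCTAAGATC"

spectrum = substitution_spectrum(reference, mutant)
for kind, fraction in sorted(spectrum.items()):
    print(f"{kind}: {fraction:.0%}")
```

In this toy pair, three of the four mismatches are C>T, which is the pattern one would expect to dominate if a cytidine deaminase is responsible for most of the changes.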

4. High-throughput screen for "gain-of-function" mutations in metabolic enzymes in E. coli by CBE-RNAP

After confirming that the second-generation black box could enhance the frequency of random mutations, we applied it to improve the pentose phosphate pathway (PPP) and the polyamine synthesis pathway in E. coli. By randomly mutating and screening the key enzymes in these metabolic pathways (Figure 4A), we hypothesized that we could increase the production of the anti-aging compound spermidine and of the antioxidant NADPH in E. coli. We therefore constructed plasmids for spermidine synthesis (spermidine synthase - S-adenosylmethionine decarboxylase - agmatinase) and NADPH synthesis (6-phosphogluconolactonase - transaldolase - transketolase), and co-transformed them with the second-generation black box plasmid into E. coli BL21 (Figure 4B). We set up a control group (water added, which does not activate the black box) and a second-generation black box group (black box activated by IPTG and cells placed under blue light). After the second-generation black box had randomly mutated the spermidine synthesis plasmid and the NADPH synthesis plasmid, we sequenced the plasmids from the control and experimental groups to verify the mutations. Compared to the control group, the experimental group exhibited clear C-to-T substitutions (Figure 4C), indicating that the second-generation black box had mutated both plasmids, with the majority of changes being C-to-T mutations (Figure 4D). We then screened the mutated E. coli and tested them on Caenorhabditis elegans (C. elegans) in LB culture medium. Since spermidine can delay aging and NADPH can scavenge reactive oxygen species (ROS), we recorded the survival times of C. elegans cultured on LB culture media with the mutated E. coli (experimental group) or the unmutated E. coli (control group). The results (Figure 4E) showed that the average lifespan of C. elegans in the experimental group was significantly longer than in the control group, indicating that the second-generation black box enhanced the production of spermidine and NADPH in E. coli, ultimately extending the lifespan of C. elegans.


Figure 4. Application of the second-generation black box to the pentose phosphate pathway (PPP) and polyamine synthesis pathway in E. coli. A, Pentose phosphate pathway (PPP) and polyamine synthesis pathway in E. coli. B, Plasmids for spermidine synthesis (spermidine synthase - S-adenosylmethionine decarboxylase - agmatinase) and NADPH synthesis (6-phosphogluconolactonase - transaldolase - transketolase). C, Sequencing results of plasmids from the control and experimental groups. D, Statistical analysis of the proportions of different types of random mutations. E, Statistical analysis of lifespans of the control and experimental groups.

5. Protein structure prediction for "gain-of-function" mutations in metabolic enzymes in E. coli by CBE-RNAP

To explain how the detected mutations might lead to increased production of NADPH and spermidine, we performed protein structure analysis on each mutation detected by next-generation sequencing. As shown in Figure 5A, the A66T and A26T mutations in spermidine synthase may enhance the efficiency of converting putrescine to spermidine. From a systems biology perspective, the A66T and A26T mutations in spermidine synthase, together with the G12D and T60I mutations in S-adenosylmethionine decarboxylase, may collectively support the entire metabolic system in generating more spermidine. More importantly, the R244C mutation in 6-phosphogluconolactonase, the A31V, A159V, and A314V mutations in transaldolase, and the E85K and H47Y mutations in transketolase represent candidate mutations that could work synergistically to improve the metabolic system's capacity to generate NADPH (Figures 5B-E). Here, we propose a systematic hypothesis linking bio-brick mutations to final functions. Much like playing the piano, where not every key needs to be struck forcefully, the collaboration of these mutations may result in more harmonious and efficient outcomes.


Figure 5. Structure-based analysis of mutations improving NADPH and spermidine synthesis.
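
As a sanity check on the amino acid substitutions named in this section, one can ask whether each is reachable by a single cytidine deamination event: a C-to-T change on the coding strand, or a G-to-A change, which is how a C-to-T edit on the opposite strand appears. The sketch below assumes the standard genetic code and considers every codon for the wild-type residue, because the actual codons used in the plasmids are not given here; the function name `consistent_with_cbe` is illustrative.

```python
# Standard genetic code, built in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# C>T on the coding strand; G>A reflects C>T on the opposite strand.
DEAMINATION = {"C": "T", "G": "A"}


def consistent_with_cbe(substitution):
    """Return True if a substitution such as 'A66T' can arise from one
    C>T or G>A change in at least one codon for the wild-type residue."""
    wild_type, mutant = substitution[0], substitution[-1]
    for codon, amino_acid in CODON_TABLE.items():
        if amino_acid != wild_type:
            continue
        for pos, base in enumerate(codon):
            if base in DEAMINATION:
                edited = codon[:pos] + DEAMINATION[base] + codon[pos + 1:]
                if CODON_TABLE[edited] == mutant:
                    return True
    return False


# Substitutions named in this section.
reported = ["A66T", "A26T", "G12D", "T60I", "R244C",
            "A31V", "A159V", "A314V", "E85K", "H47Y"]
for name in reported:
    print(name, consistent_with_cbe(name))
```

Under these assumptions every substitution listed above passes the check, which is compatible with, though not proof of, a cytidine base editor origin. A substitution that would require a different base change, such as a hypothetical K10E (A-to-G), fails it.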

6. High-throughput screen for "gain-of-function" mutations on degradation of nicotine in E. coli by CBE-RNAP

Considering the toxicity of nicotine and the fact that nicotinate is the substrate for generating nicotine, we constructed a bio-brick containing three enzymes: nicotinate dehydrogenase, 6-hydroxynicotinate reductase, and enamidase (Figure 6A). We applied the second-generation black box to enhance the degradation of nicotinate in E. coli. By randomly mutating and screening the key enzymes in this metabolic pathway (Figure 6B), we hypothesized that we could increase the degradation of nicotinate on an LB plate. Sanger sequencing of the nicotinate dehydrogenase gene confirmed a successful mutation (Figure 6C). However, Sanger sequencing did not reveal low-frequency mutations, so we subsequently performed next-generation sequencing on bacteria with multiple mutations. As shown in Figure 6D, the next-generation sequencing results indicated a significant increase in C-to-T mutations in the nicotinate dehydrogenase gene. To further confirm that these mutations in E. coli can protect C. elegans from high concentrations of nicotinate, we performed a lifespan assay, which showed that the mutant E. coli significantly protected C. elegans (Figure 6E).


Figure 6. Application of the second-generation black box to the nicotinate degradation pathway in E. coli. A, Nicotinate degradation pathway in E. coli. B, Schematic of the workflow. C, Sequencing results of plasmids from the control and experimental groups. D, Statistical analysis of the proportions of different types of random mutations. E, Statistical analysis of lifespans of the control and experimental groups.

7. Protein structure prediction for "gain-of-function" mutations on degradation of nicotine in E. coli by CBE-RNAP

To further explain the top mutations that improve nicotinate degradation, we performed protein structure analysis using AlphaFold3. As shown in Figure 7, the P122S mutation in nicotinate dehydrogenase and the A32V mutation in 6-hydroxynicotinate reductase appear beneficial compared to the non-mutated proteins, suggesting that they are functional mutations that enhance the system's ability to degrade nicotinate.


Figure 7. Structure-based analysis of mutations improving nicotine degradation.