We follow the design-build-test-learn (DBTL) cycle to progressively advance our project. In summary, the project can be divided into four cycles: (1) expression and purification of nsp5; (2) expression, purification, and enzymatic activity characterization of nsp5 with native N- and C-termini; (3) development of an in vivo inhibitor screening platform based on FlipGFP; and (4) rational design of nsp5 to enhance its enzymatic activity.
Because our ultimate goal is in vivo screening of nsp5 inhibitors, our initial goal was to express functionally active nsp5 protein in an E. coli host by optimizing the expression conditions and using an appropriate purification strategy. To this end, we first designed an expression vector for the production of SARS-CoV-2 nsp5. To promote soluble expression of the protein, we selected the pGEX-6P-1 vector, which includes a GST tag and an HRV 3C protease cleavage site. For purification, we fused a 6×His tag to the C-terminus of the nsp5 sequence while retaining the HRV 3C protease cleavage site for subsequent removal of the tag (Figure 1).
We first amplified the vector backbone and the nsp5-6×His fragment separately by PCR (Figure 2B). We then assembled pGEX-GST-nsp5-His by homologous recombination. Sequencing confirmed that the vector was constructed correctly (Figure 2C).
We expressed the protein in E. coli BL21 and purified it using Ni-NTA affinity chromatography. First, E. coli BL21 cells transformed with the pGEX-GST-nsp5-His plasmid were cultured at 37 °C until the OD reached ~0.6. Protein expression was then induced by adding IPTG to a final concentration of 0.2 mM. After 16 hours of induction, the cells were lysed, and the supernatant was collected by high-speed centrifugation (13,000 × g). The supernatant was passed through a Ni-NTA affinity column to allow the target protein to bind to the Ni-NTA beads. Next, HRV 3C protease was added to remove the excess amino acids from the N- and C-termini of the target protein, releasing it from the Ni-NTA beads. Finally, the purified SARS-CoV-2 nsp5 was eluted and concentrated for further analysis. SDS-PAGE analysis showed that the purified and cleaved protein had high purity. However, the band appeared at around 70 kDa (Figure 3), which differs from the expected molecular weight of nsp5 (33.8 kDa).
During this cycle, we found that the molecular weight of the purified protein differed from the expected value. This suggested that the protein obtained with this expression and purification method might not be in its native state, with possible misfolding or impaired activity.
After carefully reviewing the clone, we found that the theoretical molecular weight of the GST tag plus nsp5 was approximately 60 kDa, which was close to the observed band position. This indicated that the larger-than-expected molecular weight was likely due to incomplete removal of the GST tag. Since there were no spacer amino acids between the HRV 3C protease cleavage site and nsp5, we hypothesized that steric hindrance was preventing HRV 3C protease from accessing the site and cleaving off the GST tag. In contrast, the C-terminal His tag did not have this issue, and our protein structure prediction (Figure 4) was consistent with this hypothesis (see Model for details). Therefore, with this vector design, we obtained nsp5 with a native C-terminus but a residual GST tag at the N-terminus.
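The molecular weight reasoning above can be sketched as a small calculation. This is only an illustration: the nsp5 mass (33.8 kDa) and the observed band (~70 kDa) come from the text, while the GST mass of about 26 kDa is an assumption based on the typical size of the tag, and the function names are hypothetical.

```python
# Approximate molecular weights in kDa.
NSP5_KDA = 33.8  # expected nsp5 mass quoted in the text
GST_KDA = 26.0   # assumption: typical mass of a GST tag, not stated in the text


def fusion_mass_kda(*parts_kda):
    """Sum the masses of the fused parts (a rough estimate)."""
    return sum(parts_kda)


def closest_species(observed_kda, candidates):
    """Return the candidate whose predicted mass is nearest the observed band."""
    return min(candidates, key=lambda name: abs(candidates[name] - observed_kda))


candidates = {
    "nsp5 (tag removed)": fusion_mass_kda(NSP5_KDA),
    "GST-nsp5 (tag retained)": fusion_mass_kda(GST_KDA, NSP5_KDA),
}

for name, mass in candidates.items():
    print(f"{name}: {mass:.1f} kDa")
print("Best match for a ~70 kDa band:", closest_species(70.0, candidates))
```

Under these assumptions the tagged species is the nearer match to the observed band, in line with the interpretation that the GST tag was not removed.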
To address this issue, we proposed two solutions. One approach was to introduce a flexible GGGS linker upstream of the nsp5 sequence to expose the cleavage site. However, this would still leave extra amino acids at the N-terminus of nsp5. The other approach was inspired by the SARS-CoV-2 life cycle [1], in which nsp5 must cleave itself at its N-terminus to be released from the polyprotein during normal viral processing. We therefore considered introducing the nsp5 self-cleavage sequence at its N-terminus (referred to as nsp5_native), so that nsp5 removes the N-terminal GST tag by self-cleavage after expression, yielding SARS-CoV-2 nsp5 with both native N- and C-termini.
Based on previous experience, we introduced four amino acids (AVLQ) at the N-terminus of nsp5 to facilitate its self-cleavage from the GST tag. The rest of the sequence remained consistent with the pGEX-GST-nsp5-His construct (Figure 5). This new construct is designated pGEX-GST-nsp5_native-His.
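The intended effect of the AVLQ insertion can be sketched as a simple string operation on the fusion protein. This is an idealized illustration that assumes cleavage occurs after the Q of AVLQ when it is followed by S. The tag stub is a hypothetical placeholder rather than the real GST sequence, the five-residue nsp5 start is an assumption for the example, and the HRV 3C site is left out.

```python
CLEAVAGE_MOTIF = "AVLQ"    # residues introduced upstream of nsp5 in the text
NSP5_N_TERMINUS = "SGFRK"  # assumed first residues of mature nsp5 (illustrative)


def self_cleave(fusion):
    """Split a fusion protein after the Q of the first AVLQ motif followed by S.

    Returns (released_tag, mature_protein). If no site is found, the tag is
    empty and the fusion is returned unchanged.
    """
    site = fusion.find(CLEAVAGE_MOTIF + "S")
    if site == -1:
        return "", fusion
    cut = site + len(CLEAVAGE_MOTIF)
    return fusion[:cut], fusion[cut:]


tag_stub = "MSPILG"  # hypothetical placeholder standing in for the GST tag
fusion = tag_stub + CLEAVAGE_MOTIF + NSP5_N_TERMINUS
tag, mature = self_cleave(fusion)
print("released:", tag)
print("mature:  ", mature)
```

In this sketch the mature product starts directly at the first nsp5 residue, which is the property that distinguishes nsp5_native from the linker-based alternative.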
Following the same vector construction method as in round 1, we successfully constructed pGEX-GST-nsp5_native-His and confirmed its accuracy through sequencing analysis (Figure 6B).
Following the same purification steps as in round 1, we successfully purified nsp5_native. SDS-PAGE indicated that nsp5_native had high purity and a molecular weight consistent with expectations. We then measured the efficiency of nsp5_native in cleaving a fluorescent substrate using a FRET assay and determined a kcat/Km of 27,691 s⁻¹M⁻¹ (Figure 7) based on the Michaelis-Menten equation (see Model for details). From the SDS-PAGE and FRET results, we concluded that nsp5_native possesses the native N- and C-termini of SARS-CoV-2 nsp5.
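To illustrate what kcat/Km means in the Michaelis-Menten framework, the sketch below compares the full rate equation with its low-substrate approximation, v ≈ (kcat/Km)[E][S]. Only the kcat/Km value is taken from the text. The Km, enzyme concentration, and substrate concentrations are hypothetical values chosen for illustration, and the fitting procedure actually used is described in Model, not here.

```python
def mm_rate(kcat, km, enzyme, substrate):
    """Michaelis-Menten initial rate v = kcat*[E]*[S] / (Km + [S]), in M/s."""
    return kcat * enzyme * substrate / (km + substrate)


def low_substrate_rate(kcat_over_km, enzyme, substrate):
    """Approximation valid when [S] << Km: v ~ (kcat/Km)*[E]*[S]."""
    return kcat_over_km * enzyme * substrate


KCAT_OVER_KM = 27691.0    # s^-1 M^-1, value reported in the text
KM = 20e-6                # M, hypothetical value for illustration only
KCAT = KCAT_OVER_KM * KM  # s^-1, follows from the hypothetical Km
ENZYME = 100e-9           # M, hypothetical enzyme concentration

for substrate in (0.2e-6, 2e-6, 20e-6):  # M, hypothetical substrate range
    exact = mm_rate(KCAT, KM, ENZYME, substrate)
    approx = low_substrate_rate(KCAT_OVER_KM, ENZYME, substrate)
    print(f"[S]={substrate:.1e} M  exact={exact:.3e}  "
          f"approx={approx:.3e}  exact/approx={exact / approx:.3f}")
```

The ratio of the two rates equals Km / (Km + [S]), so the approximation holds well only when the substrate concentration is far below Km.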
After two rounds of the DBTL cycle, we confirmed that SARS-CoV-2 nsp5 with native N- and C-termini and correct function can be produced in E. coli BL21 using the existing expression strategy. This lays a solid foundation for the subsequent construction of the in vivo inhibitor screening platform. We also noticed that nsp5 exhibits very high activity when cleaving substrates, which prompted us to consider other potential applications. The purification steps show that removal of the recombinant tag is a key part of the entire process. The enzymatic activity of nsp5 is nearly 30 times higher than that of TEV protease and HRV 3C protease, which are commonly used for tag removal, suggesting significant potential for nsp5 in this application. Introducing mutations that enhance the enzymatic activity of nsp5 could therefore make it a more effective tool enzyme.
The in vivo screening platform has advantages such as avoiding the cumbersome process of protein purification, being closer to physiological conditions, and reflecting drug toxicity. In this round of the project, we primarily aimed to design an in vivo nsp5 inhibitor screening platform. To achieve this, we selected FlipGFP[2], a fluorescent protein that responds to proteases, as our reporter gene. Under ideal conditions, FlipGFP does not emit fluorescence when expressed alone, but it fluoresces upon recognition and cleavage by a protease (see Description for details). Therefore, the intensity of fluorescence can indicate the activity of the protease.
To ensure the correct expression of FlipGFP, we used the pRSF-Duet1 vector and inserted the first nine β-strands of FlipGFP and the engineered β-strands 10-11 into two separate ORFs. This vector is referred to as pRSF-FlipGFP_nsp5(10-11)-FlipGFP(1-9). To facilitate inhibitor screening, we designed a genetic circuit with the following behavior (Figure 8): when only FlipGFP is present, the system does not emit fluorescence; when both nsp5 and FlipGFP are present, nsp5 cleaves FlipGFP, resulting in fluorescence; and when nsp5 is inhibited, its cleavage efficiency on FlipGFP decreases, leading to reduced or absent fluorescence.
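The three cases of the circuit can be written as a small truth table. This is an idealized on/off sketch with a hypothetical function name. It treats an inhibitor as fully blocking nsp5, whereas the real readout is a graded fluorescence intensity.

```python
def reporter_fluoresces(flipgfp_present, nsp5_present, inhibitor_present):
    """Idealized on/off logic of the FlipGFP reporter circuit.

    Fluorescence requires FlipGFP plus active nsp5. An effective inhibitor
    is treated here as fully blocking nsp5, which is a simplification.
    """
    nsp5_active = nsp5_present and not inhibitor_present
    return flipgfp_present and nsp5_active


cases = [
    ("FlipGFP only", True, False, False),
    ("FlipGFP + nsp5", True, True, False),
    ("FlipGFP + nsp5 + inhibitor", True, True, True),
]
for label, flipgfp, nsp5, inhibitor in cases:
    state = "fluorescent" if reporter_fluoresces(flipgfp, nsp5, inhibitor) else "dark"
    print(f"{label}: {state}")
```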
Following the same vector construction method as in round 1, we successfully constructed pRSF-FlipGFP_nsp5(10-11)-FlipGFP(1-9) and confirmed its accuracy through sequencing (Figure 9B).
We separately transformed E. coli BL21 with FlipGFP alone and co-transformed BL21 with FlipGFP and nsp5, then plated them on LB plates containing IPTG at a final concentration of approximately 0.2 mM. The results showed that colonies transformed with FlipGFP alone produced almost no fluorescence, while colonies co-transformed with FlipGFP and nsp5 emitted significant fluorescence. This demonstrates that we successfully constructed a FlipGFP system that can be activated by nsp5 to emit fluorescence (Figure 10).
We successfully constructed a FlipGFP system that can be activated by nsp5, confirming the great potential of our platform for inhibitor screening. However, for practical applications in the future, there are several points for improvement:
FlipGFP can respond to other proteases with only minor modifications to the recognition sequence, so our platform is readily extensible and can be adapted to screen inhibitors of other proteases with minimal changes. Considering the importance of proteases in the life cycle of positive-strand RNA viruses, we believe this platform could be quickly repurposed to screen new inhibitors in the event of future outbreaks of other positive-strand RNA viruses. Compared to inhibitor screening methods that require protein purification, our platform is more flexible and efficient.
TEV protease[3] and HRV 3C protease[4] are commonly used tool enzymes for the removal of recombinant tags. Compared to them, nsp5 has higher enzymatic activity, suggesting that nsp5 may be better suited for this purpose. In this cycle, we aimed to enhance the enzymatic activity of nsp5 through rational design to improve its efficacy in removing recombinant tags. To achieve this, we first predicted the structure of the nsp5-substrate complex. We hypothesized that increasing the interaction strength between the substrate and nsp5 could enhance enzymatic efficiency. Consequently, we designed a mutation, T21I, that might improve enzyme activity. Structural predictions indicated that this mutation increases the interaction between the substrate and the 21st amino acid of nsp5, making it a potential variant with enhanced activity (Figure 11). We used the same vector design as in the second cycle, naming the vector pGEX-GST-nsp5_T21I-His.
To introduce the mutation at amino acid 21, we constructed the vector by site-directed mutagenesis. We designed a pair of primers at the corresponding position in the DNA sequence and replaced the codon for T with a codon for I by inverse PCR. We then transformed the linear product into E. coli DH5α and relied on the host's own repair mechanism to obtain the circular vector pGEX-GST-nsp5_T21I-His. We confirmed its accuracy through sequencing analysis (Figure 12B).
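The codon replacement behind a mutation such as T21I can be sketched in silico as follows. This is a minimal illustration: the coding sequence is a hypothetical toy sequence, not the real nsp5 gene, the choice of ATC as the isoleucine codon is arbitrary, and primer design is not covered.

```python
CODON_TABLE_SUBSET = {  # only the codons needed for this sketch
    "ACT": "T", "ACC": "T", "ACA": "T", "ACG": "T",
    "ATT": "I", "ATC": "I", "ATA": "I",
}


def substitute_codon(cds, residue_number, expected_aa, new_codon):
    """Replace the codon of a 1-based residue after checking its identity."""
    start = 3 * (residue_number - 1)
    old_codon = cds[start:start + 3].upper()
    if CODON_TABLE_SUBSET.get(old_codon) != expected_aa:
        raise ValueError(
            f"residue {residue_number} is encoded by {old_codon!r}, "
            f"which does not code for {expected_aa}"
        )
    return cds[:start] + new_codon + cds[start + 3:]


# Hypothetical 22-codon toy sequence with a Thr codon (ACC) at position 21.
toy_cds = "GTT" * 20 + "ACC" + "TGT"
mutant = substitute_codon(toy_cds, 21, "T", "ATC")
print("codon 21 before:", toy_cds[60:63])
print("codon 21 after: ", mutant[60:63])
```

Checking the original codon before replacing it guards against off-by-one errors in residue numbering, for example when a tag or the AVLQ sequence precedes the nsp5 coding region.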
Following the same purification steps as in round 1, we successfully purified nsp5-T21I. SDS-PAGE indicated that nsp5-T21I had high purity and a molecular weight consistent with expectations (Figure 13). This suggests that nsp5-T21I has native N- and C-termini.
We then measured the enzymatic activity of nsp5-T21I using a FRET assay and determined a kcat/Km of 35,069 s⁻¹M⁻¹ based on the Michaelis-Menten equation (see Model for details). The FRET results showed that the nsp5 T21I variant has higher enzymatic activity than wild-type nsp5 (Figure 14).
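The size of the improvement can be expressed as a fold change using the two kcat/Km values reported in the text. This is simple arithmetic on the reported point values. It does not assess statistical significance, since no replicate data are given here.

```python
KCAT_OVER_KM_WT = 27691.0    # s^-1 M^-1, nsp5_native (from the text)
KCAT_OVER_KM_T21I = 35069.0  # s^-1 M^-1, nsp5-T21I (from the text)

fold_change = KCAT_OVER_KM_T21I / KCAT_OVER_KM_WT
percent_increase = (fold_change - 1.0) * 100.0

print(f"T21I / wild type = {fold_change:.2f}-fold")
print(f"Relative increase = {percent_increase:.1f}%")
```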
We used rational design to enhance the enzymatic activity of nsp5, making it better suited for the removal of recombinant tags and demonstrating its potential as a tool enzyme. Our results are consistent with the hypothesis that strengthening the interaction between nsp5 and its substrate contributes to enhanced activity, and they provide a useful reference for further design and optimization of nsp5. In the future, we can explore additional mutations at other sites, alone and in combination, to further increase the enzymatic activity of nsp5, with the aim of developing it into a universal tool enzyme for recombinant tag removal.