Engineering | USP-Brazil

Design

Our project involves three main areas, each with interconnected tasks to achieve efficient glycoprotein production:

Saccharomyces cerevisiae Engineering
- Adaptation of the glycosylation pathway: Modify the yeast’s glycosylation machinery to mimic the human pathway.
- Accumulation of Man3GlcNAc2: Ensure this glycan accumulates inside the endoplasmic reticulum (ER).
- Glycan transfer: Ensure that the glycan is correctly transferred to the target glycoprotein.
Escherichia coli Engineering
- Creation of a glycosylation pathway: Introduce the necessary enzymes to build a glycosylation system in E. coli.
- Accumulation of Man3GlcNAc2: Ensure glycan accumulation in the periplasmic space.
- Glycan transfer: Ensure the glycan is efficiently transferred to the desired protein.
Glycoprotein Production
- Solubility: Produce glycoproteins in a soluble form.
- Enzymatic activity determination: Assess the activity of the produced glycoprotein to confirm functionality.
- Phagocytosis and localization: Test the ability of macrophages to phagocytize the glycoprotein and its delivery to lysosomes.

The project's success relies on the precise integration of these components. For example, for human GCase to be glycosylated, it must be directed to the appropriate cellular compartment, which is achieved by fusing signal peptides that guide it there. This integrated approach is intended to let our engineered E. coli and S. cerevisiae models efficiently produce human-compatible glycoproteins.

Saccharomyces cerevisiae engineering

The budding yeast Saccharomyces cerevisiae is an excellent choice for this iGEM project, since it’s a well-known eukaryotic model with lots of advantages for synthetic biology – and it has an endogenous glycosylation pathway!

As in other eukaryotes, the initial steps of the glycosylation pathway in S. cerevisiae build the common core carbohydrate.

However, while the next steps in the human glycosylation pathway make subtle modifications, yeast generates a hypermannosylated glycoprotein, which is immunogenic. Hence, we must modify this glycosylation pathway.

The deletion strategy for modifying the glycosylation pathway in Saccharomyces cerevisiae focuses on three key genes: OCH1, Alg11, and Alg3, chosen based on their roles in glycan extension.

OCH1 Deletion: This is our first target, as OCH1 initiates the hypermannosylation of the glycan. Deleting it prevents the glycan from extending into hypermannose structures and reduces nonspecific mannose additions. This is a crucial step for retaining a simpler glycan profile compatible with the glycan of interest.
Alg3 Deletion: After OCH1, Alg3 is targeted because it adds a mannose residue to the α-1,6-mannose of the Man5GlcNAc2 structure. Deleting Alg3 should allow the glycan to accumulate as Man5GlcNAc2 in the endoplasmic reticulum, which is closer to the desired structure.
Alg11 Deletion: This gene, which adds mannose residues to the glycan on the cytoplasmic side of the ER, is deleted last. Since the flippase transfers Man5GlcNAc2 to the ER lumen, removing Alg11 might interfere with this flipping process. If this proves problematic, the double mutant ΔOCH1ΔAlg3 will be used instead.

This stepwise approach minimizes potential metabolic burden and growth impairment while guiding S. cerevisiae toward producing the simpler Man3GlcNAc2 glycan.

Figure 1. Strategy for engineering Saccharomyces cerevisiae for the production of the polysaccharide of interest.
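
The deletion strategy above can be sketched as a small model of the ER assembly line. This is a deliberately simplified illustration, not a simulation of yeast glycobiology: it assumes the pathway is strictly linear and halts at the first deleted gene, it omits the glucose additions, and the steps after Alg3 (Alg9 and Alg12) come from the standard textbook pathway rather than from our own design. The function names are hypothetical.

```python
# Simplified, linear model of mannose addition in the yeast ER.
# Each entry: (gene, number of mannose residues added at that step).
# Assumption: steps after Alg3 follow the standard textbook pathway;
# the three glucose additions are left out.
ER_STEPS = [
    ("Alg1", 1),   # cytoplasmic face
    ("Alg2", 2),   # cytoplasmic face -> Man3GlcNAc2
    ("Alg11", 2),  # cytoplasmic face -> Man5GlcNAc2, then flipped
    ("Alg3", 1),   # ER lumen
    ("Alg9", 1),   # ER lumen
    ("Alg12", 1),  # ER lumen
    ("Alg9", 1),   # ER lumen -> Man9GlcNAc2
]


def er_glycan(deleted):
    """Return the glycan expected to accumulate in the ER.

    Assumption: assembly stops at the first deleted gene.
    """
    mannose = 0
    for gene, added in ER_STEPS:
        if gene in deleted:
            break
        mannose += added
    return f"Man{mannose}GlcNAc2"


def outer_chain_initiated(deleted):
    """OCH1 starts hypermannosylation, so deleting it blocks the outer chain."""
    return "OCH1" not in deleted


for strain in [{"OCH1"}, {"OCH1", "Alg3"}, {"OCH1", "Alg3", "Alg11"}]:
    label = "Δ" + "Δ".join(sorted(strain))
    print(label, er_glycan(strain), "hypermannosylated:", outer_chain_initiated(strain))
```

Under these assumptions the double mutant gives Man5GlcNAc2 and the triple mutant gives Man3GlcNAc2, which is the stepwise logic of the strategy. The model says nothing about flippase efficiency or growth defects, which have to be tested experimentally.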

At this stage, we anticipate two main challenges:

Potential Inhibition of Flippase Activity by Alg11 Deletion: The deletion of Alg11 might interfere with the flippase's ability to recognize its substrate, Man5GlcNAc2. This could lead to inefficient translocation of the glycan to the lumen of the endoplasmic reticulum, disrupting the glycosylation process.
Nonspecific Glycosylation by Other Transferases: Even with the deletion of the target genes, there remains a risk that other glycosyltransferases in S. cerevisiae might nonspecifically add unwanted mannose residues to our glycan structure.

To address these issues, we plan to introduce the α-1,2-mannosidase from Trichoderma reesei into our system. This enzyme will help trim any excess mannose residues from Man5GlcNAc2 down to Man3GlcNAc2. By fusing an HDEL tag to the C-terminal region of this protein, we will ensure that the mannosidase remains in the endoplasmic reticulum lumen, where it can efficiently perform its function and maintain the desired glycan structure. IDT was essential for the creation of the synthetic gene coding for our codon-optimized α-1,2-mannosidase.

Figure 2. α-1,2-mannosidase activity on Man5GlcNAc2, trimming it down to Man3GlcNAc2.

Once engineered, our yeast will be challenged to produce the selected protein, human glucocerebrosidase (GCase). We have developed two cloning strategies for GCase expression using the shuttle vector YIp352, designed for overexpression under the control of the constitutive TEF and GPD promoters. To ensure proper targeting to the endoplasmic reticulum, an ER signal peptide was added to the N-terminus of GCase (yellow region), and a Strep-tag was added to its C-terminus to facilitate purification. Once the system is fully assembled, we will express GCase and evaluate its glycosylation efficiency, enzymatic activity, and other relevant criteria to confirm successful glycoprotein production in our engineered yeast model.

Escherichia coli engineering

Escherichia coli is currently the most widely used model organism for heterologous protein expression. Its simple prokaryotic cellular architecture, along with well-characterized metabolic pathways, has established E. coli as a key chassis for cloning and various biotechnological applications. However, when it comes to glycoproteins, E. coli presents several limitations.

In contrast to eukaryotes, where post-translational modifications are common (in humans, an estimated 50% to 70% of proteins are glycosylated), N-glycosylation is rare in bacteria. E. coli in particular lacks an intrinsic glycosylation system, which poses a significant challenge for the heterologous expression of human proteins: without their glycans, these proteins are often produced as insoluble, misfolded aggregates in inclusion bodies.

To give E. coli the ability to synthesize our glycan, we focused on reconstructing the glycan assembly pathway on the cytoplasmic face of the inner membrane, given that some glycosyltransferases require their transmembrane domains for proper activity. We leveraged the native E. coli WecA protein, which transfers GlcNAc-1-phosphate from the precursor UDP-GlcNAc to the lipid carrier undecaprenyl phosphate embedded in the membrane, producing undecaprenyl pyrophosphate-linked GlcNAc. The second GlcNAc residue is then added by the eukaryotic proteins Alg13 and Alg14.

The assembly continues with Alg1 adding the first mannose, followed by Alg2, which adds the remaining two terminal mannose residues to form Man3GlcNAc2. Once the glycan structure is complete, it will be flipped from the cytoplasmic side to the periplasmic space by Wzx, a native E. coli flippase. Wzx normally flips the precursor sugars used in LPS synthesis, but in this engineered system it will translocate the Man3GlcNAc2 glycan across the inner membrane, making it available for glycoprotein production in the periplasmic space.

Figure 3. Strategy for engineering Escherichia coli for the production of the polysaccharide of interest in the inner membrane.

To enhance substrate availability for our glycosyltransferases, we will upregulate the expression of the ManB and ManC enzymes, which are crucial for the biosynthesis of GDP-mannose, the mannose donor for glycan assembly. Additionally, we will knock out the GMD gene (GDP-mannose dehydratase), which consumes GDP-mannose by converting it into a precursor of GDP-fucose, to prevent depletion of this precursor and increase the pool available for glycosylation.
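
The precursor demand of the assembly pathway described above can be tallied with a short sketch. It assumes one donor molecule per sugar added and takes GDP-mannose (the substrate of the GMD enzyme mentioned above) as the mannose donor for Alg1 and Alg2. The data layout and function names are hypothetical and only meant to illustrate the bookkeeping.

```python
from collections import Counter

# Each step: (enzyme, nucleotide-sugar donor, sugar added, residues added).
# Assumption: one donor molecule is consumed per residue added.
ASSEMBLY_STEPS = [
    ("WecA", "UDP-GlcNAc", "GlcNAc", 1),
    ("Alg13/Alg14", "UDP-GlcNAc", "GlcNAc", 1),
    ("Alg1", "GDP-mannose", "Man", 1),
    ("Alg2", "GDP-mannose", "Man", 2),
]


def assemble(steps):
    """Return (glycan composition, donors consumed) for one lipid-linked glycan."""
    glycan = Counter()
    donors = Counter()
    for _enzyme, donor, sugar, count in steps:
        glycan[sugar] += count
        donors[donor] += count
    return glycan, donors


def glycan_name(glycan):
    """Format a composition in the ManXGlcNAcY shorthand used in this text."""
    return f"Man{glycan['Man']}GlcNAc{glycan['GlcNAc']}"


glycan, donors = assemble(ASSEMBLY_STEPS)
print(glycan_name(glycan))
for donor, count in donors.items():
    print(f"{donor}: {count} per glycan")
```

Each Man3GlcNAc2 therefore costs three GDP-mannose and two UDP-GlcNAc in this model, which is why the GDP-mannose pool is the precursor we chose to reinforce.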

Initially, we aimed to integrate the glycosyltransferase genes into an operon and insert it into the E. coli chromosome at the IS5 intergenic region using the lambda Red recombineering technique. However, considering the limited time for wet lab experiments this year, we opted for a more practical approach by cloning the entire operon into the pRSFDuet-1 plasmid. We placed the operon under the control of the low-expression constitutive J23109 promoter (iGEM Part:BBa_J23109).

Figure 4. Empty pRSFDuet-1 map. Image created using the SnapGene software.

Despite choosing plasmid expression for now, we preserved the homology regions in our construct to facilitate future chromosomal integration in the second year of the project. To use this construct, we designed primers that would amplify the operon, excluding the homology regions, allowing us to insert it into the pRSFDuet-1 vector via Gibson assembly. Additionally, we plan to perform PCR on the pRSFDuet-1 plasmid to remove the original T7 promoter, ensuring that only the J23109 promoter regulates the expression of our glycosylation operon.
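
The primer design for this Gibson assembly can be sketched as follows. This is a minimal illustration under stated assumptions: the sequences are made-up placeholders (not our operon or the real pRSFDuet-1 sequence), the 20 nt annealing and 20 nt overlap lengths are arbitrary defaults, and the function names are hypothetical. Real primer design would also check melting temperature and secondary structure.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def gibson_primers(insert, vector_left, vector_right, anneal=20, overlap=20):
    """Design a primer pair that amplifies `insert` and adds vector overlaps.

    vector_left  : vector sequence immediately upstream of the insertion site
    vector_right : vector sequence immediately downstream of the insertion site
    Both primers are returned 5'->3'.
    """
    if len(insert) < anneal or min(len(vector_left), len(vector_right)) < overlap:
        raise ValueError("sequences are shorter than the requested primer parts")
    # Forward primer: vector overlap tail followed by the start of the insert.
    forward = vector_left[-overlap:].upper() + insert[:anneal].upper()
    # Reverse primer: reverse complement of (end of insert + vector overlap).
    reverse = reverse_complement(insert[-anneal:] + vector_right[:overlap])
    return forward, reverse


# Placeholder sequences, invented for this example only.
VECTOR_LEFT = "GGATCCGAATTCGAGCTCGGTACCCGGG"
OPERON = "ATGACCGAAACCCTGAAAGCGCTGGTGATTGGCGCGAGCGGCTAA"
VECTOR_RIGHT = "CTCGAGTCTGGTAAAGAAACCGCTGCTG"

fwd, rev = gibson_primers(OPERON, VECTOR_LEFT, VECTOR_RIGHT)
print("forward:", fwd)
print("reverse:", rev)
```

Because the primers anneal only to the operon itself, the homology regions kept for future chromosomal integration are left out of the amplified fragment.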

Figure 5. Final construct containing the genes for precursor synthesis and glycan assembly. All of these genes were codon-optimized for E. coli BL21(DE3) before being synthesized by IDT. Image created using the SnapGene software.

The final construct would consist of one transcriptional unit containing the four glycosyltransferase genes, Alg13, Alg14, Alg1 and Alg2, followed by the ManB and ManC genes.

Figure 6. Final cloned plasmid containing the whole operon for precursor synthesis and glycan assembly. Image created using the SnapGene software.

Before cloning GCase into the plasmid, we needed to ensure that our enzyme could undergo glycosylation properly. Since our system directs the final glycan to the periplasm, the enzyme must be targeted to this cellular space. While expressing our protein in the periplasm might reduce overall yield, this strategy is advantageous for GCase, a complex human protein that contains eight cysteine residues (four of them forming two disulfide bonds). Expressing it in the periplasm avoids the reducing environment of the bacterial cytoplasm, which can compromise the formation of disulfide bonds. The same strategy should benefit other human proteins with similar disulfide bond requirements.

To direct the enzyme to the periplasm, we fused a PelB signal peptide to its N-terminal region. The PelB tag was chosen because it is recognized by the Sec system, which transports unfolded proteins to the periplasm. This choice is crucial, as the oligosaccharyltransferases (OSTs) in our system only recognize unfolded proteins, making the Tat system (which transports folded proteins) unsuitable. Additionally, we added a Strep-tag to the C-terminal region of the protein to facilitate its purification.

Figure 7. Glycosylation pathway in the periplasm of the engineered E. coli.

At this point, it would be necessary to prepare the second plasmid containing the GCase enzyme and the oligosaccharyltransferases. pETDuet-1 was chosen because its replication origin is compatible with pRSFDuet-1 and because it has two cloning sites. GCase was cloned into the first multiple cloning site (MCS) using Gibson assembly. Once this is prepared, the second MCS will be used to integrate the oligosaccharyltransferases chosen for this work. By creating different combinations, we can evaluate which of these enzymes is best suited for this purpose.

Figure 8. Strategy for creating different combinations of the human β-Glucocerebrosidase (GCase) with various oligosaccharyltransferases (OSTs). OST1, OST2, and OST3 correspond to the OSTs from the three selected archaea. Additionally, OST4 and OST5 (represented as OST5a and OST5b) are from the lower eukaryotes Leishmania major and Trypanosoma brucei, respectively. Although the lower eukaryotic OSTs will not be used in this year’s project (we reached the maximum synthesis provided by IDT and Twist), the constructs are prepared and will be tested in the project's second year.
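
The construct panel in Figure 8 can be enumerated with a few lines of code. This is only a bookkeeping sketch: it assumes one OST per plasmid, uses the generic labels OST1 to OST5 from the figure, and the class, field, and naming scheme are hypothetical rather than our registered part names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Construct:
    backbone: str      # plasmid backbone
    mcs1: str          # gene cloned into the first multiple cloning site
    mcs2: str          # gene cloned into the second multiple cloning site
    planned_year: int  # project year in which the construct will be tested

    @property
    def name(self):
        return f"{self.backbone}_{self.mcs1}_{self.mcs2}"


ARCHAEAL_OSTS = ["OST1", "OST2", "OST3"]   # tested this year
LOWER_EUKARYOTE_OSTS = ["OST4", "OST5"]    # deferred to the second year


def build_panel(backbone="pETDuet-1", cargo="GCase"):
    """Pair the cargo protein in MCS1 with each candidate OST in MCS2."""
    panel = [Construct(backbone, cargo, ost, 1) for ost in ARCHAEAL_OSTS]
    panel += [Construct(backbone, cargo, ost, 2) for ost in LOWER_EUKARYOTE_OSTS]
    return panel


for construct in build_panel():
    print(f"year {construct.planned_year}: {construct.name}")
```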

Finally, to conclude our work on GCase production, we will initially focus on evaluating protein expression. Our proof of concept for this year centers on engineering our system to yield soluble GCase, which would be a significant achievement given that past attempts to express GCase in E. coli resulted in insoluble protein (https://doi.org/10.1007/s12033-010-9303-4). Moving forward into the second year, we aim to assess our GCase against commercially available enzymes. The evaluation will include its efficiency in macrophage phagocytosis, lysosomal delivery, and enzymatic activity, providing a comprehensive comparison to existing treatments.

Engineering Success

Considering that our project consists of many circuits that must be intricately connected to function as intended, the design aspect was the most essential focus this year. The Design and Learn process was consistently integrated throughout the year. Each individual part required thorough study, both in isolation and within the context of the overall system.

Construction of the engineered yeast

The strategy for engineering Saccharomyces cerevisiae focused on creating knockouts to interrupt the glycan extension pathway and inserting enzymes to prevent unwanted extensions of our sugar.

Construction of the engineered bacteria

In contrast to S. cerevisiae, Escherichia coli lacks a glycosylation pathway. Therefore, it is necessary to construct the entire glycan assembly system from scratch. This presented a significant challenge, as it required optimizing the intricate mannose metabolism pathway and adding the necessary enzymes to create the desired polysaccharide. Moreover, the entire strategy for sugar transfer must be meticulously designed, as most oligosaccharyltransferases (OSTs) are unsuitable for this task due to their substrate specificity. A wide range of enzymes was proposed for this step, including those from eukaryotic cells (such as yeast, Trypanosoma, and Leishmania), other bacteria (Haemophilus influenzae and Campylobacter jejuni), and finally, archaea (Methanococcus voltae, Methanothermus fervidus, and Sulfolobus acidocaldarius). Even though the discovery of archaeal enzymes seemed to address the problem, these microorganisms produce entirely different glycan structures. As a result, most of these enzymes are not applicable to our final product, as they would not recognize our substrate. For this reason, we selected the most promising enzymes to be tested in our construct to evaluate whether they can successfully transfer the sugar to the protein of interest.

A problem we aimed to anticipate was identifying the proteins from E. coli that could be glycosylated as a result of constructing an artificial glycosylation pathway. The addition of polysaccharides to human proteins is essential for proper folding and activity. However, bacteria have evolved to produce proteins that function without glycan addition. Consequently, the unwanted transfer of glycans to E. coli proteins could potentially impair their function and may even compromise bacterial growth. To address this, we decided to explore the potential of these oligosaccharyltransferases (OSTs) by expressing them under an inducible promoter, synthesizing them only after E. coli has grown, specifically during the production of our proteins to be glycosylated.

Another experiment we conducted involved molecular dynamics to investigate protein stability in the context of expressing GCase using the current bacterial glycosylation mechanism via PglB. In this case, our protein must be modified to carry the bacterial D/E-X-N-X-S/T glycosylation sequon. Our results indicate that GCase remains structurally stable even after the addition of five negatively charged residues. Further experiments, such as assessing enzymatic activity and in vivo function, must be conducted to understand the overall impact of these mutations. Considering that GCase is a small monomeric protein, we plan to perform molecular dynamics experiments using larger structures with more glycosylation sites and higher oligomeric forms, such as the Spike protein from SARS-CoV-2, which harbors 22 N-glycosylation sites. This will help us identify the limitations of protein adaptation when utilizing the current bacterial system.
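
The difference between the eukaryotic N-X-S/T sequon and the extended bacterial D/E-X-N-X-S/T sequon can be illustrated with a small scanner. It is a sketch rather than the tool we used: it assumes the commonly applied rule that X is any residue except proline, the toy sequence is invented for the example (it is not GCase), and the function names are hypothetical.

```python
import re

# Lookaheads allow overlapping sequons to be reported.
# Assumption: X is any residue except proline.
EUKARYOTIC = re.compile(r"(?=N[^P][ST])")
BACTERIAL = re.compile(r"(?=[DE][^P]N[^P][ST])")


def asn_positions(sequence, pattern, asn_offset):
    """Return 1-based positions of the Asn in every sequon match."""
    return [m.start() + asn_offset + 1 for m in pattern.finditer(sequence.upper())]


def sites_needing_acidic_residue(sequence):
    """Eukaryotic sequons that lack D/E at position -2 relative to the Asn.

    These are the sites where a D or E would have to be introduced
    for the sequon to match the bacterial pattern.
    """
    sequence = sequence.upper()
    needs_change = []
    for pos in asn_positions(sequence, EUKARYOTIC, 0):
        idx = pos - 1  # 0-based index of the Asn
        if idx < 2 or sequence[idx - 2] not in "DE":
            needs_change.append(pos)
    return needs_change


toy = "MDANGSAANPTNKT"  # invented sequence for illustration
print("eukaryotic sequons at Asn:", asn_positions(toy, EUKARYOTIC, 0))
print("bacterial sequons at Asn:", asn_positions(toy, BACTERIAL, 2))
print("need D/E at -2:", sites_needing_acidic_residue(toy))
```

In the toy sequence, the Asn at position 4 already satisfies both patterns, the Asn at position 12 would need an acidic residue at position 10, and the N-P-T motif at position 9 is skipped because of the proline. Whether a real protein tolerates such substitutions is exactly what the molecular dynamics and follow-up activity assays are meant to address.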

Potential glycoproteins for the proof of concept

As discussed earlier, this project was initiated to engineer E. coli and S. cerevisiae for the production of glycoproteins, reflecting the limitations faced in viral protein production during the 2020 pandemic using available microbial models. Initially, we aimed to work with the Spike protein from SARS-CoV-2 to evaluate its folding and oligomerization when produced by our engineered microorganisms. However, this idea was modified to focus solely on the receptor-binding domain (RBD), which has only two glycosylation sites, making it easier to work with. Nevertheless, since we wanted to produce the central core sugar, we realized it might not be comparable to the natural protein.

Consequently, we considered various wild-type and adapted proteins for this part of the project, including human immunoglobulins (IgGs), human hormones, and an artificial maltose-binding protein fused with a glycosylation tag. However, we were not satisfied with these options due to their inadequate glycosylation compared to natural proteins. After extensive literature searches and discussions with various professors, we identified human glucocerebrosidase (GCase) as the ideal candidate for our project for several reasons:

Glycosylation Sites: GCase has five N-glycosylation sites, of which four are naturally glycosylated. This allows us to compare our microbial model to other eukaryotic systems, such as cell cultures.
Solubility in E. coli: GCase is insoluble in E. coli when not properly glycosylated. Obtaining soluble GCase would therefore indicate that our system is working properly.
Terminal Mannose Residues: This enzyme requires glycosylation with terminal residues of mannose. These exposed sugars signal macrophages to phagocytose the enzyme, allowing its delivery to lysosomes where it acts on sphingolipids.
Clinical Relevance: GCase is currently used for the treatment of Gaucher’s disease, allowing us to compare our expression products in yeast and bacteria with three commercial options (imiglucerase, velaglucerase, and taliglucerase), all of which are produced using plant or mammalian cell cultures.
Cost-Effective Production: The production strategies for GCase are currently extremely expensive. In Brazil, this enzyme is provided to patients through the public health system (Sistema Único de Saúde - SUS). Proposing an alternative strategy for GCase production in S. cerevisiae or E. coli has the potential to significantly reduce production costs and alleviate the financial burden on our country, which, despite the rarity of the disease, incurs annual costs between R$ 300,000,000 and R$ 500,000,000 through SUS.

Due to the extensive design phase of our project, the experimental work could only commence later than anticipated. However, we successfully included the computational analysis and initiated the yeast cloning step on our wiki. As of now (01/10/2024), we are still waiting for the synthetic genes from IDT to arrive, which will allow us to finally begin the bacterial component of our project. We hope to present new results during the judging session. We believe this project has the potential to pave the way for glycoengineering in iGEM, and we aspire for our work to inspire other teams to develop innovative strategies for the cloning and expression of glycoproteins.