Project Description

Project Description USP-Brazil 2024

Figure 1. Glycan profiles in mammalian, plant, insect, archaeal, Saccharomyces cerevisiae and Escherichia coli proteins

Eukaryotes have evolved the ability to modify their proteins by linking polysaccharide groups to asparagine residues, with these sugars being essential for the proper folding and activity of glycoproteins. In humans, more than 50% of proteins are glycosylated, rising up to 70% in certain tissues. The complex pathways required for the assembly and transfer of glycans to proteins pose significant challenges when adapting these processes to prokaryotic models, such as Escherichia coli, the most commonly used organism for heterologous protein production. In contrast, Saccharomyces cerevisiae possesses a functional machinery for glycoprotein production. However, it predominantly produces hyper-mannose-type glycosylation, which is immunogenic in humans, making it unsuitable for the production of therapeutic glycoproteins.

Why Glycoproteins?

In addition to humans, human viruses can benefit from the glycosylation machinery of host cells to modify their proteins, a critical feature for completing their infection cycles. This glycosylation aids in evading the host immune response and enhancing viral entry and replication. During the 2020 pandemic, Prof. Dr. Cristiane Rodrigues Guzzo adapted the projects in her research group to collaborate with the Brazilian ReveVirus MCTI program. Together, they expressed and purified proteins from SARS-CoV-2 using Escherichia coli, distributing these proteins to other research groups. This initiative created a collaborative network of researchers focused on understanding the physiology of these proteins and their potential applications in developing diagnostic tests.

However, not all viral proteins can be adequately expressed in E. coli due to its lack of glycosylation machinery, which affects protein folding. For instance, the Spike protein, which mediates membrane fusion and virus entry into cells, is insoluble when expressed in E. coli, making it impossible to produce in this system. As a result, the Spike protein could not be distributed to other groups for study.

Additionally, during a discussion with Prof. Dr. Luís Carlos de Souza Ferreira, who focuses on developing vaccines against human viruses, we discovered that this issue is longstanding. The absence of a glycosylation pathway in E. coli limits the production of these proteins to eukaryotic cells, leading to increased costs and requiring long incubation periods to produce small amounts of proteins.

Figure 2. Glycoprotein folding scheme.

Motivation: Engineering E. coli and S. cerevisiae to produce human glycoproteins

Glycosylation in eukaryotes begins with the assembly of the glycan on the cytoplasmic surface of the endoplasmic reticulum (ER). After forming the Man5GlcNAc2 structure, the glycan is flipped into the lumen of the ER, where it undergoes further modification by acquiring additional sugars. This modified glycan is then transferred to the target protein by oligosaccharyltransferases. Once glycosylated, the protein is delivered to the Golgi apparatus via vesicles, where it receives the final sugar additions before being secreted.

Figure 3. Glycosylation pathway in eukaryotes. Glycan assembly and transfer in the endoplasmic reticulum, followed by delivery to the Golgi apparatus and final modifications there. Source: https://doi.org/10.3389/fpls.2024.1349064

In contrast, prokaryotic cells lack organelles like the endoplasmic reticulum and Golgi apparatus. However, some species have evolved glycosylation machinery that assembles glycans on the cytoplasmic face of the inner membrane. Once synthesized, these glycans are recognized and flipped to the periplasm, where they can be transferred en bloc to the target protein by an oligosaccharyltransferase. This feature is shared among both bacteria and archaea.

Figure 4. Glycosylation pathway in Campylobacter jejuni. Source: https://doi.org/10.1021/ac9013622

Tissue culture, insect, and plant expression models take around three weeks to complete the production and purification cycle of small protein quantities, at elevated costs due to complex media requirements. In contrast, E. coli and S. cerevisiae are fast-growing microorganisms widely used in biotechnological processes. These organisms can be used to produce and purify large amounts of protein in just three to five days. Additionally, they can be cultivated in low-cost media and are not susceptible to the human viruses that can contaminate cultures. Adapting both organisms for the adequate production of glycoproteins has the potential to significantly enhance the global production of these valuable products.

To achieve this, we had to decide which glycoform to produce. Initially, we wanted to adapt these organisms to assemble human-type polysaccharides with hybrid or complex glycans. However, we realized that the pathways involved in preparing precursors, assembling sugars, and transferring them to proteins are extremely complex. So which glycan is simple enough to work with within two years and would still interest companies and academic research groups? The answer lies in the core structure, Man3GlcNAc2!

Figure 5. Structure of Man3GlcNAc2 and conservation among different organisms.

Man3GlcNAc2 is a simple polysaccharide produced in the early stages of glycan assembly on the outer surface of the endoplasmic reticulum. Only four proteins are required to synthesize this glycan, making it feasible to reconstruct the system in E. coli. Yeasts already possess the machinery to produce this structure, so it can be made to accumulate by knocking out the genes that extend the glycan with additional mannose residues.

Another important characteristic in selecting these sugars is that they represent the conserved core shared among all the aforementioned organisms. This foundational aspect allows for future research groups to reengineer our Saccharomyces cerevisiae and Escherichia coli systems to produce the desired type of glycosylation. By adding new glycosyltransferases to the pathway — such as those found in the Golgi apparatus, which are capable of inserting additional sugars into the already glycosylated proteins — researchers can assemble the glycosylation profiles to meet specific therapeutic or research needs. This flexibility paves the way for further advancements in glycoprotein production and engineering.

To engineer our organisms, two completely different strategies were developed:

  1. Adaptation of the Existing Glycosylation Pathway in S. cerevisiae: This approach aims to ensure that the yeast accumulates our glycan of interest while avoiding the common challenge of engineering yeasts for glycosylation: the unintended addition of excess mannose residues due to the nonspecific activity of various mannosyltransferases involved in extending to hyper-mannose types.
  2. Creation of a De Novo Glycosylation Pathway in E. coli: Since E. coli lacks a natural glycosylation machinery, this strategy involves building the pathway from scratch, enabling the bacterium to produce our desired glycan.

For the yeast part, we knocked out the genes OCH1, Alg3, and Alg11 to prevent the extension of the glycan. Since the nonspecific addition of mannose residues is an expected concern, we also cloned an α-1,2-mannosidase from Trichoderma reesei into S. cerevisiae with an HDEL tag. This tag ensures that the enzyme remains in the endoplasmic reticulum, where it can trim any glycans that have been erroneously extended.

For the bacterial part, we based our experiments on the work of the DeLisa group (https://doi.org/10.1038%2Fnchembio.921). To construct a glycosylation pathway in E. coli, we cloned the Alg13, Alg14, Alg1, and Alg2 genes from S. cerevisiae into a plasmid to enable glycan assembly on the cytoplasmic surface of the inner membrane. To increase precursor availability for glycan assembly, we also cloned the ManB and ManC genes under the control of a constitutive promoter and knocked out the GMD gene to keep mannose from being diverted away from glycan assembly, since it could otherwise be consumed as a carbon source.
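The stepwise assembly described above can be sketched as a small bookkeeping exercise. This is only an illustration: the assignment of sugars to each enzyme (Alg13/Alg14 adding the second GlcNAc, Alg1 the first mannose, Alg2 the two branching mannoses) follows the commonly described yeast pathway and is not stated on this page, and the first lipid-linked GlcNAc is assumed to be supplied by the host. The names `ASSEMBLY_STEPS`, `assemble`, and `format_glycan` are hypothetical.

```python
from collections import Counter

# Hypothetical step table: (enzyme, sugars added to the lipid-linked glycan).
# Assumption: step assignments follow the commonly described yeast pathway.
ASSEMBLY_STEPS = [
    ("Alg13/Alg14", {"GlcNAc": 1}),  # second GlcNAc
    ("Alg1", {"Man": 1}),            # first mannose
    ("Alg2", {"Man": 2}),            # two branching mannoses
]

def format_glycan(composition):
    """Render a composition such as {'Man': 3, 'GlcNAc': 2} as 'Man3GlcNAc2'."""
    parts = []
    for sugar in ("Man", "GlcNAc"):
        if composition[sugar]:
            parts.append(f"{sugar}{composition[sugar]}")
    return "".join(parts)

def assemble(start, steps):
    """Apply each step in order and record the intermediate after each enzyme."""
    glycan = Counter(start)
    history = []
    for enzyme, added in steps:
        glycan.update(added)
        history.append((enzyme, format_glycan(glycan)))
    return glycan, history

# Assumption: the first GlcNAc is already on the lipid carrier (host-supplied).
final, history = assemble({"GlcNAc": 1}, ASSEMBLY_STEPS)
for enzyme, intermediate in history:
    print(f"{enzyme:12s} -> {intermediate}")
```

Tracking only the composition ignores linkage types and membrane topology, but it makes it easy to check that the four cloned proteins are enough to reach the Man3GlcNAc2 target.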

Since E. coli possesses the Wzx flippase that translocates lipopolysaccharide (LPS) precursors to the periplasmic surface, we chose to retain this component in our system to facilitate the transport of our glycans. Additionally, the WaaL protein, known for linking assembled sugars to the bacterial LPS, poses a potential concern as it may reduce the amount of glycans available in the periplasm. However, it can also serve as a tool to evaluate whether our glycan was properly produced. For our target Man3GlcNAc2, the exposed mannose residues can be stained with Alexa Fluor-488, and the presence of fluorescent cells would confirm the successful production of Man3GlcNAc2. Once confirmed, we would need to knock out WaaL to prevent the delivery of glycans to the LPS, thereby increasing the available glycans for our glycoprotein.

A crucial aspect of the bacterial system involves sugar transfer. Campylobacter jejuni possesses the most studied bacterial glycosylation system, featuring the PglB oligosaccharyltransferase (OST), which links glycans to proteins. PglB is extensively researched in efforts to engineer bacterial organisms capable of glycosylation. However, it differs from its eukaryotic homologs; while Stt3 OSTs recognize the N-X-S/T (where X can be any amino acid except proline) sequon, PglB specifically glycosylates asparagine residues within the D/E-X-N-X-S/T (X = any amino acid except proline) sequon. Therefore, it is necessary to adapt the sequence of our protein of interest for expression in systems utilizing PglB. This poses a challenge, particularly when working with large proteins such as the Spike protein from SARS-CoV-2, which contains 22 N-glycosylation sites. Adapting these sites could result in a significantly more negatively charged protein compared to the wild type, potentially compromising its folding and function.
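The difference between the two sequons can be made concrete with a short scan of a protein sequence. The sketch below is a minimal illustration, not our analysis pipeline: the function names and the toy sequence are hypothetical, and the patterns encode only the N-X-S/T and D/E-X-N-X-S/T rules given above (X being any residue except proline).

```python
import re

# N-X-S/T sequon (X != P). The lookahead keeps overlapping sequons visible.
NXST_SEQUON = re.compile(r"N(?=[^P][ST])")
# PglB-style D/E-X-N-X-S/T sequon (X != P at both X positions).
PGLB_SEQUON = re.compile(r"(?<=[DE][^P])N(?=[^P][ST])")

def find_sites(seq, pattern):
    """Return 1-based positions of the asparagine in each matching sequon."""
    return [m.start() + 1 for m in pattern.finditer(seq.upper())]

def sites_needing_adaptation(seq):
    """N-X-S/T sites that PglB would not recognize without a D/E at position -2."""
    pglb_sites = set(find_sites(seq, PGLB_SEQUON))
    return [p for p in find_sites(seq, NXST_SEQUON) if p not in pglb_sites]

# Hypothetical toy sequence, not a real protein.
toy = "AADQNATGGNVSKKNPTLL"
print("N-X-S/T sites:        ", find_sites(toy, NXST_SEQUON))
print("PglB-compatible sites:", find_sites(toy, PGLB_SEQUON))
print("Sites to adapt:       ", sites_needing_adaptation(toy))
```

Every position returned by `sites_needing_adaptation` would need an acidic residue introduced two positions upstream, which is where the extra negative charge discussed above comes from.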

Figure 6. Glycosylation sequons in eukaryotes and bacteria.

The obvious question that arose was: if we are utilizing the glycosylation machinery from yeast, why not also use the oligosaccharyltransferase (OST)? The issue lies in the organization of this complex. While bacteria like Campylobacter jejuni have a single OST responsible for transferring sugars to asparagine residues, eukaryotes rely on a complex of eight proteins that collaborate in sugar recognition and transfer. The likelihood that this complex would assemble correctly in E. coli is low, making it an unattractive option for cloning purposes.

Figure 7. Membrane proteins responsible for sugar transfer. PDB structures of C. jejuni (A) and Homo sapiens (B) OSTs.

If the eukaryotic Stt3 complex is not a viable option, where could we find an OST that recognizes the human glycosylation sequon but relies on a single protein for glycan transfer? Exploring the literature, we found another group that can also perform glycosylation: archaea. These microorganisms are intriguing due to their unique cellular processes, which have adapted to extreme conditions, and glycosylation plays a crucial role in helping them resist various stressors. Surprisingly, we found that archaea can recognize the human N-X-S/T sequon and, much like bacteria, they rely on just one OST to carry out the glycosyltransferase activity: AglB. This presents a promising alternative for our research, as AglB could potentially provide a simpler and more efficient mechanism for glycosylation in our engineered systems.

Figure 8. Structure and sequence recognition by archaeal OST. A. AlphaFold structure of AglB from Methanothermus fervidus. B. Glycosylation sequons recognized by eukaryotes, bacteria, and archaea.
Figure 9. Archaeal glycan structures. Source: https://doi.org/10.1038/nrmicro2957

Despite their promise, archaea produce glycans that are entirely different from those found in bacteria and eukaryotes. This divergence could pose a challenge, as the recognition of linking sugars is a critical factor that may hinder their transfer to the target protein. Therefore, rather than using PglB, we opted to focus on three particularly interesting oligosaccharyltransferases (OSTs) from Methanococcus voltae, Methanothermus fervidus, and Sulfolobus acidocaldarius. These OSTs were selected based on the specific substrates they recognize for transfer to proteins, allowing us to explore options that may be more compatible with our goals for glycosylation.

Additionally, we found that lower-order eukaryotes also possess oligosaccharyltransferases (OSTs) that can recognize human glycosylation sequons. Although these OSTs consist of a single protein, they often appear as paralogs within the chromosome. We wanted to incorporate these eukaryotic OSTs into our research; however, we reached the maximum base pairs allowed for synthesis from IDT and Twist while working on the archaeal AglBs and the proteins required for the initial stages of glycan assembly. For this reason, we decided to reserve the eukaryotic Stt3 for the second year of the project, allowing us to integrate it into our future experiments.

To confirm the functionality of our glycosylation system, we need to produce a glycoprotein. However, human glycoproteins often depend on specific glycosylation profiles for activity, which may require hybrid or complex-type glycosylation, so producing a simpler glycan risks rendering our protein inactive. In light of this, we sought a protein that would retain its activity even with a less complex glycan. The candidate was suggested by our second scientific PI, Prof. Dr. Mario Henrique de Barros: the human β-glucocerebrosidase (GCase).

GCase serves as an excellent candidate for our system due to several key characteristics:

  • Clinical Relevance: This enzyme is commercially used to treat Gaucher’s disease, a sphingolipid disorder caused by mutations in GCase that impair the degradation of the sphingolipid glucosylceramide in lysosomes. Current treatments rely on enzyme replacement therapy, with available commercial enzymes produced from carrots, human immortalized carcinoma cells, or Chinese hamster ovary cells, resulting in significantly elevated costs.
  • Mechanism of Action: GCase is administered to patients through intravenous injection. In the bloodstream, the exposed mannose residues (compatible with our Man3GlcNAc2) are recognized by macrophages, which phagocytize the protein. Once inside the cell, the enzyme is delivered to lysosomes, where it performs its crucial function.
  • Glycosylation Sites: GCase contains five glycosylation sites, of which four are actually glycosylated. This characteristic will allow us to compare the glycosylation patterns of our engineered systems with those of the human counterpart.
  • Benchmarking: The three commercially available GCase enzymes provide a reference point, enabling us to compare the enzymatic activity of our glycoprotein against currently used treatments.
Figure 10. Structure and glycosylation of the human GCase. A. Structure of the human GCase highlighting the glycosylated asparagine residues. B. Expected glycosylation profile of the GCase produced in our system.
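One practical way to compare glycosylation patterns is to estimate the mass that Man3GlcNAc2 would add to the protein. The sketch below is a rough estimate under stated assumptions: the residue masses are standard monoisotopic values for hexose and N-acetylhexosamine (not taken from this page), all four occupied sites are assumed to carry the same glycan, and the function names are hypothetical.

```python
# Standard monoisotopic residue masses in Da (assumption: not from this page).
RESIDUE_MASS = {
    "Man": 162.0528,     # hexose residue, C6H10O5
    "GlcNAc": 203.0794,  # N-acetylhexosamine residue, C8H13NO5
}

MAN3GLCNAC2 = {"Man": 3, "GlcNAc": 2}

def glycan_mass_shift(composition):
    """Mass added to the protein by one N-linked glycan (sum of residue masses)."""
    return sum(RESIDUE_MASS[sugar] * count for sugar, count in composition.items())

def protein_mass_shift(composition, occupied_sites):
    """Total shift, assuming every occupied site carries the same glycan."""
    return glycan_mass_shift(composition) * occupied_sites

per_site = glycan_mass_shift(MAN3GLCNAC2)
total = protein_mass_shift(MAN3GLCNAC2, occupied_sites=4)
print(f"Per site: {per_site:.2f} Da")
print(f"GCase with four occupied sites: {total:.2f} Da")
```

Under these assumptions each site adds roughly 892 Da, so a fully occupied GCase would be about 3.6 kDa heavier than the unglycosylated protein, a difference that mass spectrometry or a gel shift could plausibly resolve.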

Selecting human GCase is of significant importance for public health in Brazil. Although Gaucher's disease is considered rare, with approximately 500 patients affected in the country, the production costs for the enzyme can be extremely high, reaching up to $300,000 per patient annually.

(https://www.frontiersin.org/journals/pharmacology/articles/10.3389/)

GCase is provided free of charge to patients through the Unified Health System (Sistema Único de Saúde - SUS). By developing an alternative production method for this therapeutic glycoprotein using Escherichia coli, we could significantly reduce production costs, thereby alleviating the financial burden on the Brazilian healthcare system.

Moreover, our system will be beneficial for the production of a wide range of other glycoproteins, not limited to GCase. This versatility allows us to explore various therapeutic applications, expanding the potential impact of our research. By facilitating the production of essential glycoproteins, this initiative can significantly enhance access to critical treatments, ultimately improving health outcomes for patients across various conditions. The broader implications of this work could lead to advancements in biotechnology and public health, making a meaningful difference in the lives of many individuals.