Typically used by caddisflies to collect and attach aquatic debris to themselves for metamorphosis, caddisfly silk is a promising material because of its high tensile strength, waterproof properties, and non-cytotoxicity (4). The caddisfly silk protein is a heterodimer of heavy and light fibroins (abbreviated H-fibroin and L-fibroin, respectively). Caddisfly silk gets its adhesive properties from the H-fibroin, which has repeated β-sheet motifs with highly conserved proline-glycine turns (4). The main difficulties in this project lie in the H-fibroin, whose DNA sequence is not only highly repetitive but also very long at 19 kb (Fig. 1). These qualities make the H-fibroin difficult to synthesize and isolate.
Caddisflies can be categorized by how they use silk: (1) cocoon makers use silk to spin cocoons for pupation, (2) tube makers use silk to build cases from surrounding aquatic debris, and (3) retreat makers use silk to create stationary shelters and catch prey (4). Intuitively, retreat makers should have the strongest silk, because their silk must resist water currents and hold the shelter in place, while cocoon makers should have the weakest. Initial experimentation relied on Arctopsyche grandis (a retreat maker) to synthesize the strongest possible silk to maximize successful recovery with a caddisfly and silkworm-based mesh. However, this year’s dry lab subteam modeled H-fibroin subunits from multiple caddisfly species and found that not every subunit could form functional protein. Thus, our focus shifted from silk strength to being able to produce a functional protein.
Last year, the Hopkins iGEM team sought to accomplish this goal by synthesizing an altered H-fibroin protein with only a few motifs. Unfortunately, the dry lab subteam discovered this year that models of this altered H-fibroin yielded nonfunctional protein. Thus, the goal of the 2024 iGEM team was redirected to synthesizing the full H-fibroin protein.
To begin, we switched the expression host from Escherichia coli to Saccharomyces cerevisiae, because a eukaryote is more likely to express the eukaryotic caddisfly silk protein. Additionally, S. cerevisiae can express larger proteins such as the H-fibroin while retaining the low cost, fast growth, robustness, and genetic tools of a model organism (2).
Our initial design relied primarily on PCR to isolate the H-fibroin gene. Based on preliminary research from Frandsen Lab, the H-fib subunit gene of Arctopsyche grandis contains a smaller, less repetitive region in the 10-12 kb section of the coding sequence (Fig. 2). Therefore, we can theoretically design two sets of primers that each amplify about half of the H-fib gene. One set would contain a forward primer that binds to the N-terminus sequence and a reverse primer that binds to the low-repeat region. The second set would contain a forward primer that binds adjacent to the reverse primer of the first set and a reverse primer that binds to the C-terminus sequence. Adding about 10 bases of synthetic complementary sequence to the 5’ end of the reverse primer of set 1 and the 5’ end of the forward primer of set 2 would enable us to perform Gibson assembly with the two PCR-amplified fragments. Unfortunately, we found that even this small low-repeat region has some sequence homology scattered throughout the gene. Nevertheless, we expected to be able to purify the correct fragments by gel extraction based on the length of each PCR fragment. With this design, we should obtain one fragment of approximately 6 kb and another of about 12 kb.
We designed two sets of primers to yield two fragments of the H-fibroin gene, with each set having 2 forward and 2 reverse primers. Overhangs were added (in orange, see Fig. 2) to reverse primer 1 and forward primer 2 so that the two H-fibroin fragments could later be joined by Gibson Assembly. In anticipation of future Golden Gate steps, BsaI cut sites were added to forward primer 1 and reverse primer 2. Theoretically, this method would let us produce a scarless H-fib gene.
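The overhang logic for the two junction primers can be sketched in Python. This is a minimal illustration under stated assumptions: the tag and binding sequences are made-up placeholders (not the primers used in this project), the tag is modeled as one shared ~10-base synthetic sequence, and the BsaI sites on forward primer 1 and reverse primer 2 are left out.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


# Hypothetical placeholder sequences, written 5' to 3'.
JUNCTION_TAG = "GATTACAGCT"            # ~10-base synthetic tag shared by both fragments
REV1_BINDING = "ACGTTGCAAGCTTGGCATCC"  # part of reverse primer 1 that anneals in the low-repeat region
FWD2_BINDING = "TTGACCGGTAAGCTCGATGC"  # part of forward primer 2 that anneals just downstream

# Forward primer 2 reads the top strand, so it carries the tag as written.
forward_2 = JUNCTION_TAG + FWD2_BINDING

# Reverse primer 1 reads the bottom strand, so its 5' tail is the tag's reverse complement.
reverse_1 = reverse_complement(JUNCTION_TAG) + REV1_BINDING

print("forward primer 2:", forward_2)
print("reverse primer 1:", reverse_1)
```

With tails built this way, the 3’ end of fragment 1 and the 5’ end of fragment 2 carry the same tag sequence on the top strand, which is the homology that Gibson assembly needs at the junction.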
Using Arctopsyche grandis silk gland samples acquired from Frandsen Lab, we extracted caddisfly DNA with the Zymo Tissue/Insect microprep kit and used it as the PCR template. We tested a range of annealing temperatures and extended each cycle by an additional 30 seconds so that the polymerase would have enough time to replicate 6 or 12 kb. However, at every annealing temperature the bands were mostly about 1 kb, far from the expected 12 kb and 6 kb H-fibroin fragments (Fig. 4). After meeting with members of Frandsen Lab, we concluded that this was likely due to nonspecific binding of the primers to other parts of the caddisfly genome.
A PCR-based approach is therefore not appropriate for our purposes, as it would be difficult to avoid nonspecific primer binding to other parts of the caddisfly genome. Thus, we turned to conceptualization and literature review to find another method for synthesizing the H-fibroin.
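One way to anticipate this kind of nonspecific binding is to count how often a primer's 3’ end occurs in the template before ordering it. The Python sketch below is only an illustration: the template, repeat unit, and primers are invented toy sequences, and exact matching of a 12-base 3’ seed is an assumed simplification of real primer annealing.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def count_binding_sites(template: str, primer: str, seed_len: int = 12) -> int:
    """Count exact matches of the primer's 3' seed on both strands of the template."""
    seed = primer[-seed_len:].upper()
    total = 0
    for strand in (template.upper(), reverse_complement(template)):
        start = strand.find(seed)
        while start != -1:
            total += 1
            start = strand.find(seed, start + 1)  # step by one to allow overlapping hits
    return total


# Toy template: a unique flank followed by five copies of a hypothetical repeat unit.
UNIQUE_FLANK = "CATTGACTTAAG"
REPEAT_UNIT = "GGTCCAGGTAGC"
template = UNIQUE_FLANK + REPEAT_UNIT * 5

repeat_primer = "AAAAAAAA" + REPEAT_UNIT   # 3' end sits inside the repeat
flank_primer = "AAAAAAAA" + UNIQUE_FLANK   # 3' end sits in the unique flank

print(count_binding_sites(template, repeat_primer))
print(count_binding_sites(template, flank_primer))
```

A primer whose seed lands in the repeat matches once per repeat copy, while the flank primer matches only once; a real screen would run against the whole genome assembly and tolerate mismatches.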
Spider silk is similar to caddisfly silk in that the spider silk fibroin gene is also highly modular and has been researched for biomaterial use. There have been multiple successful iterations of synthetic spider silk expression in E. coli (1, 3, 5, 6, 7), and thus we sought to replicate one such protocol for caddisfly silk. The paper "A Protocol for the Production of Recombinant Spider Silk-like Proteins for Artificial Fiber Spinning" (7) uses an iterative cycle of restriction digest, gel extraction, ligation, transformation, and miniprep to stitch together motifs and form long, highly repetitive sequences.
We opted to synthesize the Atopsyche davidsoni H-fibroin instead of the Arctopsyche grandis H-fibroin because of its more uniformly modular structure (Fig. 1). This synthesis-based method adds restriction sites to the 5’ and 3’ ends of selected motifs of the caddisfly gene. On the 5’ end, XhoI, NdeI, and XmaI restriction sites are added adjacent to each other in that order. On the 3’ end, BspEI and BamHI sites are added adjacent to each other in that order. We added these flanking sites to the N-terminus sequence, repeating motif, and C-terminus sequence of Atopsyche davidsoni and codon-optimized the sequences to be compatible with IDT synthesis parameters. By digesting these fragments with XhoI and BamHI, we can insert each fragment into a pBluescript II SK(+) backbone containing a ScaI cut site. Then, by digesting one plasmid with XmaI and ScaI and another with BspEI and ScaI, we can place the BspEI-treated fragment directly upstream of the XmaI-treated fragment (Fig. 5). Deviating from the paper, we also added BsaI sites flanking the N-terminus, the chosen number of motifs, and the C-terminus coding sequence in preparation for later Golden Gate assembly when building the yeast plasmid. Based on the Florence Teulé et al. protocol, we should be able to sequentially combine fragments until we reach 81 motifs (Fig. 1), producing a full H-fibroin coding sequence with a single serine scar. We are currently attempting this synthesis-based method.
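The head-to-tail joining step works because XmaI (C^CCGGG) and BspEI (T^CCGGA) leave the same 5’-CCGG overhang, and the ligated junction (TCCGGG) is recognized by neither enzyme. The Python sketch below models only the inserts; the motif is a hypothetical placeholder, and the plasmid backbone, the ScaI cut, and the outer XhoI/NdeI/BamHI sites are omitted for brevity.

```python
XMAI = "CCCGGG"   # XmaI cuts C^CCGGG, leaving a 5'-CCGG overhang
BSPEI = "TCCGGA"  # BspEI cuts T^CCGGA, leaving the same 5'-CCGG overhang


def make_module(core: str) -> str:
    """Flank a coding core with a 5' XmaI site and a 3' BspEI site."""
    return XMAI + core + BSPEI


def join(upstream: str, downstream: str) -> str:
    """Ligate a BspEI-cut upstream module to an XmaI-cut downstream module."""
    if not upstream.endswith(BSPEI) or not downstream.startswith(XMAI):
        raise ValueError("modules must carry a 3' BspEI site and a 5' XmaI site")
    left = upstream[:-5]    # keep everything through the T just before the BspEI cut
    right = downstream[1:]  # drop the C before the XmaI cut; CCGGG... remains
    return left + right     # junction reads TCCGGG, cut by neither enzyme


MOTIF = "GGTCCAGGTAGC"  # hypothetical motif, not the A. davidsoni sequence

one_repeat = make_module(MOTIF)
two_repeats = join(one_repeat, one_repeat)
four_repeats = join(two_repeats, two_repeats)

print(two_repeats)
print(four_repeats.count(MOTIF))
```

Because each product keeps exactly one XmaI site at its 5’ end and one BspEI site at its 3’ end, it can serve as the input to the next round, which is what makes the doubling cycle repeatable.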
Before the Jamboree, we hope to complete an initial proof-of-concept pilot run of the synthesis method, with a final product that includes an N-terminus, 3 motifs, and a C-terminus. The N- and C-terminus fragments each contain one motif, yielding a total of three motifs after two rounds of ligation. For a full-length build, the N- and C-termini would theoretically be added last, after enough motifs have been combined.
So far, we have started the first phase of this proof-of-concept run: we have digested our IDT-synthesized motif and pBluescript II SK(+) with XhoI and BamHI and ligated the two with T4 ligase. We then transformed the ligated plasmid into DH10B cells, which were plated on LB plates with carbenicillin alongside a negative control (Fig. 6). As of 9/30/24, surviving colonies have been inoculated in liquid culture, miniprepped, and sent for sequencing.
The diagram below shows an example of the final product of our synthesis approach. The virtual assembly displays a 3x repeat plasmid with both the N- and C-termini added, as well as BsaI cut sites flanking both ends of the gene in preparation for future Golden Gate assembly to build the final yeast expression plasmid. Further, given more time and resources, we would be able to scale exponentially from 3 repeats to n repeats, doubling the number of repeats with every round of synthesis.
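To get a feel for how many rounds the doubling strategy would take, the ligation schedule can be sketched in Python. The assumptions here are ours for illustration: the build starts from a single-motif module, every intermediate plasmid is kept for later reuse, each ligation counts as one round, and the N- and C-termini are ignored. The function name `assembly_plan` is hypothetical.

```python
def assembly_plan(target: int) -> list[tuple[int, int]]:
    """List (upstream_repeats, downstream_repeats) ligations that reach `target` repeats."""
    if target < 1:
        raise ValueError("target must be at least 1 repeat")
    steps = []
    powers = [1]  # repeat counts available after each doubling round
    while powers[-1] * 2 <= target:
        steps.append((powers[-1], powers[-1]))  # join two copies of the same construct
        powers.append(powers[-1] * 2)
    built = powers[-1]
    # Top up with smaller intermediates, largest first (binary decomposition of target).
    for size in reversed(powers[:-1]):
        if built + size <= target:
            steps.append((built, size))
            built += size
    return steps


plan = assembly_plan(81)
for round_number, (left, right) in enumerate(plan, start=1):
    print(f"round {round_number}: {left} + {right} -> {left + right} repeats")
```

Under these assumptions, an 81-motif target takes six doubling rounds to reach 64 repeats, followed by two joining ligations (64 + 16, then 80 + 1), for eight rounds in total.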
Ultimately, we hope to introduce plasmids containing the H-fibroin and L-fibroin genes into S. cerevisiae (lab strain BY4741) for synthetic expression of caddisfly silk, or at least its components (Fig. 7). Because the main obstacles in synthesizing the caddisfly silk protein involve the H-fibroin, we did not attempt to construct the L-fibroin plasmid. Construction of these plasmids would involve Golden Gate Assembly with parts from the OYC (Open Yeast Collection) provided in the iGEM Distribution Kits (Table 1). This includes parts provided in previous years as well:
Sequence | Description | Linked Sequence | Part ID |
---|---|---|---|
AAAA | plasmid backbone (OYC-eforRed-dropout) | GCAA | BBa_J435281 |
GCAA | Linker | AAAA | IDT |
AAAA | 5’ homology arm (ScHR5'-HO) | AAGG | BBa_J435241 |
AAGG | URA3 selection marker | ATGA | BBa_J435253 |
ATGA | Left connector (uses BbsI sites for lvl2) | GGAG | BBa_J435232 |
GGAG | Adh1 promoter | AATG | BBa_J435200 |
AATG | CDS (L-fibroin or H-fibroin) | CGTT | synthesized |
CGTT | Terminator (TDH1- can mix and match promoters and terminators) | CGCT | BBa_J435230 |
CGCT | Right connector (uses BbsI sites for lvl2) | AGAC | BBa_J435270 |
AGAC | 3’ homology arm (ScHR3'-HO) | CGAA | BBa_J435242 |
CGAA | Origin of transfer | GCAA | BBa_J435286 |
GCAA | E. coli- Amp resistance | ACTA | BBa_J435256 |
ACTA | E. coli- origin | AAAA | BBa_J435289 |
Sequence | Description | Linked Sequence | Part ID |
---|---|---|---|
AAAA | plasmid backbone (OYC-eforRed-dropout) | GCAA | BBa_J435281 |
GCAA | Linker | AAAA | IDT |
AAAA | 5’ homology arm (ScHR5'-HO) | AAGG | BBa_J435241 |
AAGG | URA3 selection marker | ATGA | BBa_J435253 |
ATGA | Left connector (uses BbsI sites for lvl2) | GGAG | BBa_J435232 |
GGAG | Adh1 promoter | AATG | BBa_J435200 |
AATG | CDS (L-fibroin or H-fibroin) w/His-tag on C-terminus | CGTT | synthesized |
CGTT | Terminator (TDH1- can mix and match promoters and terminators) | CGCT | BBa_J435230 |
CGCT | Right connector (uses BbsI sites for lvl2) | AGAC | BBa_J435270 |
AGAC | 3’ homology arm (ScHR3'-HO) | CGAA | BBa_J435242 |
CGAA | Origin of transfer | GCAA | BBa_J435286 |
GCAA | E. coli- Amp resistance | ACTA | BBa_J435256 |
ACTA | E. coli- origin | AAAA | BBa_J435289 |
Two items of note are the linker sequence and the polyhistidine (His) tag. The linker sequence provided in the OYC collection did not match our needs. In particular, we needed a sequence with a 5’ GCAA and a 3’ AAAA overhang to enable smooth Golden Gate assembly. Therefore, we manually added the above overhangs via IDT synthesis. Further, we added His-tags on the N-terminus of L-fib. The tag enables purification of L-fib, which is much smaller (~30 kDa) than the H-fib subunit and would otherwise elute with other native cellular proteins.
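The overhang bookkeeping in Table 1 can be checked with a short Python sketch. It assumes that the "Sequence" column is each part's 5’ overhang and the "Linked Sequence" column is its 3’ overhang, and that the parts join in the order listed, with the last part closing back onto the first. It only tests adjacency in table order; it does not model BsaI digestion or check for reused overhangs.

```python
# (description, 5' overhang, 3' overhang), copied from Table 1 in listed order.
PARTS = [
    ("plasmid backbone (OYC-eforRed-dropout)", "AAAA", "GCAA"),
    ("Linker", "GCAA", "AAAA"),
    ("5' homology arm (ScHR5'-HO)", "AAAA", "AAGG"),
    ("URA3 selection marker", "AAGG", "ATGA"),
    ("Left connector", "ATGA", "GGAG"),
    ("Adh1 promoter", "GGAG", "AATG"),
    ("CDS (L-fibroin or H-fibroin)", "AATG", "CGTT"),
    ("Terminator (TDH1)", "CGTT", "CGCT"),
    ("Right connector", "CGCT", "AGAC"),
    ("3' homology arm (ScHR3'-HO)", "AGAC", "CGAA"),
    ("Origin of transfer", "CGAA", "GCAA"),
    ("E. coli Amp resistance", "GCAA", "ACTA"),
    ("E. coli origin", "ACTA", "AAAA"),
]


def chain_breaks(parts, circular=True):
    """Return (part, next_part) name pairs whose adjoining overhangs differ."""
    breaks = []
    last = len(parts) if circular else len(parts) - 1
    for i in range(last):
        name, _, three_prime = parts[i]
        next_name, five_prime, _ = parts[(i + 1) % len(parts)]
        if three_prime != five_prime:
            breaks.append((name, next_name))
    return breaks


print(chain_breaks(PARTS))  # an empty list means every listed junction matches
```

Swapping in a linker with different overhangs, or reordering two parts, makes the mismatched junctions appear in the returned list, which is a quick sanity check before ordering a synthesized part.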
Should we successfully transform the plasmid into BY4741 (confirmed through miniprep and sequencing), we would then proceed to protein purification through HPLC-SEC and His-tag purification. Further research and literature review would then be needed to determine next steps for constructing the heterodimer protein, as well as the post-translational modifications and/or environmental conditions required for functional protein.