Engineering Success

Design

Caddisflies use their silk to collect aquatic debris and attach it to themselves for metamorphosis. The silk is a promising material because of its high tensile strength, waterproof properties, and non-cytotoxicity (4). The caddisfly silk protein is a heterodimer of heavy and light fibroins (abbreviated H-fibroin and L-fibroin, respectively). Caddisfly silk gets its adhesive properties from the H-fibroin, which has repeated β-sheet motifs with highly conserved proline-glycine turns (4). The main difficulties in this project lie in the H-fibroin gene, whose DNA sequence is not only highly repetitive but also very long, at 19 kb (Fig. 1). These qualities make the H-fibroin difficult to synthesize and isolate.

Caddisflies can be categorized by how they use their silk: (1) cocoon makers spin cocoons in which to pupate, (2) tube makers build cases from surrounding aquatic debris, and (3) retreat makers create stationary shelters and catch prey (4). Intuitively, retreat makers should have the strongest silk, because it must resist water currents and remain stationary, and cocoon makers the weakest. Initial experimentation relied on Arctopsyche grandis (a retreat maker) so that we could synthesize the strongest possible silk and maximize successful recovery with a caddisfly and silkworm-based mesh. However, this year’s dry lab subteam modeled H-fibroin subunits from multiple caddisfly species and found that not every subunit was predicted to yield functional protein. Thus, our focus shifted from silk strength to producing a functional protein.

Figure 1. “Schematic visualization of the primary structure of the h-fibroin gene of one representative species per clade” from Heckenhauer et al. (DOI: https://doi.org/10.1016/j.isci.2023.107253). We used Arctopsyche grandis initially, but switched to Atopsyche davidsoni for our later approach to synthesizing the H-fibroin protein.

Last year, the Hopkins iGEM team sought to produce caddisfly silk protein by synthesizing an altered H-fibroin protein with only a few motifs. Unfortunately, this year’s dry lab subteam found that models of this altered H-fibroin yielded nonfunctional protein. Thus, the goal of the 2024 iGEM team was redirected to synthesizing the full H-fibroin protein.

To begin, we decided to switch our expression host from Escherichia coli to Saccharomyces cerevisiae because a eukaryote is more likely to express the eukaryotic caddisfly silk protein. Additionally, S. cerevisiae can express larger proteins such as the H-fibroin while retaining the low cost, fast growth, robustness, and genetic tools of a model organism (2).

Our initial design relied primarily on PCR to isolate the H-fibroin gene. Based on preliminary research from the Frandsen Lab, the H-fib subunit gene of Arctopsyche grandis contains a smaller, less repetitive region in the 10-12 kb section of the coding sequence (Fig. 2). Therefore, we could theoretically design two sets of primers that each amplify about half of the H-fib gene. The first set would contain a forward primer that binds to the N-terminus sequence and a reverse primer that binds to the low-repeat region. The second set would contain a forward primer that binds adjacent to the reverse primer of the first set and a reverse primer that binds to the C-terminus sequence. Adding about 10 bases of synthetic complementary sequence to the 5’ end of the reverse primers of set 1 and the 5’ end of the forward primers of set 2 would enable us to perform Gibson assembly with the two PCR-amplified fragments. Unfortunately, we found that even this small low-repeat region has some sequence homology scattered throughout the gene. Nevertheless, we expected to be able to purify the correct fragments by gel extraction based on the length of each PCR fragment. With this design, we expected to obtain one fragment of approximately 6 kb and another of about 12 kb.
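The overlap design described above can be sketched in Python. This is a minimal illustration, not the actual primer design: the toy template, the junction position, the helper names, and the 20-base binding and 10-base tail lengths are all assumptions. The tails here are copied from the neighboring fragment, so the two amplicons end up sharing a 20-base overlap around the junction.

```python
# Hypothetical helper names; all sequences are written 5'->3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def junction_primers(template, junction, bind_len=20, tail_len=10):
    """Design reverse 1 and forward 2 around a junction index.

    Each primer binds bind_len bases on its own side of the junction and
    carries a 5' tail of tail_len bases copied from the other side.
    """
    if tail_len > bind_len:
        raise ValueError("tail_len must not exceed bind_len")
    if junction - bind_len < 0 or junction + bind_len > len(template):
        raise ValueError("junction is too close to a template end")
    # Reverse 1: binds upstream of the junction, tail reaches downstream.
    reverse_1 = reverse_complement(
        template[junction - bind_len:junction + tail_len]
    )
    # Forward 2: binds downstream of the junction, tail reaches upstream.
    forward_2 = template[junction - tail_len:junction + bind_len]
    return reverse_1, forward_2


# Toy 60-base template, made up for illustration (not caddisfly sequence).
toy_template = "ATGGCTAGCTAGGATCCGATCGATTACGCTAGCTAGGCTAACGGTACCGATCGGATTAGC"
reverse_1, forward_2 = junction_primers(toy_template, junction=30)

# The top-strand end of fragment 1 and the start of fragment 2 share this overlap.
fragment_1_end = reverse_complement(reverse_1)
shared = fragment_1_end[-20:]
print("reverse 1:", reverse_1)
print("forward 2:", forward_2)
print("shared overlap:", shared)
```

In a highly repetitive gene, the binding regions chosen this way would still need to be checked for other matching sites before use.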

Build

We designed two sets of primers to yield two fragments of the H-fibroin gene, with each set having two forward and two reverse primers. Overhangs were added (in orange, see Fig. 2) to reverse 1 and forward 2 so that the two H-fibroin fragments could later be joined by Gibson Assembly. In anticipation of future Golden Gate steps, BsaI cut sites were added to forward primer 1 and reverse primer 2. Theoretically, this method would allow us to produce a scarless H-fib gene.

Figure 2. Example Set of Primers for Isolating and Amplifying H-Fibroin with PCR. Two sets of primers were created based on a less repetitive region at the 10-12 kb mark of the coding sequence. Each set of primers consists of a forward 1, reverse 1, forward 2, and reverse 2 primer, with some overlap between the reverse 1 and forward 2 primers to allow for subsequent Gibson Assembly.

Test

Using Arctopsyche grandis silk gland samples acquired from the Frandsen Lab, we extracted caddisfly DNA with the Zymo Tissue/Insect microprep kit and used it as the PCR template. We tested a range of annealing temperatures and ran the reaction for an additional 30 seconds per cycle so that the polymerase would have enough time to replicate the full 6 or 12 kb. However, every annealing temperature instead produced bands of mostly about 1 kb, far from the expected 12 kb and 6 kb H-fibroin fragments (Fig. 4). After meeting with members of the Frandsen Lab, we concluded that this was likely due to nonspecific binding of the primers to other parts of the caddisfly genome.

Figure 4. Attempted PCR of H-Fibroin Using Two Different Sets of Primers. Lanes are labeled as follows: (1) 1 kb ladder; (4) first half, primer set 1 (forward 1, reverse 1); (5) second half, primer set 1 (forward 2, reverse 2); (6) first half, primer set 2 (forward 1, reverse 1); (7) second half, primer set 2 (forward 2, reverse 2); (9) template only; (10) primers only; (11) enzyme mixture; (12) reagents only.

Learn

It is clear that a PCR-based approach is not appropriate for our purposes as it would be difficult to avoid nonspecific primer binding to other parts of the caddisfly genome. Thus, we turned to conceptualization and literature review to find another method for synthesizing the H-fibroin.
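One way to anticipate this problem before ordering primers is to count how many places a primer could anneal in the template. The sketch below is a rough illustration under stated assumptions: it looks only for exact matches to the last 12 bases of the primer (the `seed_len` default is an arbitrary choice), it uses a made-up repetitive template rather than caddisfly sequence, and it ignores annealing with mismatches, which real PCR can tolerate.

```python
# Hypothetical helper names; sequences are written 5'->3'.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def find_binding_sites(template, primer, seed_len=12):
    """List (position, strand) for exact matches of the primer's 3' seed.

    "+" means the seed itself occurs in the given strand; "-" means its
    reverse complement occurs there.
    """
    template = template.upper()
    seed = primer.upper()[-seed_len:]
    hits = []
    for strand, target in (("+", seed), ("-", reverse_complement(seed))):
        start = template.find(target)
        while start != -1:
            hits.append((start, strand))
            # Step by one base so overlapping matches are also counted.
            start = template.find(target, start + 1)
    return sorted(hits)


# Toy template containing the same 12-base repeat twice (made up).
repeat = "GGTCCAGGAGGT"
toy_template = "AAAA" + repeat + "CACA" + repeat + "TTTT"
toy_primer = "ACGT" + repeat

sites = find_binding_sites(toy_template, toy_primer)
print(len(sites), "candidate binding sites:", sites)
```

A primer with more than one candidate site in the target region is a warning sign for the kind of off-target product seen on the gel; a dedicated primer-specificity tool would give a more realistic answer than this exact-match count.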

Spider silk is similar to caddisfly silk in that its fibroin gene is also highly modular and has been researched for biomaterial use. Synthetic spider silk has been expressed successfully in E. coli multiple times (1, 3, 5, 6, 7), so we sought to adapt one such protocol for caddisfly silk. The paper “A Protocol for the Production of Recombinant Spider Silk-like Proteins for Artificial Fiber Spinning” (7) uses a repeated cycle of restriction digest, gel extraction, ligation, transformation, and miniprep to stitch together motifs and form long, highly repetitive sequences.

For this synthesis-based method, we opted to synthesize the Atopsyche davidsoni H-fibroin instead of the Arctopsyche grandis H-fibroin because of its more unimodular structure (Fig. 1). The method adds appropriate restriction sites to the 5’ and 3’ ends of specific motifs of the caddisfly gene. On the 5’ end, XhoI, NdeI, and XmaI restriction sites are added adjacent to each other, in that order. On the 3’ end, BspEI and BamHI sites are added adjacent to each other, in that order. We added these flanking sites to the N-terminus sequence, repeating motif, and C-terminus sequence of Atopsyche davidsoni and codon optimized these sequences to be compatible with IDT synthesis parameters. By digesting these fragments with XhoI and BamHI, we can integrate each fragment into a pBluescript II SK(+) backbone containing a ScaI cut site. Then, by digesting one plasmid with XmaI and ScaI and another with BspEI and ScaI, we can join the BspEI-treated fragment in front of the XmaI-treated fragment (Fig. 5). Deviating from the paper, we also added BsaI sites to flank the coding sequence (N-terminus, x motifs, and C-terminus) in preparation for later Golden Gate assembly when building the yeast plasmid. Based on the Teulé et al. protocol (7), we should be able to sequentially combine fragments until we reach 81 motifs (Fig. 1), producing a full H-fibroin protein coding sequence with a single serine scar. We are currently attempting this synthesis-based method:

Figure 5. “Strategy to build large synthetic spider silk-like tandem repeat sequences from small double-stranded monomer DNAs flanked by compatible but non regenerable restriction sites” from Teulé et al. (7) (DOI: https://doi.org/10.1038/nprot.2008.250). Their description: (a) The engineered silk-like module with appropriate flanking restriction sites is cloned in the plasmid vector. (b) The recombinant plasmid is subjected to two separate restriction digestions and, in both cases, fragments containing the insert are isolated and ligated to each other. (c) The resulting plasmid contains an insert that was doubled in size and has a nonfunctional internal XmaI/BspEI hybrid site. The black stars (★) indicate the restriction digestion of DNA and N× means that the strategy can be repeated as many times as needed.
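The head-to-tail joining in Fig. 5 can be mimicked at the sequence level. The Python sketch below is a simplified model under several assumptions: it tracks only the top strand, it uses a made-up 9-base module in place of a real motif, it omits the XhoI, NdeI, BamHI, and ScaI sites and the vector backbone, and its function names are hypothetical. It relies on XmaI (C^CCGGG) and BspEI (T^CCGGA) both leaving CCGG overhangs, so that ligation creates a TCCGGG junction that neither enzyme recognizes.

```python
XMAI = "CCCGGG"    # XmaI site, cut as C^CCGGG
BSPEI = "TCCGGA"   # BspEI site, cut as T^CCGGA
HYBRID = "TCCGGG"  # junction left after ligation, cut by neither enzyme


def make_cassette(module):
    """Flank a module with a 5' XmaI site and a 3' BspEI site."""
    return XMAI + module + BSPEI


def join_head_to_tail(upstream, downstream):
    """Join a BspEI-cut upstream cassette to an XmaI-cut downstream cassette."""
    if not upstream.endswith(BSPEI):
        raise ValueError("upstream cassette must end with a BspEI site")
    if not downstream.startswith(XMAI):
        raise ValueError("downstream cassette must start with an XmaI site")
    # BspEI leaves ...T on the upstream top strand; XmaI leaves CCGGG...
    # on the downstream top strand. Together they read T + CCGGG.
    return upstream[:-5] + downstream[1:]


def double(cassette, rounds):
    """Join the cassette to a copy of itself for the given number of rounds."""
    for _ in range(rounds):
        cassette = join_head_to_tail(cassette, cassette)
    return cassette


toy_module = "GCAGCTGCA"  # made-up stand-in for one silk motif
product = double(make_cassette(toy_module), rounds=2)
print(product)
print("modules:", product.count(toy_module))
print("hybrid junctions:", product.count(HYBRID))
```

Because the outer XmaI and BspEI sites survive each round while the internal junctions do not, the same two digests can be applied again to the product.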

Before Jamboree

Before Jamboree, we hope to complete an initial proof-of-concept pilot run of the synthesis method, with a final product that includes an N-terminus, 3 motifs, and a C-terminus. The N- and C-terminus fragments each contain one motif, yielding a total of three motifs after two rounds of ligation. In principle, the N- and C-termini would be added last, after enough motifs have been added.

So far, we have started the first phase of this proof-of-concept run: we have digested our IDT-synthesized motif and pBluescript II SK(+) with XhoI and BamHI and ligated the two with T4 ligase. We then transformed the ligated plasmid into E. coli DH10B, which was plated on LB plates with carbenicillin alongside a negative control (Fig. 6). As of 9/30/24, surviving colonies have been inoculated in liquid culture, miniprepped, and sent for sequencing.

Figure 6. Transformation Results for Proof-of-Concept Synthesis Experiment. Left: negative control; air bubbles formed under the agar because the pre-warming step was skipped, and no opaque colonies were ultimately observed on top of the agar. Right: experimental plate; air bubbles formed under the agar because the pre-warming step was skipped, and 10 opaque colonies were observed (in particular, an opaque colony near the middle of the plate).

Future Direction

The diagram below shows an example of the final product of our synthesis approach. The virtual assembly displays a 3x repeat plasmid with both the N- and C-termini added, as well as BsaI cut sites flanking both ends of the gene in preparation for future Golden Gate assembly to build the final yeast expression plasmid. Further, given more time and resources, we could scale exponentially from 3 repeats to n repeats, doubling the number of repeats with every round of synthesis we perform.

assembled-pbl
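The doubling arithmetic can be sketched in Python. This is a back-of-the-envelope plan, assuming that any two intermediate plasmids can be joined in one round and that intermediates from earlier doubling rounds are kept; it is one simple scheme and not necessarily the fewest possible rounds. The function name is hypothetical, and the 81-motif target is the count cited for the full gene.

```python
def assembly_plan(target):
    """Plan how to reach `target` repeats from a single-repeat plasmid.

    Returns (doublings, pieces, total_rounds), where `pieces` are the
    power-of-two intermediates that are joined to reach the target.
    """
    if target < 1:
        raise ValueError("target must be at least 1")
    # Doubling rounds needed to build the largest power of two <= target.
    doublings = target.bit_length() - 1
    # Binary decomposition: which doubled intermediates must be combined.
    pieces = [1 << i for i in range(target.bit_length()) if (target >> i) & 1]
    pieces.sort(reverse=True)
    # Each extra piece costs one more join round.
    total_rounds = doublings + len(pieces) - 1
    return doublings, pieces, total_rounds


for n in (3, 64, 81):
    doublings, pieces, total = assembly_plan(n)
    print(f"{n} repeats: {doublings} doublings, join {pieces}, {total} rounds")
```

Under these assumptions, 81 repeats would take 6 doubling rounds to reach 64, then two more joins with the 16-repeat and 1-repeat intermediates, for 8 rounds in total.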

Ultimately, we hope to introduce plasmids containing the H-fibroin and L-fibroin genes into S. cerevisiae (lab strain BY4741) to express caddisfly silk (or at least its components) synthetically (Fig. 7). Because the main obstacles in synthesizing the caddisfly silk protein involve the H-fibroin, we have not yet attempted to construct the L-fibroin plasmid. Construction of these plasmids would involve Golden Gate Assembly with parts from the Open Yeast Collection (OYC) provided in the iGEM Distribution Kits (Table 1), including parts provided in previous years:

Figure 7. H-Fib and L-Fib Final Plasmids to be Transformed into Yeast after Golden Gate Construction.

Table 1a. H-Fibroin Yeast Plasmid Construction with Distribution Kit Parts

| Sequence | Description | Linked Sequence | Part ID |
| --- | --- | --- | --- |
| AAAA | plasmid backbone (OYC-eforRed-dropout) | GCAA | BBa_J435281 |
| GCAA | Linker | AAAA | IDT |
| AAAA | 5’ homology arm (ScHR5'-HO) | AAGG | BBa_J435241 |
| AAGG | URA3 selection marker | ATGA | BBa_J435253 |
| ATGA | Left connector (uses BbsI sites for lvl2) | GGAG | BBa_J435232 |
| GGAG | Adh1 promoter | AATG | BBa_J435200 |
| AATG | CDS (L-fibroin or H-fibroin) | CGTT | synthesized |
| CGTT | Terminator (TDH1- can mix and match promoters and terminators) | CGCT | BBa_J435230 |
| CGCT | Right connector (uses BbsI sites for lvl2) | AGAC | BBa_J435270 |
| AGAC | 3’ homology arm (ScHR3'-HO) | CGAA | BBa_J435242 |
| CGAA | Origin of transfer | GCAA | BBa_J435286 |
| GCAA | E. coli- Amp resistance | ACTA | BBa_J435256 |
| ACTA | E. coli- origin | AAAA | BBa_J435289 |

Table 1b. L-Fibroin Yeast Plasmid Construction with Distribution Kit Parts

| Sequence | Description | Linked Sequence | Part ID |
| --- | --- | --- | --- |
| AAAA | plasmid backbone (OYC-eforRed-dropout) | GCAA | BBa_J435281 |
| GCAA | Linker | AAAA | IDT |
| AAAA | 5’ homology arm (ScHR5'-HO) | AAGG | BBa_J435241 |
| AAGG | URA3 selection marker | ATGA | BBa_J435253 |
| ATGA | Left connector (uses BbsI sites for lvl2) | GGAG | BBa_J435232 |
| GGAG | Adh1 promoter | AATG | BBa_J435200 |
| AATG | CDS (L-fibroin or H-fibroin) w/His-tag on C-terminus | CGTT | synthesized |
| CGTT | Terminator (TDH1- can mix and match promoters and terminators) | CGCT | BBa_J435230 |
| CGCT | Right connector (uses BbsI sites for lvl2) | AGAC | BBa_J435270 |
| AGAC | 3’ homology arm (ScHR3'-HO) | CGAA | BBa_J435242 |
| CGAA | Origin of transfer | GCAA | BBa_J435286 |
| GCAA | E. coli- Amp resistance | ACTA | BBa_J435256 |
| ACTA | E. coli- origin | AAAA | BBa_J435289 |

Two items of note are the linker sequence and the polyhistidine (His) tag. The linker sequence provided in the OYC did not match our needs: we needed a sequence with a 5’ GCAA and a 3’ AAAA overhang to enable smooth Golden Gate assembly. We therefore added these overhangs manually via IDT synthesis. Further, we added His-tags on the N-terminus of L-fib. This allows L-fib to be purified, as the L-fib subunit is much smaller (~30 kDa) than the H-fib subunit and would otherwise elute with other native cellular proteins.
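The requirement that each part's overhang match its neighbor can be checked mechanically. The Python sketch below is an illustration under stated assumptions: it reads the "Sequence" column of Table 1a as each part's left overhang and the "Linked Sequence" column as its right overhang, treats the rows as an ordered circular assembly, and uses a hypothetical function name. It checks only that adjacent overhangs agree, not whether each overhang is unique within the reaction.

```python
# (left overhang, part description, right overhang), in Table 1a row order.
PARTS = [
    ("AAAA", "plasmid backbone", "GCAA"),
    ("GCAA", "linker", "AAAA"),
    ("AAAA", "5' homology arm", "AAGG"),
    ("AAGG", "URA3 selection marker", "ATGA"),
    ("ATGA", "left connector", "GGAG"),
    ("GGAG", "Adh1 promoter", "AATG"),
    ("AATG", "CDS", "CGTT"),
    ("CGTT", "terminator", "CGCT"),
    ("CGCT", "right connector", "AGAC"),
    ("AGAC", "3' homology arm", "CGAA"),
    ("CGAA", "origin of transfer", "GCAA"),
    ("GCAA", "E. coli Amp resistance", "ACTA"),
    ("ACTA", "E. coli origin", "AAAA"),
]


def chain_breaks(parts, circular=True):
    """Return (part, right overhang, next part, next left overhang) mismatches."""
    breaks = []
    for i, (_, name, right) in enumerate(parts):
        j = i + 1
        if j == len(parts):
            if not circular:
                break
            j = 0  # wrap around so the last part is compared with the first
        next_left, next_name, _ = parts[j]
        if right != next_left:
            breaks.append((name, right, next_name, next_left))
    return breaks


problems = chain_breaks(PARTS)
print("mismatched junctions:", problems)
```

Swapping in a linker with different overhangs and rerunning the check shows which junctions would fail to close.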

Should we successfully transform the plasmid into BY4741 (confirmed through miniprep and sequencing), we would then proceed to protein purification through HPLC-SEC and His-tag purification. Later research and literature review would then determine the next steps for constructing the heterodimer protein, including any post-translational modifications and/or environmental conditions required for functional protein.

  1. Bhattacharyya, G., Oliveira, P., Krishnaji, S. T., Chen, D., Hinman, M., Bell, B., Harris, T. I., Ghazitabatabaei, A., Lewis, R. V., & Jones, J. A. (2021). Large scale production of synthetic spider silk proteins in Escherichia coli. Protein Expression and Purification, 183, 105839. https://doi.org/10.1016/j.pep.2021.105839
  2. Demain, A. L., & Vaishnav, P. (2009). Production of recombinant proteins by microbes and higher organisms. Biotechnology Advances, 27(3), 297–306. https://doi.org/10.1016/j.biotechadv.2009.01.008
  3. Fukushima, Y. (1998). Genetically engineered syntheses of tandem repetitive polypeptides consisting of glycine-rich sequence of spider dragline silk. Biopolymers, 45(4), 269–279. https://doi.org/10.1002/(sici)1097-0282(19980405)45:4
  4. Heckenhauer, J., Stewart, R. J., Ríos-Touma, B., Powell, A., Dorji, T., Frandsen, P. B., & Pauls, S. U. (2023). Characterization of the primary structure of the major silk gene, h-fibroin, across caddisfly (Trichoptera) suborders. iScience, 26(8), 107253. https://doi.org/10.1016/j.isci.2023.107253
  5. Lewis, R. V., Hinman, M., Kothakota, S., & Fournier, M. J. (1996). Expression and purification of a spider silk protein: a new strategy for producing repetitive proteins. Protein Expression and Purification, 7(4), 400–406. https://doi.org/10.1006/prep.1996.0060
  6. Prince, J. T., McGrath, K. P., DiGirolamo, C. M., & Kaplan, D. L. (1995). Construction, cloning, and expression of synthetic genes encoding Spider Dragline Silk. Biochemistry, 34(34), 10879–10885. https://doi.org/10.1021/bi00034a022
  7. Teulé, F., Cooper, A. R., Furin, W. A., Bittencourt, D., Rech, E. L., Brooks, A., & Lewis, R. V. (2009). A protocol for the production of recombinant spider silk-like proteins for artificial fiber spinning. Nature Protocols, 4(3), 341–355. https://doi.org/10.1038/nprot.2008.250