Engineering

Introduction

Type I pili are filamentous structures on the cell surface of Escherichia coli and are encoded by the fim operon [1]. To prevent interference from these native pili, previous studies that used E. coli as a chassis for the heterologous expression of type IV e-pili first created a strain lacking the fimA gene (EcoCyc ID b0530), which encodes the major structural subunit of type I pili, to serve as the host for e-pili expression [2]. As our project involved the expression of a heterologous pilin monomer in E. coli, our first aim was to create a strain that does not express type I pili to serve as the chassis in later steps of our project. In the second phase of our project, we aimed to express type IV e-pili modified with a collagen-binding peptide in the pili-deficient strain and to test the collagen-binding efficiency of these fusion proteins. Numerous experimental procedures can determine the binding affinity of a substrate to a protein, but most of them are costly or require specialised equipment (e.g., ELISA, radioligand binding). For this reason, we decided to use a live-cell adhesion assay, which indicates whether our recombinant cells adhere to collagen. We adapted a protocol reported in a previous study [3] by replacing crystal violet staining, which visualises the cells through absorbance measurement, with a fluorescence-based quantification method: instead of staining the cells, we equipped them with a plasmid that expresses red fluorescent protein (RFP). This change was intended to make the assay simpler and more sensitive.

Iteration 1 - Choice of Chassis Strain

Design:

Initially, we chose E. coli NEB10b ΔfimA as our final chassis strain because it was previously used by Ueki et al., 2020 [2] for the heterologous expression of type IV e-pili. To build this strain, unlike Ueki et al., 2020, we decided to use the CRISPR-Cas12a system designed by Jervis et al., 2021 [4] to delete the fimA gene of E. coli NEB10b. Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR) and their associated proteins, such as Cas9 and Cas12a (Cpf1), form genome editing systems that are widely used for their efficiency and ease of use [5]. In these systems, a guide RNA (gRNA) forms a complex with a Cas protein and directs it to the edit site, where the Cas protein produces a double-stranded break (DSB) in the DNA. The system then relies on the cell’s homology-directed repair (HDR) mechanism to repair the DSB using the provided donor DNA (dDNA) [6]. The CRISPR-Cas12a system of Jervis et al., 2021 [4] employs two plasmids. The first, pSIMcpf1 (Addgene ID: 153034), carries the Cas12a nuclease and the Lambda Red recombination genes. The second (referred to as pTF) expresses the gRNA and carries the repair template, composed of two 50-bp homology arms. Because the system is modular, we only needed to introduce our target gene spacer and dDNA sequence into the pTF plasmid. Following the protocol of [4], we designed our NEB10b fimA spacer and homology arms. First, the target gene was screened for Cas12a protospacer adjacent motifs (PAMs) with the sequence TTTV (where V is A, C, or G). We selected a PAM located in the middle of the gene and used the 23-bp sequence immediately following it as the fimA spacer. Our custom dDNA was composed of the left and right homology arms of the fimA gene; the left homology arm was designed to be the 50 bp upstream of the gene’s 5’ end, including the start codon of fimA, while the right homology arm was designed to contain 50 bp of the gene’s 3’ end, including the last 9 codons.
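The PAM screening step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are ours, the sequence is a made-up toy string rather than fimA, and only the given strand is scanned (a real design would also scan the reverse complement and check for off-target matches).

```python
import re

def find_cas12a_spacers(gene, spacer_len=23):
    """Return (pam_start, pam, spacer) for every TTTV PAM on the given strand."""
    gene = gene.upper()
    hits = []
    # A lookahead is used so that overlapping PAM candidates are all reported.
    for match in re.finditer(r"(?=(TTT[ACG]))", gene):
        start = match.start()
        spacer = gene[start + 4 : start + 4 + spacer_len]
        if len(spacer) == spacer_len:  # skip PAMs too close to the 3' end
            hits.append((start, match.group(1), spacer))
    return hits

def pick_central(hits, gene_len):
    """Pick the hit whose PAM lies closest to the middle of the gene."""
    return min(hits, key=lambda hit: abs(hit[0] - gene_len / 2))

# Hypothetical toy sequence (NOT the real fimA sequence).
toy_gene = "ATGGCCGGC" + "TTTC" + "ACGTACGTACGTACGTACGTACG" + "GGCGCCTAA"
hits = find_cas12a_spacers(toy_gene)
pam_start, pam, spacer = pick_central(hits, len(toy_gene))
print(pam_start, pam, spacer)
```

The spacer is taken from the bases directly after the PAM because Cas12a recognises a PAM located 5' of the protospacer.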

Figure 1. Plasmid maps of our pTF_fimA plasmid, the pSIMcpf1 plasmid, and the pTF plasmid. Diagrams of the pSIMcpf1 and pTF plasmids were sourced directly from [4]. Our custom fimA gene spacer, labelled “fimA spacer 1,” and dDNA sequence (homology arms), labelled “LHA” and “RHA,” were designed to be cloned into the pTF plasmid. We used the pSIMcpf1 plasmid without modification.

Build:

We were kindly provided with the original pSIMcpf1 plasmid and the pTF plasmid (also named pTargettyrRsingle) used in the original study. We could use the pSIMcpf1 plasmid directly, but the pTF plasmid needed to be tailored to our use. To do so, we designed primers (Figure 2B and 2C) to amplify the pTF plasmid by PCR in two parts containing our desired spacer and homology arms, and then joined the two fragments by HiFi DNA assembly into a single pTF_fimA plasmid (Figure 2A). In parallel, we transformed E. coli NEB10b cells with plasmid pSIMcpf1.

Figure 2. Oligonucleotide primers used in the customisation of the pTF plasmid for the deletion of the fimA gene. [A] Overall positioning of the primers on the final plasmid, pTF_fimA, [B] forward and reverse primers to introduce the fimA spacer and the left homology arm, [C] forward and reverse primers to introduce the right homology arm and amplify the plasmid backbone.

To check whether the assembly was successful, we Sanger-sequenced the resulting plasmid, which confirmed that we had obtained the desired pTF_fimA plasmid. This plasmid was then transformed into the E. coli NEB10b cells that already carried plasmid pSIMcpf1, initiating the CRISPR-Cas12a-mediated deletion of fimA in E. coli NEB10b.

Test:

To test whether the CRISPR-Cas12a-mediated deletion of the fimA gene was successful, we performed colony PCR on 10 transformants.

Figure 3. Primers for colony PCR of the NEB10b genome to confirm the deletion of the fimA gene. Created with Biorender.com
Figure 4. Agarose gel electrophoresis results of colony PCR DNA fragments (Day 6 in Gene Deletion Lab Book). Lane 1: Ladder, Lanes 2 to 11: colony PCR results for 10 colonies, and Lane 12: Ladder. If the fimA gene was successfully removed, a PCR fragment of 692 bp was expected, while an unsuccessful gene deletion would be indicated by a fragment length of 1205 bp. As seen, the colony PCR failed for colonies 2, 3, and 10 (lanes 3, 4, and 11), while the other 7 colony PCR reactions all yielded the wild-type fragment of 1205 bp, meaning that the fimA gene deletion was unsuccessful.

As seen in Figure 4, none of the screened colonies carried the fimA deletion. To diagnose the reason for this failure, we restreaked the cells transformed with both the pTF_fimA and pSIMcpf1 plasmids onto new plates. As the plasmids provide streptomycin and hygromycin resistance, respectively, we restreaked the cells onto LB agar plates containing both antibiotics (LB+Streptomycin+Hygromycin) as well as plates with only one of the antibiotics (LB+Streptomycin and LB+Hygromycin, respectively).

Learn:

Through our diagnostic plates, we discovered that E. coli NEB10b is intrinsically resistant to streptomycin, which explains why cells without plasmid pTF_fimA were able to grow on streptomycin-containing plates.

Figure 5. Diagnostic plates. Separate day cultures of cells transformed with both plasmids (labelled CRISPR A and CRISPR B), as well as of cells transformed with only plasmid pSIMcpf1 (labelled A and B), were restreaked onto [A] LB+Hygromycin, [B] LB+Streptomycin, and [C] LB+Streptomycin+Hygromycin.

All restreaks grew on plate [A] as expected, while CRISPR B unexpectedly did not grow on plate [C] even though it should have been resistant to both streptomycin and hygromycin. The most concerning result, however, was that cells transformed with only the pSIMcpf1 plasmid (A, B) grew on both [B] and [C], even though pSIMcpf1 only provides hygromycin resistance. This growth led us to conclude that NEB10b is intrinsically resistant to streptomycin.
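The reasoning behind the diagnostic plates can be written down as a small sketch: a strain is expected to grow only if every antibiotic on the plate is covered by a resistance it carries. The dictionary and function names below are illustrative assumptions, and the model ignores antibiotic concentration and plasmid loss.

```python
# Resistance provided by each plasmid, as described in the text.
PLASMID_RESISTANCE = {
    "pSIMcpf1": {"hygromycin"},
    "pTF_fimA": {"streptomycin"},
}

PLATES = {
    "A": {"hygromycin"},
    "B": {"streptomycin"},
    "C": {"streptomycin", "hygromycin"},
}

def predicts_growth(plasmids, plate_antibiotics, intrinsic=frozenset()):
    """True if every antibiotic on the plate is covered by some resistance."""
    resistances = set(intrinsic)
    for plasmid in plasmids:
        resistances |= PLASMID_RESISTANCE[plasmid]
    return set(plate_antibiotics) <= resistances

# Cells carrying only pSIMcpf1, with and without the assumed intrinsic resistance.
for plate, antibiotics in PLATES.items():
    naive = predicts_growth(["pSIMcpf1"], antibiotics)
    with_intrinsic = predicts_growth(["pSIMcpf1"], antibiotics, {"streptomycin"})
    print(f"plate {plate}: expected={naive}, with intrinsic SmR={with_intrinsic}")
```

Under the plasmid-only model, pSIMcpf1 cells should fail on plates [B] and [C]; only adding an intrinsic streptomycin resistance reproduces the growth we observed.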

Faced with this problem, we had two general options: we could either modify plasmid pTF_fimA by replacing its smR gene with an alternative resistance marker (e.g., ampR), or we could research alternative E. coli strains to determine whether they had the same fimA gene sequence as NEB10b. We chose the second option because our plasmids could be used directly in a strain with the same fimA sequence as NEB10b, saving the time needed to reconstruct plasmid pTF_fimA. We therefore began to investigate various lab strains, notably E. coli NEB5a, considering its close relationship to NEB10b and its comparable transformation efficiency.

Iteration 2 - Gene Deletion

Design:

To confirm whether plasmid pTF_fimA was directly usable in E. coli NEB5a, we searched the NEB5a genome for proteins similar to the product of the NEB10b fimA gene. We discovered that the NEB5a strain carries a gene labelled NEB5a_02280 (GenBank ID: AOO68850.1) that encodes the same fimbrial protein as the NEB10b fimA gene. The nucleotide sequence of this gene is identical in both strains, meaning that we could immediately begin the fimA deletion procedure in NEB5a cells with our plasmids.

Figure 6: Sequences of NEB10b fimA and its corresponding NEB5a counterpart.

Build:

We transformed the pSIMcpf1 and pTF_fimA plasmids sequentially into NEB5a following the same protocol we used for the NEB10b cells (see Gene Deletion Lab Book for details).

Test:

We also found that our colony PCR primers (Figure 3) bound the NEB5a genome in exactly the same way as they did the NEB10b genome, meaning that the colony PCR could be performed and its results interpreted in the same way as in our first iteration.

Figure 7. Primers for colony PCR of the NEB5a genome to confirm the deletion of the NEB5a_02280 gene. Created with Biorender.com
Figure 8. Colony PCR results of E. coli NEB5a following the protocol for chromosomal deletion of NEB5a_02280. Lane 1: 1 kb ladder, Lanes 2 to 11: colony PCR results for 10 colonies, Lane 12: negative control (resuspended cells were substituted with MQ H2O), Lane 13: positive control of E. coli NEB5a, Lanes 14 and 15: 1 kb ladder.

As in our previous colony PCR, a PCR product of 692 bp was expected if the gene was successfully removed, while an unsuccessful gene deletion would be indicated by a fragment length of 1205 bp. As shown in Figure 8, the gene deletion efficiency was 50% (5/10 colonies screened). For one colony (lane 4 in Figure 8), a mixed genotype could be observed.
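The interpretation rule for the colony PCR can be summarised as a short sketch. The 692 bp and 1205 bp sizes come from our primer design; the tolerance, the function names, and the example band sizes are assumptions made for illustration and were not read from our gels.

```python
DELETION_BP = 692    # expected amplicon if the gene was removed
WILD_TYPE_BP = 1205  # expected amplicon if the gene is still present

def classify_colony(band_sizes_bp, tolerance_bp=75):
    """Classify one colony PCR lane from its estimated band sizes (bp)."""
    has_deletion = any(abs(b - DELETION_BP) <= tolerance_bp for b in band_sizes_bp)
    has_wild_type = any(abs(b - WILD_TYPE_BP) <= tolerance_bp for b in band_sizes_bp)
    if has_deletion and has_wild_type:
        return "mixed"
    if has_deletion:
        return "deletion"
    if has_wild_type:
        return "wild type"
    return "no product"

def deletion_efficiency(lanes):
    """Fraction of screened colonies that show only the deletion band."""
    calls = [classify_colony(bands) for bands in lanes]
    return calls.count("deletion") / len(calls)

# Hypothetical band estimates for four lanes.
example_lanes = [[700], [1200], [690, 1210], []]
print([classify_colony(bands) for bands in example_lanes])
print(deletion_efficiency(example_lanes))
```

Counting mixed colonies separately from clean deletions is a design choice of this sketch; a mixed colony would normally be restreaked and screened again before use.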

Learn:

We successfully deleted NEB5a_02280 in the NEB5a genome, but genome analysis later showed that NEB10b does not possess the K12 MG1655 fim operon and therefore does not have the K12 MG1655 fimA sequence. Rather, the gene normally labelled sfmA in K12 MG1655 is labelled “fimA” in the NEB10b strain because of its high protein homology with fimA, meaning that the NEB5a_02280 gene we deleted was actually the sfmA gene of NEB5a rather than its genuine fimA gene. This led us to discover that E. coli K12 MG1655, and accordingly the NEB5a strain, possesses multiple chaperone-usher operons homologous to the fim operon. Of these homologous operons, the most highly expressed under standard laboratory conditions is the sfm operon [7], which we had inadvertently disrupted by removing sfmA. Interestingly, it has also been shown that removing any of these chaperone-usher operons benefits cell growth and can potentially increase the biosafety of the strain [8]. Therefore, instead of deleting the fimA gene of wild-type NEB5a cells, we decided to delete the fimA gene of our NEB5a ΔsfmA strain to create an E. coli NEB5a ΔsfmAΔfimA strain, which would let us capitalise on these benefits while still achieving our original goal of preventing native pili expression. To do this, we designed a plasmid with the actual NEB5a fimA spacer and homology arms. We named this new plasmid pTF_fimA and, for consistency, renamed the original pTF_fimA plasmid pTF_sfmA (versions of Figures 1 and 7 adjusted for this are shown on our Results page).

Figure 9. Plasmid map for the proposed pTF_fimA plasmid and a close-up diagram of its fragment cassette. Our bespoke 23-bp fimA gene spacer, indicated by “fimA spacer,” and dDNA sequence (homology arms), labelled “LHA” and “RHA,” are shown in both diagrams.

We ultimately did not have time to create and use the newly designed pTF_fimA plasmid (Figure 9) to delete the fimA gene in our NEB5a ΔsfmA strain. However, we consider that our NEB5a ΔsfmA strain was sufficient to serve as our final chassis, as the native fimbriae should only impact the yield of our desired heterologous type IV e-pili and not its expression (which we designed to be controlled by IPTG induction).

Iteration 3 - Collagen Binding Assay

Design:

After creating an E. coli ΔsfmA strain (initially thought to be ΔfimA) to maximise production of recombinant pili, our plan was to express e-pili with collagen-binding tags. The presence of fimbriae in our strain should not affect the results of the collagen binding assay, as we expected a clear difference in collagen-binding capacity between the fimbriae-expressing cells (E. coli ΔsfmA) and the fimbriae-expressing cells that also expressed the e-pili with the collagen-binding tag. Among the options for plasmid construction, we considered two distinct approaches:

  1. A single plasmid containing all genes needed for the type IV pilus assembly machinery and gspilA with the protein-binding tag.
  2. Two plasmids: one containing gspilA, including protein-binding tag, and one half of the pili assembly machinery, while the other contained the remaining second half of the assembly machinery.
We decided to use two plasmids to produce the recombinant pili, as the two-plasmid design is more flexible and allows us to change the tags easily. Furthermore, a single plasmid containing all of the genes that code for the assembly machinery would have been over 10,000 bp long, which would decrease transformation efficiency.

Build:

To build the plasmid system, we selected two compatible BglBrick vectors: pBbA1k and pBbE1c. The modified version of plasmid pBbA1k (hereafter referred to as A1k) carried one half of the assembly machinery, while the modified version of plasmid pBbE1c (hereafter referred to as E1c) carried the second half of the assembly machinery together with one of the following gspilA variants: unmodified gspilA (E1c-WT), gspilA with a collagen-binding tag (TKKTLRT, named E1c-C1; or LRELHLNNN, named E1c-C2), or gspilA with a His-tag (see our Results page for diagram). The plasmids have distinct origins of replication and antibiotic resistance markers so that they can be maintained and selected together in the same cell without plasmid loss. For the cells to be used in the adapted collagen-binding assay, we co-transformed them with a third plasmid, pBbS1a-RFP (hereafter referred to as S1a), which equipped the cells with RFP and enabled us to determine their capacity to bind to collagen by measuring RFP fluorescence.

Test:

The following strains were tested for their ability to bind to collagen:

  • E. coli ΔsfmA (negative-control strain)
  • E. coli ΔsfmA A1k S1a (control strain lacking gspilA)
  • E. coli ΔsfmA A1k S1a E1c-WT (control strain expressing the unmodified gspilA)
  • E. coli ΔsfmA A1k S1a E1c-C1 (strain expressing gspilA, modified to encode the collagen-binding tag “TKKTLRT” at its 3’ end)
  • E. coli ΔsfmA A1k S1a E1c-C2 (strain expressing gspilA, modified to encode the collagen-binding tag “LRELHLNNN” at its 3’ end)
The strains were grown on LB-agar containing the appropriate antibiotics, resuspended in phosphate-buffered saline (PBS), and transferred to wells of 96-well microtitre plates coated with rat tail collagen 1 (Thermo Fisher Scientific; A1142803) as well as to wells of regular non-coated plates. After an incubation period, the wells were washed to remove cells that had not bound to the well surface. Subsequently, RFP fluorescence was quantified using a plate reader. For a detailed protocol, see Protocols.

    Figure 10. Absolute fluorescence units of the various E. coli NEB5a strains. The first four strains were added to the wells at an OD600 of 2.0. The strain carrying E1c-C2 was added at an OD600 of 0.5. Number of technical replicates: 4.

    The fluorescence reading of a control plate with unwashed wells was measured to find out the baseline fluorescence level of the cultures.

    Table 1. Absolute fluorescence units of cultures after the incubation step. Separate wells of the non-coated plate were not washed and contain the original culture. The first four strains were added to the wells at an OD600 of 2.0. The strain carrying E1c-C2 was added at an OD600 of 0.5. Number of technical replicates: 4.

    Strain (100 μL culture)    Average    Stdev
    ΔsfmA                         43.0      4.0
    ΔsfmA A1k S1a              48445.3    284.8
    ΔsfmA A1k S1a E1c-WT       33558.8    399.5
    ΔsfmA A1k S1a E1c-C1       35240.8    422.5
    ΔsfmA A1k S1a E1c-C2        6558.3    144.0
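Because the E1c-C2 strain was added at a lower OD600 than the others, the raw values in Table 1 are easier to compare after dividing by the starting OD600. The sketch below does this with the Table 1 averages. It assumes that fluorescence scales linearly with OD600, which we did not verify, and the function name is ours.

```python
# (mean fluorescence of the unwashed culture, OD600 at which the strain was added)
TABLE_1 = {
    "ΔsfmA": (43.0, 2.0),
    "ΔsfmA A1k S1a": (48445.3, 2.0),
    "ΔsfmA A1k S1a E1c-WT": (33558.8, 2.0),
    "ΔsfmA A1k S1a E1c-C1": (35240.8, 2.0),
    "ΔsfmA A1k S1a E1c-C2": (6558.3, 0.5),
}

def fluorescence_per_od(mean_afu, od600):
    """Fluorescence per OD600 unit, assuming a linear relationship."""
    return mean_afu / od600

# Use the strain without E1c as the reference for relative comparison.
reference = fluorescence_per_od(*TABLE_1["ΔsfmA A1k S1a"])
for strain, (mean_afu, od600) in TABLE_1.items():
    per_od = fluorescence_per_od(mean_afu, od600)
    print(f"{strain:<24} {per_od:>10.1f} AFU/OD  ({per_od / reference:.2f} x reference)")
```

On this scale the three E1c-carrying strains fall in a similar range below the A1k S1a reference, which is the trend discussed in the Learn section.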

    Learn:

    The results from the collagen binding assay might suggest that the bacteria generally bind better to the control plate than to the collagen-coated plate. The control strain ΔsfmA exhibited a low RFP signal, indicating cell autofluorescence. E. coli transformed with A1k, the plasmid containing the minor pilin and prepilin-peptidase genes, showed higher fluorescence values in all three measurements (Figure 10 and Table 1) than the bacteria that supposedly expressed wild-type pili or pili modified with the collagen 1 tag. The fluorescence level of strain E. coli ΔsfmA A1k S1a in liquid medium (Table 1) is considerably higher than the level in strains E. coli ΔsfmA A1k S1a E1c-WT and E. coli ΔsfmA A1k S1a E1c-C1. This is most likely the result of a lower metabolic burden on the cells due to the lack of the third plasmid. The data in Figure 10 show that E. coli transformed with E1c-C2 also lacks the ability to bind specifically to collagen. Potential causes for these results could be the following:

    • Metabolic Burden
      As mentioned above, metabolic burden could be a cause for these results. This is the most likely reason for the bacteria not binding to collagen, since the data in Table 1 show this trend in the unwashed control plates. Potential fixes could be changing the expression chassis, creating a design with fewer plasmids, or even using the original method from [3], which stained the cells with crystal violet instead of transforming them with an RFP plasmid.
    • Collagen Binding Tag
      The second reason for the lack of collagen binding could be that the tags modified the structure of the pilin monomer, potentially leading to misshapen pili or to no pili at all. A way to test this would be to take electron microscope pictures of the cells and check for the presence of pili.
    • Pili Protein Expression
      The GsPilA protein may not have been expressed sufficiently, leading to no pili formation. It is therefore critical to check for its presence by Western blot, for which the His-tagged GsPilA can be used. Once pili expression has been determined, the protein yield can be optimised using a Design of Experiments (DoEs) approach aimed at increasing protein expression and pili formation.

      DoEs is a branch of applied statistics that deals with planning, conducting, analysing, and interpreting controlled tests to evaluate the factors that control the value of a parameter or group of parameters. Our main goal is to find the factor settings that maximise the yield of pili. This can be done in two ways: a full factorial or a fractional factorial design. In our case, a full factorial design is preferable because it tests every combination of factor levels, so all main effects and interactions can be estimated separately. Fractional factorial designs need fewer runs but confound some effects with one another, which may leave certain interactions unresolved.
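To illustrate what confounding means here, the following sketch compares a full two-level design for four factors with a half-fraction built from the generator D = A×B×C. The generator is a textbook choice used only for illustration; it is not part of our plan. Levels are coded as -1 (low) and +1 (high).

```python
from itertools import product

# Full 2^4 design: every combination of low (-1) and high (+1) for A, B, C, D.
full = list(product([-1, 1], repeat=4))

# Half-fraction 2^(4-1): run a full design in A, B, C and set D = A*B*C.
half_fraction = [(a, b, c, a * b * c) for a, b, c in product([-1, 1], repeat=3)]

# In the half-fraction, the D column is identical to the A*B*C interaction
# column, so the effect of D cannot be separated from that interaction.
d_aliased_in_half = all(d == a * b * c for a, b, c, d in half_fraction)
d_aliased_in_full = all(d == a * b * c for a, b, c, d in full)

print(len(full), len(half_fraction))
print(d_aliased_in_half, d_aliased_in_full)
```

The half-fraction saves half of the runs, but only the full design keeps the fourth factor distinguishable from the three-way interaction.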

      These are the factors chosen that may affect pili production:
      1. Incubation temperature - Newly synthesised recombinant proteins have more time to fold correctly when protein expression occurs at slower rates. Furthermore, correct folding is favoured by the decrease in the concentration of cellular proteins. Reducing incubation temperature may lower protein aggregation, as folding of proteins is temperature dependent [9].
      2. IPTG concentration - IPTG induces gene expression by releasing the repressor from the lac operator [10].
      3. Induction time of cells - Heterologous proteins in E. coli require longer times and/or molecular chaperones to fold correctly, so induction time may be critical for protein folding and pili formation [11]. After a fixed time of initial incubation on plates (e.g., 48 hours, as our cells tended to grow very slowly), cells would be scraped off, resuspended in media, and plated onto new IPTG-containing plates. These plates can then be incubated for various lengths of time to vary induction time.
      4. Media type (carbon source) - Using two different carbon sources would show the role of the carbon source in protein expression, as E. coli consumes glucose preferentially over other carbon sources through Carbon Catabolite Repression (CCR), which involves a sugar transport system known as the phosphoenolpyruvate-phosphotransferase system (PTS) [12]. The M9 protocol we used specified glycerol as the carbon source; choosing glucose instead might improve protein integrity, as glycerol is linked to faulty protein folding [13].

    Factors that we can change to maximise protein expression:

    Incubation Temperature {2 levels}

    • 16 ℃
    • 30 ℃
    Inducer Concentration (IPTG) {2 levels}
    • 50 μM
    • 250 μM
    Induction Time of Cells (Depends on Temperature and IPTG Concentration) {2 levels}
    • 24 hrs / Day 1
    • 48 hrs / Day 2
    Media Type {2 levels}
    • M9 Composition
      • Glucose
      • Glycerol

    Prospective Plan: We will first do a trial run to assess the feasibility of the experiment and adjust it as needed. We will then run an unreplicated full factorial design of 16 trials to screen for the best combination of factor levels. With Factor 1 = 2 levels, Factor 2 = 2 levels, Factor 3 = 2 levels, and Factor 4 = 2 levels, the number of trials we need to run is:

    2 x 2 x 2 x 2 = 16

    Once we know which combination produces the most protein, we will verify this initial conclusion by repeating the experiment three times with the chosen factor levels. This full factorial analysis will help identify the settings of the chosen factors that increase protein expression in E. coli.
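    As a planning aid, the run list for the full factorial design can be generated directly from the factor levels listed above. This is a minimal sketch: the dictionary keys are our own labels, and a real run sheet would also randomise the run order.

```python
from itertools import product

# Factor levels as listed in the plan above.
factors = {
    "temperature_C": [16, 30],
    "iptg_uM": [50, 250],
    "induction_h": [24, 48],
    "carbon_source": ["glucose", "glycerol"],
}

# Every combination of levels, one dictionary per experimental run.
runs = [dict(zip(factors, levels)) for levels in product(*factors.values())]

print(f"{len(runs)} runs in total")
for number, run in enumerate(runs, start=1):
    print(number, run)
```

    Each of the four factors has two levels, so the list contains 2 x 2 x 2 x 2 = 16 runs.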

    References