sgRNA Results
Discovery and Design
We used the Bos taurus genome sequence from a Hereford cow (BioProject accession number PRJNA450837) as a starting point to design our spacer sequences for Cas13a.
The DNA sequences reported in the literature as nucleic acid biomarkers of bTB infection in cattle, detectable in both blood[1] and tissue samples[2], were located, and the introns were removed using the associated annotations and the splicing function in SnapGene. The resulting mRNA sequences were passed to a Python script that screened for potential spacer sequences.
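The intron-removal step was done in SnapGene; purely as an illustration of what that step produces, a minimal Python sketch is given below. The function name `splice_exons` and the 1-based, inclusive exon coordinates (the convention used in GenBank-style annotations) are assumptions for this example, not part of our pipeline.

```python
def splice_exons(genomic, exons):
    """Join annotated exons into a single spliced sequence.

    genomic: forward-strand genomic sequence as a string.
    exons:   list of (start, end) tuples, 1-based and inclusive
             (assumed annotation convention for this sketch).
    """
    ordered = sorted(exons)  # make sure exons are joined 5' to 3'
    # Convert each 1-based inclusive interval to a Python slice.
    return "".join(genomic[start - 1:end] for start, end in ordered)


def to_rna(dna):
    """Write a DNA-alphabet sequence in the RNA alphabet (T -> U)."""
    return dna.upper().replace("T", "U")


if __name__ == "__main__":
    # Toy sequence: two 3-nt exons separated by a 3-nt intron.
    toy_gene = "ATGCCCGTT"
    print(to_rna(splice_exons(toy_gene, [(1, 3), (7, 9)])))
```

Only forward-strand genes are handled here; a gene annotated on the reverse strand would additionally need to be reverse-complemented.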
Cas13a Spacer Sequence Finder Python Script
The Cas13a-CRISPR system requires that the protospacer flanking sequence (PFS), the nucleotide adjacent to the 3’ end of the target site, be a non-guanine[3]. Therefore, the Python script looked for sequences 25 nucleotides long where the 25th nucleotide was non-G. Resulting 24-nucleotide spacer sequences that included either CCCC or GGGG repeats were filtered out, as the presence of these would cause misfolding with the crRNA hairpin loop. In addition, sequences with more than one uracil base were filtered out, as uracil bases easily bind to other RNA nucleotides. The Cas13a crRNA sequence was appended to the 5’ end of each spacer sequence, and each construct was then analysed for secondary structure using the online software IPknot++[4]. Final spacer sequences were chosen by the highest minimum free energy, denoting the least intra-sequence binding within the spacer region, and by minimum inter-sequence binding between the spacer and crRNA sequences.
After designing the spacer sequences, we decided to focus on the following genes as RNA biomarkers:
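The screening rules described above can be sketched as follows. This is a minimal reconstruction from the description, not our original script: the function name, the return format and the `max_u` parameter are assumptions, the uracil rule is read literally (at most one U in the window), and the filters are applied to the 24-nucleotide window itself.

```python
def find_cas13a_candidates(mrna, spacer_len=24, max_u=1):
    """Scan an mRNA for candidate Cas13a windows.

    Returns a list of (start_index, window, pfs) tuples, where
    start_index is 0-based, window is the 24-nt candidate and pfs is
    the single nucleotide immediately 3' of it.
    """
    seq = mrna.upper().replace("T", "U")  # accept DNA- or RNA-alphabet input
    hits = []
    # Stop one short of the end so every window has a PFS after it.
    for i in range(len(seq) - spacer_len):
        window = seq[i:i + spacer_len]
        pfs = seq[i + spacer_len]
        if pfs == "G":                       # PFS must be non-G
            continue
        if "CCCC" in window or "GGGG" in window:
            continue                         # homopolymer runs disturb the hairpin
        if window.count("U") > max_u:        # literal reading of the uracil rule
            continue
        hits.append((i, window, pfs))
    return hits


if __name__ == "__main__":
    toy = "AC" * 12 + "A"  # one 24-nt window followed by a non-G PFS
    for start, window, pfs in find_cas13a_candidates(toy):
        print(start, window, pfs)
```

The secondary-structure step (IPknot++) is deliberately left out of the sketch, since it was carried out with the online tool rather than in the script.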
Gene | Description[1] |
CXCL8 | Chemokine ligand 8, involved in infection response and tissue injury. |
FOSB | FBJ murine osteosarcoma viral oncogene homologue B, plays a role in regulating cell proliferation, differentiation and transformation. |
NR4A1 | Nuclear receptor subfamily 4, group A, member 1, plays a role in inflammation and apoptosis. |
PLAUR | Plasminogen activator, urokinase receptor, a biomarker of inflammation. |
RGS16 | Regulator of G-protein signalling 16, linked to many different disease states. |
[Table 1]
For Cas12a, the DNA sequence coding for each target gene was downloaded and passed to a Python script that screened for potential spacer sequences.
Cas12a Spacer Sequence Finder Python Script
The Cas12a-CRISPR system requires that the protospacer-adjacent motif (PAM), the four nucleotides 3’ to the end of the target site, must be TTTV. Therefore, the Python script looked for sequences 24 nucleotides long where the final four nucleotides were either TTTA, TTTC or TTTG[5]. The potential spacer sequences were then filtered, as before, to remove any sequences with CCCC or GGGG repeats (which interfere with the crRNA hairpin bonding) and any sequences with more than one uracil base (since uracil easily binds to other RNA nucleotides). The Cas12a crRNA sequence was appended to the 5’ end of each spacer sequence, and each construct was then analysed for secondary structure using the online software IPknot++[4]. Final spacer sequences were chosen by the highest minimum free energy, denoting the least intra-sequence binding within the spacer region, and by minimum inter-sequence binding between the spacer and crRNA sequences.
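A literal sketch of the Cas12a screen as described above is given below: it finds every 24-nucleotide window whose last four nucleotides are TTTA, TTTC or TTTG, then applies the same filters. It is an illustration rather than our original script. The split into a 20-nucleotide spacer plus a 4-nucleotide PAM, the function name, and the `max_t` parameter (T in the DNA standing in for U in the transcribed RNA, read literally as at most one) are all assumptions.

```python
import re

# Zero-width lookahead so that overlapping windows are all reported.
# Group 1: assumed 20-nt spacer portion; group 2: TTTV motif (V = A, C or G).
PAM_WINDOW = re.compile(r"(?=([ACGT]{20})(TTT[ACG]))")


def find_cas12a_candidates(dna, max_t=1):
    """Return (start_index, spacer, pam) tuples for each passing window."""
    seq = dna.upper()
    hits = []
    for match in PAM_WINDOW.finditer(seq):
        spacer, pam = match.group(1), match.group(2)
        if "CCCC" in spacer or "GGGG" in spacer:
            continue                      # homopolymer runs disturb the hairpin
        if spacer.count("T") > max_t:     # literal reading of the uracil rule
            continue
        hits.append((match.start(), spacer, pam))
    return hits


if __name__ == "__main__":
    toy = "AC" * 10 + "TTTA"
    for start, spacer, pam in find_cas12a_candidates(toy):
        print(start, spacer, pam)
```

Using a lookahead keeps the scan to a single pass while still catching windows that overlap one another, which a plain non-overlapping regular expression search would miss.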
PCR Attempts and Initial Results
The selected spacer sequences (appended to the crRNA sequences) were ordered from IDT as DNA, in constructs containing a T7 promoter to allow in vitro T7 transcription. Target sequences (complementary to the spacer sequences), required for testing our final Cas12a and Cas13a systems, were also ordered from IDT [Figure 1].
[Figure 1] Abstract representation of DNA sequences synthesised by IDT.
A) Template DNA to be used in transcription of Cas13a and Cas12a sgRNA. The sequence contains a T7 promoter but, unlike the target sequences, does not contain a forward M13 primer site, since transcription needs to end abruptly after the spacer sequence to ensure the specific RNA folding required to form a complex with the Cas13a and Cas12a proteins.
B) Template DNA for the Cas13a target, including a T7 promoter and M13 primer binding sites for amplification via the polymerase chain reaction.
C) DNA sequence to be used as the Cas12a target directly, including M13 primer binding sites for amplification via the polymerase chain reaction.
The T7 in vitro transcription protocol required ~1 μg of template DNA (Link to Protocol); however, synthesising such yields would have quickly exhausted our IDT budget. Therefore, we ordered 250 ng of each oligo. The target sequences contained M13 primer sites, so that single direction polymerase chain reaction could be used to amplify the Cas13a target DNA for use in transcription. Amplification via polymerase chain reaction was carried out, with an agarose gel to confirm that DNA of the right length was produced. Subsequent gel extraction and Qubit quantification showed that yields were low, with all tests yielding below 0.01 μg/mL on a HS DNA Qubit. (link to Protocols) Nevertheless, T7 in vitro transcription was attempted on the IDT stocks; however, no RNA was detectable on the HSRNA tape (Agilent Tapestation 4200), most likely because the DNA template concentration was too low.
Therefore, to increase the concentration of the DNA template, both target and sgRNA sequences were re-synthesised, with the design changed to incorporate Type IIS restriction sites so that they could be cloned into high copy number plasmids. Our PI cloned the sequences, transformed them into DH5α E. coli, extracted the plasmid DNA and quantified the yield with a Qubit. [Table 2] All plasmids were successfully extracted, apart from the RD4_a target (#17). (Engineering cycle)
[Table 2] Concentrations of the plasmid templates for Cas12a sgRNA, Cas13a sgRNA and Cas13a targets. The concentrations of the Cas12a target plasmids are also noted. All concentrations are within a reasonable range of each other, so 14 μL of plasmid can universally be used in the full transcription protocol.
Since the DNA templates for the sgRNA and Cas13a targets are now in a plasmid, a Type IIS endonuclease needs to cleave and linearise the DNA so that transcription ends abruptly, without the need for a terminator sequence that would interfere with the specific RNA folding.
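Whether a single pipetting volume is adequate for a given plasmid can be checked with simple arithmetic (mass = concentration × volume). The sketch below uses the 14 μL volume and the ~1 μg template requirement quoted in this section as defaults; the function names are hypothetical, and the concentrations in the usage example are illustrative only, not values from Table 2.

```python
def template_mass_ng(conc_ng_per_ul, volume_ul=14.0):
    """Mass of DNA (ng) delivered by pipetting volume_ul of a stock."""
    return conc_ng_per_ul * volume_ul


def volume_for_mass_ul(conc_ng_per_ul, target_ng=1000.0):
    """Volume (uL) of stock needed to deliver target_ng of DNA."""
    if conc_ng_per_ul <= 0:
        raise ValueError("concentration must be positive")
    return target_ng / conc_ng_per_ul


if __name__ == "__main__":
    # Illustrative concentrations only (ng/uL); not measured values.
    for conc in (50.0, 75.0, 100.0):
        print(conc, template_mass_ng(conc), round(volume_for_mass_ul(conc), 1))
```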
[Figure 2] Abstract representation of DNA sequences synthesised by IDT and cloned into pX1800 plasmids by our PI.
A) The sgRNA template is to be cleaved with a Type IIS endonuclease, to linearise the plasmid template. The plasmid template needs to be linearised to cause transcription to end abruptly, without the need for a termination sequence that would interfere with the specific RNA folding. Since the endonuclease would cleave into the spacer sequence, Klenow fragment is also used to blunt the end of the template DNA, to ensure the entire spacer sequence is transcribed.
B) The target sequence for Cas13a also needs to be transcribed, so is also linearised. Although the type IIS endonuclease does not cleave into the target sequence, the Klenow fragment is still used to ensure the plasmid does not reform.
C) Since Cas12a requires a DNA target, the cloned plasmid containing the spacer sequence is used as the target directly and thus requires no modifications.
Since the endonuclease would cleave into the spacer regions on the sgRNA templates, a blunting reaction was carried out with the Klenow fragment. (Link to Protocol) A HSRNA tape run on the Agilent Tapestation 4200 showed promising initial results. [Figure 3] There were some unexpected peaks, caused by the degraded lower marker from the sample buffer and by plasmids that did not cleave, meaning RNA was transcribed all the way to a T1 terminator sequence further around the plasmid backbone.
[Figure 3] First successful transcription results, of Cas13a targets and Cas12a sgRNA, measured on a HSRNA tape (with Agilent Tapestation 4200)
A) Shows tape columns: B1 - #8 CXCL8 target, C1 - #9 RGS16 target, D1 - #11 EthA_b sgRNA, E1 - #15 RD4_c. The blue box surrounds bands where desired peaks are observed.
B) Shows the normalised sample intensity graph of the #9 RGS16 target. Due to a slightly degraded lower marker, a peak at 45 bp is observable. A 219 bp peak is also present; given the unreliability of the degraded sample buffer, this is likely our desired product (153 bp expected).
C) Shows the normalised sample intensity graph of #15 RD4_c. Due to the degraded lower marker and the small size of the transcribed fragment, the desired product peak at 38 bp (44 bp expected) is adjoined to the lower marker. The 124 bp peak is likely from the transcription of a plasmid template that had not been cleaved properly (127 bp expected).
Testing DNA Cleanup:
The DNA template had to be removed in order to get readings on the Agilent Tapestation 4200; this was achieved by adding a DNase (and an equal volume of 50 mM MgCl2). However, there were concerns that DNase could interfere with our Cas12a test, as any remaining DNase would cleave the probes, leading to a false positive. Thus DNase treatment alone was compared with DNase treatment followed by an AMPure bead nucleotide extraction, which removes the DNase after all the DNA is digested. (Link to Protocol) [Figure 4]
[Figure 4] Comparing DNase treatment alone with DNase treatment followed by AMPure bead extraction, as methods of removing DNA.
A & C) Compare the Cas13a sgRNA transcription clean-up by both methods. Both had a peak at around the desired length, together with a peak caused by transcription of plasmids that had not cleaved properly.
B & C) Compare the Cas13a target transcription clean-up by both methods. Again, both show the desired peak and a peak caused by transcription of a plasmid that had not been cleaved properly.
DNA needed to be removed from all transcribed samples in order for them to be analysed in the Tapestation, and the Cas12a sgRNA samples also need to have their plasmid templates removed, as these would trigger the Cas system.
Overall, both methods were successful: a DNase treatment followed by an AMPure bead nucleotide extraction gave similar results (apart from having a higher concentration) to a DNase treatment alone, so the two are interchangeable. Because of the significant cost of the AMPure beads, we decided on a DNase treatment followed by a 15 minute denaturing cycle after incubation.
Final Results:
Once our cleanup/purification method had been decided, all sgRNA sequences (apart from #3 - CXCL8 sgRNA, which had already been transcribed when testing the AMPure bead extraction) and Cas13a target sequences were transcribed. [Figure 5]
[Figure 5] Agilent Tapestation 4200 RNA ScreenTape results from the final T7 in vitro transcription. Cas13a sgRNA sequences (1, 2, 4, 5), Cas13a target sequences (6, 7, 8, 9, 10) and Cas12a sgRNA sequences (11, 12, 13, 14, 15) were tested against a control in which the T7 polymerase was not included in the reaction mixture. The figure includes an image of the RNA ScreenTape and representative normalised sample intensity graphs.
A) Shows the tape gel, showing uniformity among variants of each RNA component.
B) Shows the PLAUR (Cas13a) sgRNA sequence (#1) normalised sample intensity graph, with the desired RNA peak at 41 bp (52 bp expected) and a 119 bp peak indicating transcription of template DNA that had not been cleaved. Remains of a degraded upper marker from the BR sample buffer are also visible.
C) Shows the PLAUR target sequence (#6) normalised sample intensity graph, with the desired RNA peak at 70 bp (153 bp expected) and a 177 bp peak indicating transcription of template DNA that had not been cleaved. There is also a large, broad peak above 2000 bp, which is likely from environmental contamination.
D) Shows the EthA_c sgRNA sequence (#12) normalised sample intensity graph, with the desired RNA peak at 96 bp (44 bp expected) and a low level of background noise extending to 600 bp.
All columns on the RNA tape showed that RNA was present at or around the expected length for the sgRNA and target RNA. Since the sample buffer for the Agilent Tapestation 4200 was out of date, with a degraded upper marker, the size of the RNA at each peak is not completely accurate. However, there are consistently strong peaks within the expected range. There are also signs of RNA transcribed from plasmid templates that did not cleave properly, where transcription terminated further through the pX1800 plasmid at a terminator. There are also signs of environmental contamination in all but one of the Cas13a targets; however, this was not expected to affect results.
The Agilent Tapestation 4200 gave the following concentrations:
[Table 3] Concentrations of sgRNA and target RNA transcribed in the final scaled-up T7 in vitro transcription protocol, measured by the Agilent Tapestation 4200 and by a HS RNA Qubit per RNA class. Cas13a sgRNA ranges from 18.76 to 26.80 ng/μL, with the upper limit being similar to the Qubit reading of 30.67 ng/μL. The Cas13a targets have a significantly higher concentration, from 42.20 to 149.60 ng/μL (with the Qubit reading >200 ng/μL); the high levels of environmental contamination in some of the targets are probably contributing to this. The Cas12a sgRNA has the lowest concentration, ranging from 8.30 to 14.00 ng/μL; this could be due to a low amount of DNA template being present, with DNA contamination causing incorrect concentration readings for the extracted plasmids.
Again, since the sample buffer (with the upper and lower markers) is out of date, the concentrations given were confirmed against a HS RNA Qubit, under expert instruction.
References
[1] McLoughlin KE, Correia CN, Browne JA, Magee DA, Nalpas NC, Rue-Albrecht K, et al. RNA-Seq Transcriptome Analysis of Peripheral Blood From Cattle Infected With Mycobacterium bovis Across an Experimental Time Course. Frontiers in Veterinary Science. 2021; 8:662002.
[2] Taylor GM, Worth DR, Palmer S, Jahans K, Hewinson RG. Rapid detection of Mycobacterium bovis DNA in cattle lymph nodes with visible lesions using PCR. BMC Vet Res. 2007 Jun 13; 3:12.
[3] Kellner MJ, Koob JG, Gootenberg JS, Abudayyeh OO, Zhang F. SHERLOCK: nucleic acid detection with CRISPR nucleases. Nat Protoc. 2019 Oct; 14(10):2986-3012.
[4] Sato K, Kato Y. Prediction of RNA secondary structure including pseudoknots for long sequences. Brief Bioinform. 2022 Jan 17; 23(1).
[5] Chen JS, Ma E, Harrington LB, Da Costa M, Tian X, Palefsky JM, et al. CRISPR-Cas12a target binding unleashes indiscriminate single-stranded DNase activity. Science. 2018 Apr 27; 360(6387):436-9.