Throughout our project, we reflected on the knowledge and advice we wish we had had at the outset. As a result, we created YouTube videos covering routine laboratory protocols and essential synthetic biology concepts. Our aim is to equip future iGEM teams with the knowledge and tools needed to maximise their potential. Furthermore, we contribute a detailed protocol and the underlying theory for a cheap benchtop enzymatic assay for quantifying (early) lanthanides, which we expect will be useful to iGEM teams working with rare earth elements (REEs), as was the case for iGEM teams RareCycle (Aachen 2023), Hizju China (2024), and Neocycle (Calgary 2021).
We filmed videos documenting the laboratory protocols we implemented during our project, to assist future iGEM teams in their lab work.
Beyond lab protocols, we also created educational videos covering fundamental topics in synthetic biology, such as cloning, PCR, and understanding plasmids, which we hope future iGEM teams will find valuable. We are committed to promoting these videos to future teams as a tool for their success.
We filmed videos on the following topics:
Many chemicals and elements can only be quantified by expensive physical methods, such as mass spectrometry, or by non-specific benchtop techniques. In general, benchtop chemical assays for these compounds are desirable because they give cheaper, simpler and quicker results. Unfortunately, many chemicals have no suitable chemical assay, because they share chemical properties with common contaminants. Molecules in biological samples, e.g. blood samples, are often especially difficult to assay, because biological samples contain many contaminants whose functional groups may interfere. Metals are another class of chemicals that is difficult to assay specifically by chemical methods: many metal ions have similar properties (charge, ionic radius) and so cannot easily be distinguished.
For these poorly distinguishable classes of chemicals, enzymatic assays are usually preferred, because enzymes often show excellent selectivity even between highly similar chemicals. However, enzymatic assays are limited to chemicals that can serve as substrates for an enzyme-catalysed reaction; achieving this sometimes requires extensive protein engineering, and is sometimes entirely infeasible.
We present the theory for quantifying chemicals that enzymes use as cofactors, which is especially pertinent to metal ion quantification, and demonstrate it by developing an assay for neodymium concentration, which we require for our project.
Traditional in vitro benchtop assays for neodymium concentration use the metallochromic dye Arsenazo III, but this is inappropriate for in vivo assays, because Arsenazo III is also calcium-sensitive, and cells usually require calcium at higher concentrations than Nd in order to grow. Our assay therefore uses the lanthanide-selective methanol dehydrogenase (MDH) XoxF.
A robust and widely accepted assay for measuring the rate of PQQ-dependent MDHs exists in the literature, and has previously been used to characterise the Michaelis-Menten (MM) parameters of XoxF, as in Anthony and Zatman (1964) and Huang et al (2018). The assay does not measure the rate of methanol removal or formaldehyde production directly, but instead uses a chain of redox reactions ending in a redox dye. PQQ (pyrroloquinoline quinone) is the redox cofactor that XoxF uses to oxidise methanol; PMS (phenazine methosulfate) is a cytochrome mimic that can reoxidise PQQ; and reduced PMS can in turn reduce DCPIP (2,6-dichlorophenolindophenol), which is blue when oxidised and colourless when reduced, as in the following scheme:
The redox dye DCPIP is widely used in measuring photosynthetic pathways, and is as a result well characterised. However, we still recommend performing a calibration curve to derive its molar extinction coefficient, because it varies with physical conditions, especially buffer pH.
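A calibration of this kind amounts to fitting the Beer-Lambert law, A = ε·l·c, to absorbance readings from a dilution series of oxidised DCPIP. The Julia sketch below is illustrative only: the function name `extinction_coefficient`, the 1 cm path length, and the synthetic readings (generated from a placeholder coefficient, not measured) are all assumptions, and the fit is forced through the origin, which presumes the readings were blanked against buffer.

```julia
# Fit A = (ε·l)·c through the origin by least squares and return ε.
# conc_M: DCPIP concentrations in mol/L; absorbance: blanked A600 readings.
function extinction_coefficient(conc_M, absorbance; path_cm = 1.0)
    length(conc_M) == length(absorbance) ||
        throw(ArgumentError("need one absorbance reading per concentration"))
    slope = sum(conc_M .* absorbance) / sum(conc_M .^ 2)  # units: 1/M
    return slope / path_cm                                # units: 1/(M·cm)
end

# Synthetic dilution series, generated from a placeholder coefficient
# purely so the example runs; replace with your own measurements.
placeholder_eps = 2.0e4                      # hypothetical value, M^-1 cm^-1
conc = [10e-6, 20e-6, 30e-6, 40e-6, 50e-6]   # mol/L
abs600 = placeholder_eps .* conc             # noise-free synthetic readings

eps_fit = extinction_coefficient(conc, abs600)
println("Fitted extinction coefficient: ", round(eps_fit; sigdigits = 3), " M^-1 cm^-1")
```

Repeating the fit in each buffer used for the assay gives a coefficient matched to that buffer's pH.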
The rate can be defined as the rate of DCPIP reduction, which can be measured spectrophotometrically at DCPIP’s peak absorbance of 600 nm. Using the MM-derived equation from earlier, we can see that it is possible to generate reactions with constant rates by using an excess of methanol. As a result, the data can easily be converted into a calibration curve of rate against [Nd], as below:
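To turn raw readings into the rate used for such a calibration curve, one option is to take the least-squares slope of A600 against time over the linear part of the trace and convert it with the Beer-Lambert law; an unknown sample can then be read off the calibration curve by interpolation. The Julia sketch below assumes this workflow. The function names, the placeholder extinction coefficient, and every number in it are hypothetical illustration values rather than measurements, and linear interpolation between standards is an assumption that only holds where the curve rises monotonically.

```julia
# Least-squares slope of y against x (simple linear regression).
function ls_slope(x, y)
    xm = sum(x) / length(x)
    ym = sum(y) / length(y)
    return sum((x .- xm) .* (y .- ym)) / sum((x .- xm) .^ 2)
end

# Rate of DCPIP reduction in mol/L/s from an A600 time course.
# Absorbance falls as DCPIP is reduced, hence the sign change.
function dcpip_rate(time_s, abs600; eps_M_cm, path_cm = 1.0)
    return -ls_slope(time_s, abs600) / (eps_M_cm * path_cm)
end

# Read [Nd] off a calibration curve by linear interpolation between standards.
# Standards must be sorted so that their rates strictly increase.
function nd_from_rate(rate, std_nd, std_rate)
    for i in 1:length(std_rate)-1
        lo, hi = std_rate[i], std_rate[i+1]
        if lo <= rate <= hi
            frac = (rate - lo) / (hi - lo)
            return std_nd[i] + frac * (std_nd[i+1] - std_nd[i])
        end
    end
    error("rate lies outside the calibrated range")
end

# Hypothetical trace and standards, for illustration only.
t = [0.0, 10.0, 20.0, 30.0]                 # s
a600 = [1.00, 0.96, 0.92, 0.88]             # absorbance
rate = dcpip_rate(t, a600; eps_M_cm = 2.0e4)

std_nd = [0.0, 1.0, 2.0, 4.0]               # [Nd] of standards, µM
std_rate = [0.0, 1.0e-7, 3.0e-7, 5.0e-7]    # rates of standards, mol/L/s
println("Estimated [Nd] in µM: ", nd_from_rate(rate, std_nd, std_rate))
```

Raising an error outside the calibrated range is deliberate: extrapolating beyond the standards would give a number with no calibration behind it.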
There are a number of reagents required for the assay described above. The scheme below is an optimised setup, in which the reaction runs to completion on a timescale of minutes, which is neither too short to measure accurately nor so long that it is inconvenient. Details of unusual reagents can be found in Jahn et al (2020).
In order to facilitate future work with the AM1 strain of M. extorquens, we performed a statistical analysis of its genomic data to produce a codon usage frequency table:
1st base | 2nd base U | Amino acid | Frequency | 2nd base C | Amino acid | Frequency | 2nd base A | Amino acid | Frequency | 2nd base G | Amino acid | Frequency | 3rd base
---|---|---|---|---|---|---|---|---|---|---|---|---|---
U | UUU | Phe | 0.105 | UCU | Ser | 0.023 | UAU | Tyr | 0.250 | UGU | Cys | 0.136 | U
U | UUC | Phe | 0.895 | UCC | Ser | 0.260 | UAC | Tyr | 0.750 | UGC | Cys | 0.864 | C
U | UUA | Leu | 0.003 | UCA | Ser | 0.052 | UAA | Stop | 0.111 | UGA | Stop | 0.639 | A
U | UUG | Leu | 0.042 | UCG | Ser | 0.324 | UAG | Stop | 0.250 | UGG | Trp | 1.000 | G
C | CUU | Leu | 0.132 | CCU | Pro | 0.037 | CAU | His | 0.356 | CGU | Arg | 0.117 | U
C | CUC | Leu | 0.461 | CCC | Pro | 0.311 | CAC | His | 0.644 | CGC | Arg | 0.483 | C
C | CUA | Leu | 0.008 | CCA | Pro | 0.120 | CAA | Gln | 0.108 | CGA | Arg | 0.098 | A
C | CUG | Leu | 0.354 | CCG | Pro | 0.532 | CAG | Gln | 0.892 | CGG | Arg | 0.275 | G
A | AUU | Ile | 0.081 | ACU | Thr | 0.035 | AAU | Asn | 0.208 | AGU | Ser | 0.033 | U
A | AUC | Ile | 0.849 | ACC | Thr | 0.562 | AAC | Asn | 0.792 | AGC | Ser | 0.308 | C
A | AUA | Ile | 0.070 | ACA | Thr | 0.031 | AAA | Lys | 0.149 | AGA | Arg | 0.007 | A
A | AUG | Met | 1.000 | ACG | Thr | 0.372 | AAG | Lys | 0.851 | AGG | Arg | 0.019 | G
G | GUU | Val | 0.128 | GCU | Ala | 0.077 | GAU | Asp | 0.422 | GGU | Gly | 0.164 | U
G | GUC | Val | 0.482 | GCC | Ala | 0.495 | GAC | Asp | 0.578 | GGC | Gly | 0.605 | C
G | GUA | Val | 0.096 | GCA | Ala | 0.048 | GAA | Glu | 0.314 | GGA | Gly | 0.077 | A
G | GUG | Val | 0.294 | GCG | Ala | 0.379 | GAG | Glu | 0.686 | GGG | Gly | 0.154 | G
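One common use of such a table is choosing codons when designing genes for expression in the host. The Julia sketch below is a minimal illustration under stated assumptions: it back-translates a peptide by always picking the most frequent codon, which is only one of several codon-optimisation strategies and not necessarily the one we used; `usage`, `preferred_codon` and `back_translate` are hypothetical names; and only three amino acids are included, with frequencies copied from the table above and written in the DNA alphabet (T in place of U).

```julia
# Relative codon usage for a few amino acids, copied from the table
# (DNA alphabet: T replaces U). Extend with the remaining rows as needed.
usage = Dict(
    'M' => ["ATG" => 1.000],
    'K' => ["AAA" => 0.149, "AAG" => 0.851],
    'F' => ["TTT" => 0.105, "TTC" => 0.895],
)

# Most frequent codon for one amino acid (one-letter code).
function preferred_codon(aa::Char)
    haskey(usage, aa) || throw(ArgumentError("no usage data for '$aa'"))
    best = usage[aa][1]
    for entry in usage[aa]
        if entry.second > best.second
            best = entry
        end
    end
    return best.first
end

# Back-translate a peptide using the preferred codon at every position.
back_translate(peptide::AbstractString) = join(preferred_codon(aa) for aa in peptide)

println(back_translate("MKF"))  # prints ATGAAGTTC
```

Always choosing the top codon is the simplest policy; sampling codons in proportion to their frequency is an alternative that avoids long runs of a single codon.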
generate-codon-table.jl
Usage: julia generate-codon-table.jl FOLDER
Where FOLDER is a path to a folder containing sequence.fasta (the DNA sequence of the genome) and sequence.gb (a GenBank file containing the annotated translations).
using GenomicAnnotations
using Dictionaries
folder = ARGS[1]
# Amino acid => [Count, [Associated codons]]
codonMap = Dictionary(Dict([
    'F'=>[0, ["TTT", "TTC"]],
    'L'=>[0, ["TTA", "TTG", "CTT", "CTC", "CTA", "CTG"]],
    'S'=>[0, ["TCT", "TCC", "TCA", "TCG", "AGT", "AGC"]],
    'Y'=>[0, ["TAT", "TAC"]],
    '*'=>[0, ["TAA", "TAG", "TGA"]], # Stop codon
    'C'=>[0, ["TGT", "TGC"]],
    'W'=>[0, ["TGG"]],
    'P'=>[0, ["CCT", "CCC", "CCA", "CCG"]],
    'H'=>[0, ["CAT", "CAC"]],
    'Q'=>[0, ["CAA", "CAG"]],
    'R'=>[0, ["CGT", "CGC", "CGA", "CGG", "AGA", "AGG"]],
    'I'=>[0, ["ATT", "ATC", "ATA"]],
    'M'=>[0, ["ATG"]],
    'T'=>[0, ["ACT", "ACC", "ACA", "ACG"]],
    'N'=>[0, ["AAT", "AAC"]],
    'K'=>[0, ["AAA", "AAG"]],
    'V'=>[0, ["GTT", "GTC", "GTA", "GTG"]],
    'A'=>[0, ["GCT", "GCC", "GCA", "GCG"]],
    'D'=>[0, ["GAT", "GAC"]],
    'E'=>[0, ["GAA", "GAG"]],
    'G'=>[0, ["GGT", "GGC", "GGA", "GGG"]],
]))
# Codon => count (raw counts are accumulated here, then normalised at the end)
probabilities = Dictionary(Dict([
    "TTT"=>0.0, "TTC"=>0.0, "TTA"=>0.0, "TTG"=>0.0,
    "TCT"=>0.0, "TCC"=>0.0, "TCA"=>0.0, "TCG"=>0.0,
    "TAT"=>0.0, "TAC"=>0.0, "TAA"=>0.0, "TAG"=>0.0,
    "TGT"=>0.0, "TGC"=>0.0, "TGA"=>0.0, "TGG"=>0.0,
    "CTT"=>0.0, "CTC"=>0.0, "CTA"=>0.0, "CTG"=>0.0,
    "CCT"=>0.0, "CCC"=>0.0, "CCA"=>0.0, "CCG"=>0.0,
    "CAT"=>0.0, "CAC"=>0.0, "CAA"=>0.0, "CAG"=>0.0,
    "CGT"=>0.0, "CGC"=>0.0, "CGA"=>0.0, "CGG"=>0.0,
    "ATT"=>0.0, "ATC"=>0.0, "ATA"=>0.0, "ATG"=>0.0,
    "ACT"=>0.0, "ACC"=>0.0, "ACA"=>0.0, "ACG"=>0.0,
    "AAT"=>0.0, "AAC"=>0.0, "AAA"=>0.0, "AAG"=>0.0,
    "AGT"=>0.0, "AGC"=>0.0, "AGA"=>0.0, "AGG"=>0.0,
    "GTT"=>0.0, "GTC"=>0.0, "GTA"=>0.0, "GTG"=>0.0,
    "GCT"=>0.0, "GCC"=>0.0, "GCA"=>0.0, "GCG"=>0.0,
    "GAT"=>0.0, "GAC"=>0.0, "GAA"=>0.0, "GAG"=>0.0,
    "GGT"=>0.0, "GGC"=>0.0, "GGA"=>0.0, "GGG"=>0.0,
]))
# Counts occurrences of each codon in the genome file
function countCodons!()
    total = 0
    # joinpath also works when FOLDER is given as an absolute path
    genomeFile = open(joinpath(folder, "sequence.fasta"), "r")
    gbdata = readgbk(joinpath(folder, "sequence.gb"))
    # Import sequence.fasta as a string
    genomeStr = ""
    for line in readlines(genomeFile)
        if (first(line, 1) != ">") # Ignore > (header) lines
            genomeStr = string(genomeStr, line) # Append line to string
        end
    end
    # For each chromosome in gbdata
    for chr in gbdata
        # For each gene with an annotated translation
        for gene in @genes(chr, !ismissing(:translation))
            # Split DNA into codons, and assign relevant amino acid.
            # NOTE: genomeStr is sliced directly by coordinate, so the bases are
            # read from the strand stored in the FASTA file.
            dna = String.(Iterators.partition(genomeStr[locus(gene).position], 3)) # Split into 3s
            for i in 1:length(dna) # For each codon
                total += 1 # Increment codon counter
                codon = dna[i] # Get codon from DNA sequence
                # Check codon valid
                if (i < length(gene.translation) || codon in keys(probabilities))
                    # Lookup amino acid from codon table or translation sequence
                    aminoacid = i > length(gene.translation) ?
                        getAminoAcid(codon) : gene.translation[i] # Use translation for efficiency if available
                    # If mismatched amino acid, log warning
                    if (i < length(gene.translation))
                        if (aminoacid != gene.translation[i])
                            println("WARNING: $(aminoacid) != $(gene.translation[i]) for codon $(codon) in $(chr.name)")
                        end
                    end
                    if (aminoacid != 'U') # Skip selenocysteine (U), which has no entry in codonMap
                        probabilities[codon] += 1 # Increment codon count
                        # codonMap[aminoacid][1] += 1 # Increment amino acid count
                    end
                end
            end
        end
    end
    close(genomeFile)
    println("Total number of codons: ", total)
end
# Returns the codonMap entries whose codon list contains `codon`
function getAminoAcidValues(codon)
    filter(((aminoacid, aminoacidvalues),) -> codon in aminoacidvalues[2],
        pairs(codonMap))
end

# Returns the one-letter amino acid code for `codon`
function getAminoAcid(codon)
    getAminoAcidValues(codon).values[1][1]
end

# Returns the total count for the amino acid encoded by `codon`
function getAminoAcidCount(codon)
    getAminoAcidValues(codon).values[1][2][1]
end

function calculateAminoAcidCounts!()
    # For each codon associated with each amino acid
    for aminoacid in keys(codonMap)
        for codon in codonMap[aminoacid][2]
            # Add the count to the total
            codonMap[aminoacid][1] += probabilities[codon]
        end
    end
end

# Returns a new dictionary of codon => relative frequency
function calculateProbabilities!()
    calculateAminoAcidCounts!()
    # For each codon, divide the codon count by the amino acid count
    map(((codon, p),) -> p / getAminoAcidCount(codon),
        pairs(probabilities))
end
countCodons!()
println("Codon | Frequency")
display(calculateProbabilities!())
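The normalisation at the heart of the script, dividing each codon count by the total count for its amino acid, can be illustrated without GenomicAnnotations or Dictionaries. The sketch below uses only Base Julia; the function names, the two-amino-acid codon map, and the toy sequence are illustrative assumptions, and unlike the script it takes a single in-frame coding sequence rather than a genome.

```julia
# Codon => amino acid for a deliberately tiny subset of the genetic code.
codon_to_aa = Dict("AAA" => 'K', "AAG" => 'K', "TTT" => 'F', "TTC" => 'F')

# Count in-frame codons of a coding sequence, skipping codons not in the map.
function count_codons(cds::AbstractString)
    counts = Dict(codon => 0 for codon in keys(codon_to_aa))
    for i in 1:3:length(cds)-2
        codon = uppercase(cds[i:i+2])
        haskey(counts, codon) && (counts[codon] += 1)
    end
    return counts
end

# Divide each codon count by the total for its amino acid.
function relative_usage(counts)
    totals = Dict{Char,Int}()
    for (codon, n) in counts
        aa = codon_to_aa[codon]
        totals[aa] = get(totals, aa, 0) + n
    end
    return Dict(codon => (totals[codon_to_aa[codon]] == 0 ? 0.0 :
                          n / totals[codon_to_aa[codon]])
                for (codon, n) in counts)
end

# Toy sequence: AAG AAG AAA TTC AAG
freqs = relative_usage(count_codons("AAGAAGAAATTCAAG"))
for codon in sort(collect(keys(freqs)))
    println(codon, " | ", freqs[codon])
end
```

Within each amino acid the frequencies sum to 1, which is the same property each amino acid's entries satisfy in the codon usage table.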