To accelerate the transformation and integration of Cyanobacteria, LiFT used multiple methods to maximize integration efficiency.
BLACKBIRD contributes by using the Stealth program to optimize our gene inserts so that they transform as efficiently as possible.
BLACKBIRD automates the process of altering gene inserts with regard to RMS cut sites, beginning with the processing of the host and target genomes.
BLACKBIRD uses the genomes of the host organism (the organism from which the gene insert is derived) and the target organism to optimize the gene insert.
It calculates a codon usage table for each genome. These tables are used to generate a ranking table between the host and target organisms so that, when the sequence is codon optimized, codons are chosen in accordance with their relative abundance. This preserves the slow-translating regions of proteins, which has been shown to increase protein folding efficiency, as the time taken for a rarer tRNA to bind promotes proper folding [3][4]. The codon usage table is created by an Open Reading Frame finder coded into the program, which generates the usage statistics from the frames it finds. Once the tables are complete, the codon tables of both organisms are compared, matching codons for each amino acid by usage ranking between organisms, and the gene insert can be altered.
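The rank-matching step described above can be sketched roughly as follows. This is a minimal illustration and not BLACKBIRD's actual code: the function name `rank_match`, the table layout `{amino_acid: {codon: frequency}}`, and the toy frequencies are all assumptions made for the example.

```python
def rank_match(host_usage, target_usage):
    """Map each host codon to the target codon with the same usage rank.

    Both tables are assumed to look like {amino_acid: {codon: frequency}}.
    """
    mapping = {}
    for amino_acid, host_codons in host_usage.items():
        target_codons = target_usage[amino_acid]
        # Sort codons from most used to least used in each organism.
        host_ranked = sorted(host_codons, key=host_codons.get, reverse=True)
        target_ranked = sorted(target_codons, key=target_codons.get, reverse=True)
        # Pair codons rank by rank, so a rare host codon maps to a rare target codon.
        for host_codon, target_codon in zip(host_ranked, target_ranked):
            mapping[host_codon] = target_codon
    return mapping


# Toy, made-up frequencies for lysine (K) only.
host = {"K": {"AAA": 0.74, "AAG": 0.26}}
target = {"K": {"AAA": 0.30, "AAG": 0.70}}
mapping = rank_match(host, target)
print(mapping)  # {'AAA': 'AAG', 'AAG': 'AAA'}
```

Matching by rank rather than always choosing the most frequent target codon is what keeps rare codons rare, which is the property the paragraph above relies on.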
Adapting the gene insert is by far the most time-intensive step, specifically because each edit can generate new RMS cut sites. Our implemented solution creates an 'editing window' that takes the preceding and following sequence into consideration to identify whether any change has generated a new cut site.
Using the targeted organism’s genome, BLACKBIRD runs the Stealth program and returns the underrepresented theoretical RMS cut sites that are to be checked against the insert. An editing window is created at every RMS cut site found in the gene insert, and the adapting process begins. The window is centered on the first codon containing the RMS cut site and extends 2 codons before and 2 codons after it. This size was chosen because the K-mers are at least 4 nucleotides long, making a 5-codon window best suited to catching any changes. With each change, BLACKBIRD rechecks the window for newly generated cut sites until none are found. If the RMS cut site cannot be removed from the current codon, the window shifts forward to the next codon and attempts changes.
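As a rough sketch of how such a window could be located, the snippet below computes the nucleotide bounds of a 5-codon window around the first codon that contains a cut site. The names `window_bounds` and `has_site`, the example insert, and the assumption that the insert is in frame from index 0 are all hypothetical and only serve the illustration.

```python
def window_bounds(site_start, n_codons, flank=2):
    """Return (start, end) nucleotide indices of the editing window.

    site_start is the 0-based index where the cut site begins; the insert
    is assumed to be in frame starting at index 0.
    """
    first_codon = site_start // 3                  # codon holding the site's first base
    lo = max(0, first_codon - flank)               # clamp at the start of the insert
    hi = min(n_codons, first_codon + flank + 1)    # clamp at the end of the insert
    return lo * 3, hi * 3


def has_site(window, sites):
    """True if any cut site occurs anywhere in the window."""
    return any(site in window for site in sites)


# Example insert: ATG AAA GGA TCC CTG TTT GGG TAA (8 codons).
insert = "ATGAAAGGATCCCTGTTTGGGTAA"
sites = ["GGATCC"]
start, end = window_bounds(insert.find(sites[0]), len(insert) // 3)
window = insert[start:end]
print(start, end, window)  # 0 15 ATGAAAGGATCCCTG
```

Here the site begins in the third codon, so the window is clamped at the start of the insert and still covers five codons. A site that begins near the end of its first codon and is longer than 7 nucleotides would extend past this window, which a fuller implementation would need to handle.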
A class created to open and read FASTA files. Operates by taking a file name as an argument.
Initiates by saving the name of the file given to it to be read.
Opens the file provided or, if none is provided, uses STDIN.
Creates a generator that reads the FASTA file line by line, yielding the header and sequence of each entry.
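A generator of the kind described above might look like the following sketch. It is written as a standalone function named `read_fasta` for brevity; that name and the exact handling of blank lines are assumptions, not the project's class.

```python
import sys


def read_fasta(file_name=None):
    """Yield (header, sequence) pairs from a FASTA file, or from STDIN if no name is given."""
    handle = open(file_name) if file_name else sys.stdin
    try:
        header, chunks = None, []
        for line in handle:
            line = line.strip()
            if not line:
                continue                      # skip blank lines
            if line.startswith(">"):
                if header is not None:
                    # A new header closes the previous record.
                    yield header, "".join(chunks)
                header, chunks = line[1:], []
            else:
                chunks.append(line)
        if header is not None:
            yield header, "".join(chunks)     # emit the final record
    finally:
        if handle is not sys.stdin:
            handle.close()
```

Because it yields one record at a time, a whole genome file never has to be held in memory as a list of lines; a caller would typically write `for header, sequence in read_fasta("genome.fa"): ...`, where the file name is a placeholder.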
Class that reads the Stealth output and converts its contents into a list, expanding IUPAC nomenclature into the multiple sequences it represents.
Finds IUPAC codes and saves their positions on a string.
Generates all possible combinations of the IUPAC codes found in the input string.
Returns the sequences generated by an IUPAC nucleotide.
Reads a text file with Stealth hits and deciphers the IUPAC conventions.
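The IUPAC expansion described above can be illustrated with a short sketch. The function name `expand_iupac` is hypothetical, and using `itertools.product` is one possible way to generate the combinations rather than necessarily the approach BLACKBIRD takes.

```python
from itertools import product

# Standard IUPAC nucleotide codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def expand_iupac(motif):
    """Return every concrete sequence that an IUPAC motif can represent."""
    choices = [IUPAC[base] for base in motif.upper()]
    # The Cartesian product picks one concrete base per position.
    return ["".join(combo) for combo in product(*choices)]


print(expand_iupac("RGATCY"))
# ['AGATCC', 'AGATCT', 'GGATCC', 'GGATCT']
```

Each ambiguous position multiplies the number of sequences, so a motif with several `N` positions expands quickly (four `N`s alone give 256 sequences).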
A class that creates a sequence with the optimal codon choices. Is recursively called to edit a target site until it exits the site.
Initiates the class by saving the host and target codon usage tables.
Reads the host codon usage table and returns it as a dictionary.
Returns the frequency of a codon in the host organism.
Reads the target codon usage table and returns it as a dictionary.
Returns the frequency of a codon in the target organism.
Checks if the codon is the same in both organisms.
Replaces the codon with the optimal codon choice. Iterates through every possible choice.
Function that alters and confirms the absence of a stealth hit in a given subsequence. Operates the 5 codon window.
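To make the substitute-and-recheck idea concrete, here is a minimal sketch under stated assumptions. It is iterative rather than recursive, the function name `remove_sites` and the `target_freq` layout (`{codon: frequency}`) are hypothetical, the window length is assumed to be a multiple of 3, and the standard genetic code is assumed.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = dict(
    zip((a + b + c for a in BASES for b in BASES for c in BASES), AMINO_ACIDS)
)


def remove_sites(window, sites, target_freq, start_codon=2):
    """Try synonymous swaps until the window holds no site; return the new window or None."""
    codons = [window[i:i + 3] for i in range(0, len(window), 3)]
    # Start at the codon containing the site, then shift forward if it cannot be fixed.
    for pos in range(start_codon, len(codons)):
        amino_acid = CODON_TABLE[codons[pos]]
        synonyms = [c for c in CODON_TABLE
                    if CODON_TABLE[c] == amino_acid and c != codons[pos]]
        # Prefer the synonyms that the target organism uses most often.
        synonyms.sort(key=lambda c: target_freq.get(c, 0.0), reverse=True)
        for candidate in synonyms:
            trial = "".join(codons[:pos] + [candidate] + codons[pos + 1:])
            # Recheck the whole window so a swap cannot introduce a new site.
            if not any(site in trial for site in sites):
                return trial
    return None


# Made-up target frequencies for the glycine codons.
target_freq = {"GGC": 0.4, "GGT": 0.3, "GGG": 0.1}
fixed = remove_sites("ATGAAAGGATCCCTG", ["GGATCC"], target_freq)
print(fixed)  # ATGAAAGGCTCCCTG
```

Only synonymous codons are tried, so the encoded protein is unchanged; a return value of `None` signals that no single swap in the window removed every site.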
Main functions, and their relevant classes, that optimize inserts based on host and target codon biases.
Initiates the class by saving the insert, stealth, host, and target files, and the output file.
Sets a temporary 'start' position for a chosen subsequence. Part of prepping the 5 codon window.
Sets a temporary 'end' position for a chosen subsequence. Part of prepping the 5 codon window.
Builds and returns the subsequences or windows to be optimized.
Formats the final sequence with all the altered sequences from CodonUsage.altSeqMaker().
Finds the stealth hits in the insert sequence and returns them as a list.
Runs the entire optimization process.
Main that runs the Command Line Interface and executes the program.
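The hit-finding step listed above could be sketched as a simple scan of the insert for each Stealth sequence. The function name `find_hits` and the `(position, site)` return format are assumptions for illustration only.

```python
def find_hits(insert, sites):
    """Return sorted (position, site) pairs for every occurrence, overlaps included."""
    hits = []
    for site in sites:
        pos = insert.find(site)
        while pos != -1:
            hits.append((pos, site))
            # Advance by one base so overlapping occurrences are also reported.
            pos = insert.find(site, pos + 1)
    return sorted(hits)


print(find_hits("ATGGATCGATCTAA", ["GATC"]))  # [(3, 'GATC'), (7, 'GATC')]
```

Sorting by position means the hits can be processed from the start of the insert toward its end, one editing window at a time.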
Class that builds the codon usage tables used throughout the program.
Initiates the class by saving the genome file.
Finds the start codon in the sequence.
Finds the end codon in the sequence for an ORF.
Finds the Open Reading Frames in the genome.
Counts the codons in each ORF.
Calculates the percentage usage of each codon.
Creates and formats the dictionary.
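The steps listed above (find start and stop codons, collect ORFs, count codons, compute percentages) can be sketched as follows. This is a simplified illustration under several assumptions: it scans only the forward strand, treats `ATG` as the only start codon, and reports each codon's usage as a percentage of all codons counted. The names `find_orfs` and `codon_usage` are hypothetical.

```python
from collections import Counter

STOPS = {"TAA", "TAG", "TGA"}


def find_orfs(seq, min_codons=2):
    """Return forward-strand ORFs (ATG up to, but excluding, the stop) in all three frames."""
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i                           # first start codon opens an ORF
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:  # ignore very short frames
                    orfs.append(seq[start:i])
                start = None                        # look for the next ORF
    return orfs


def codon_usage(orfs):
    """Return {codon: percentage of all codons counted across the ORFs}."""
    counts = Counter(orf[i:i + 3] for orf in orfs for i in range(0, len(orf), 3))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {codon: 100 * n / total for codon, n in counts.items()}


orfs = find_orfs("CCATGAAAAAAGGGTAACC")
usage = codon_usage(orfs)
print(orfs)   # ['ATGAAAAAAGGG']
print(usage)  # {'ATG': 25.0, 'AAA': 50.0, 'GGG': 25.0}
```

A table normalized per amino acid instead of over all codons would give the same within-amino-acid ranking, so either form could feed a rank-based comparison between organisms.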
[1] V. Nandakumar and A. Mahesh, "BLACKBIRD," GitLab, iGEM UCSC, 2024.
[2] S. Hu, "Altering under-represented DNA sequences elevates bacterial transformation efficiency," mBio, Oct. 31, 2023. https://doi.org/10.1128/mbio.02105-23 (accessed Sep. 24, 2024).
[3] G. Zhang, "Transient ribosomal attenuation coordinates protein synthesis and co-translational folding," Nat Struct Mol Biol, Jul. 13, 2008. https://www.nature.com/articles/nsmb.1554 (accessed Sep. 26, 2024).
[4] G. L. Rosano, "Rare codon content affects the solubility of recombinant proteins in a codon bias-adjusted Escherichia coli strain," Microb Cell Fact, Jul. 24, 2009. https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2723077/ (accessed Sep. 26, 2024).