Prokaryotic (Motif-Based) Promoter (Strength Comparison) Analysis Program
This section is currently under development. We are working on connecting the following GUI to our Biopython backend. The input, upload, and output sections below give an overview of what the tool will look like once the GUI is complete. In the meantime, you can use the code and detailed instructions available on our GitLab page: https://gitlab.igem.org/2024/software-tools/dku
Because the GUI is still under development, running the analysis here will not produce any output yet. To use the tool now, clone our GitLab repository; detailed instructions are available below.
Our GitLab page (https://gitlab.igem.org/2024/software-tools/dku) contains the source code for the program along with detailed instructions for running your own analysis. For your convenience, the steps below describe how to use the program, after cloning our repository, to rank promoters by how suitable they are for expressing your gene of interest in a prokaryotic organism of your choice.
Before installing the program, please ensure you have the following:
Operating System: Windows, macOS, or Linux.
Python Version: Python 3.8 or higher.
Package Manager: pip (usually included with the Python installation).
This program requires the following Python packages:
Biopython: A toolkit for biological computation.
collections and itertools: Standard Python modules used for handling sequences and counters (included with Python, so no separate installation is needed).
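The prerequisites above can be checked with a short script before installing anything. This is a minimal sketch and not part of the tool; the function name `check_environment` is hypothetical, and the only facts it relies on are the Python 3.8 minimum stated above and that Biopython is imported under the package name `Bio`.

```python
import importlib.util
import sys

MIN_PYTHON = (3, 8)  # minimum version stated in the prerequisites above


def check_environment():
    """Return a dict describing whether the prerequisites are met."""
    return {
        "python_ok": sys.version_info >= MIN_PYTHON,
        # Biopython is imported as the top-level package "Bio".
        "biopython_installed": importlib.util.find_spec("Bio") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_environment().items():
        print(f"{name}: {ok}")
```

If `biopython_installed` is reported as False, the installation steps below should resolve it.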
Clone the Repository
First, clone the repository from GitLab to your local machine by running:
git clone https://gitlab.com/dku-2024/ppap.git
cd ppap
Create a Virtual Environment (Recommended)
It is recommended to create a virtual environment to keep the dependencies organized. To create a virtual environment:
python3 -m venv ppap-env
Activate the virtual environment:
On Windows:
ppap-env\Scripts\activate
On macOS or Linux:
source ppap-env/bin/activate
Install Dependencies
After activating the virtual environment, install the required dependencies by running:
pip install -r requirements.txt
or
pip install biopython
Running the Program
The program can be run by executing the main Python script:
python ppap.py
Inputting the data
Next, visit the RegPrecise database at https://regprecise.lbl.gov/. Navigate to Manually Curated > Regulon Collections > Taxonomy, find the relevant taxonomic group, and choose your species (Genome). Scroll to the end of the page and, in the export section, download the Regulatory Sites for your genome of interest in FASTA format as a text file.
The program expects an input data file containing the regulatory sites in FASTA format. Make sure the file exists at the path defined in the script (file_path).
Update the file_path variable in the script to point to your own data file:
file_path = '/path/to/your/data/file.txt'
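Before running the analysis, it can help to confirm that the downloaded file really is readable FASTA. The following is a minimal standard-library sketch for that sanity check only; it is independent of how the program itself parses the file, and the names `parse_fasta` and `read_fasta` as well as the sample records are hypothetical.

```python
def parse_fasta(lines):
    """Parse FASTA-formatted lines into a list of (header, sequence) pairs."""
    records = []
    header, chunks = None, []
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def read_fasta(path):
    """Open the file at `path` and parse it with parse_fasta."""
    with open(path, encoding="utf-8") as handle:
        return parse_fasta(handle)


if __name__ == "__main__":
    # Hypothetical two-record sample, used only to demonstrate the parser.
    sample = [">site_1 example", "TTGACA", "TATAAT", ">site_2 example", "ACGT"]
    for header, seq in parse_fasta(sample):
        print(header, len(seq))
```

Calling `read_fasta(file_path)` on the downloaded export and checking that the result is non-empty is a quick way to catch a wrong path or a file saved in the wrong format.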
Enter the promoters as a dictionary that maps each promoter name to its sequence by replacing the relevant part of the Python script. EXAMPLE:
sequences = {
'nisA': 'CTAGTCTTATAACTATACTGACAATAGAAACATTAACAAATCTAAAACAGTCTTAATTCTATCTTGAGAAAGTATTGGTAATAATATTATTGTCGATAACGCGAGCATAATAAACGGCTCTGATTAAATTCTGAAGTTTGTTAGATACAATGATTTCGTTCGAAGGAACTACAAAATAAATTAT',
'promoter2': 'sequence'
}
Replace nisA with your own promoter name and the sequence with that promoter's sequence.
Example input in the Python script:
For Promoter Sequences:
sequences = {
'nisA': 'CTAGTCTTATAACTATACTGACAATAGAAACATTAACAAATCTAAAACAGTCTTAATTCTATCTTGAGAAAGTATTGGTAATAATATTATTGTCGATAACGCGAGCATAATAAACGGCTCTGATTAAATTCTGAAGTTTGTTAGATACAATGATTTCGTTCGAAGGAACTACAAAATAAATTAT',
'nisA_op': 'CTGGTTCTGTAACTGTACTAACAGTAAAAGCACTAACAGATCTAAAACAGCCTGAACAGCATCCTGCGGAAGTACTGGTAATAATACTACTGCCGGTAACGGGAACACAACAAGCGGTTGTAACTGAACAGCGAAGTCTGCTAAATCCAGTAATTTCGGAGCAAGGAACTGCAGAACAAGCTG',
'nisF':'ATTTAGTAATCTCTAAGGATTACTTTTTTTGTTTCTGAATAGATTCTGAAAATTGTTTTATATACTTTTTTTAAACATAAAATAAAGTGAGGAAATATA',
'nisF_op':'AATCTTGTTATTAGCAAGGACTATTTCTTCTGCTTTTAGATCGACAGTGAGAACTGCTTCATCTATTTTTTTTGAACTTAAAACAAGGTTAGAAAGTAT',
'nisR':'AGATTATATTTCTTCAGAATGAATGGTATAATGAAGTAATGAGTACTAAACAATCGGAGGTAAAGTG',
'nisR_op':'AGATTATATTTTTTCAGAATGAATGGAATAATGAAGTGATAGGTGTTGAACAATCGAAGATAGAGT',
'P3':'GCACTGATTCTTTTTTCAGTGCTTTTTTTTATAAAAATTGGACAAAATAGACATCTTGTCTAAAAACTATTTCAGATTCTTGCAATCATTTTCTTCTTGTACTATGATGAAATTAGTAAGAGATTGGAGGAGAATTAGAAGTAAAGAGACTAAAAAATTTAGTCAA',
'P5':'GAAAAAGAAAATGTTTTTGTATTTTTAGAATCCCTTTTCTATAAATCAATTCTAATTATAAGGACCTGATGATTGAGTGATAATGCTAGTTTGAAGCATTGTTAGTAAGAAAGTGATTTTTTATAAATGGTTTATAGAATAAATTGTACAGCGTTTAATTGGACTTGCTCTCTGAAATAACTAAAATTGTAGTGAGGACGACGGTTACA',
'P8':'GATAAAATTTCTAATGATTTTTTAGGACAATTATTTCTCATAAAAAGCAGATTTTAGAAAGAAAATTGTATTTTTTTAACAGCTTTGACTGCCCTTTTTGGAAGAGTTTATGTATAATAGAATTAGTTAGTTTTGCTATTGATATAGCAGCAGAAATGGAGAGATATA',
'P11':'AGATCTAGCGCTATAGTTGTTGACAGAATGGACATACTATGATATATTGTTGCTATAGCG',
'P48':'AGATCTGCATCGTAAGTTGTTGACATGGAACGAGGAATGTGATAATCTGTGAGTATAGCG',
'PTCIIC-celA':'aacttatatgacaattttggtacaggagtcttcaaaagtggcacagaaccaaagtgatggaaaaataagaaactgcTTGCTTtacttgcctattaatgcTATAATgaaaatgtagaaaagatggacgtgaaaccagttcatcaaaaaaagtaaaggagactgttcaacc',
'P32':'agattaatagttttagctattaatctttttttatttttatttaagaatggcttaataaagcggttactttggatttttgtgagcttggactagaaaaaaacttcacaaaatgctatactaggtaggtaaaaaaatattcggaggaattttgaa',
}
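A dictionary like the one above is easy to mistype, so a quick check for unexpected characters can save a failed run. The sketch below is an illustration rather than part of the program: `find_invalid_bases` is a hypothetical helper, and it assumes that only A, C, G, and T are valid and that mixed case is acceptable, since the example input includes lowercase sequences.

```python
VALID_BASES = set("ACGT")  # assumption: no ambiguity codes such as N are allowed


def find_invalid_bases(sequences):
    """Map each promoter name to the set of non-ACGT characters it contains."""
    problems = {}
    for name, seq in sequences.items():
        # Upper-case first so that lowercase entries are treated the same way.
        invalid = set(seq.upper()) - VALID_BASES
        if invalid:
            problems[name] = invalid
    return problems


if __name__ == "__main__":
    sequences = {
        # P11 is copied from the example input above.
        'P11': 'AGATCTAGCGCTATAGTTGTTGACAGAATGGACATACTATGATATATTGTTGCTATAGCG',
        # An unreplaced placeholder, as in the template, is flagged.
        'promoter2': 'sequence',
    }
    print(find_invalid_bases(sequences))
```

An empty result means every sequence passed; any name that appears in the result should be corrected before running the analysis.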
For Regulatory Sites:
file_path = '/Users/shabanmuhammad/Desktop/iGEM/Promoter_Strength_Analysis/ExportServlet.txt'
Example Output:
Promoters ranked by custom score:
nisA_op: 13.28
P48: 12.50
P11: 11.00
nisF_op: 9.39
nisR: 8.96
nisR_op: 8.94
P5: 8.77
P3: 8.73
nisA: 8.32
P8: 8.28
nisF: 6.36
PTCIIC-celA: 0.36
P32: 0.00
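To reuse the ranking in a downstream script, the printed output can be parsed back into name and score pairs. This is a minimal sketch that assumes the output text has been captured as a string and follows the "name: score" layout shown above; `parse_ranking` is a hypothetical helper, not a function provided by the program.

```python
def parse_ranking(text):
    """Parse 'name: score' lines into a list of (name, score) tuples."""
    ranking = []
    for line in text.splitlines():
        # Split on the last colon so the score is always the final field.
        name, sep, value = line.rpartition(":")
        if not sep:
            continue  # no colon on this line
        try:
            ranking.append((name.strip(), float(value)))
        except ValueError:
            continue  # e.g. the header line, which ends with ':'
    return ranking


if __name__ == "__main__":
    # First lines of the example output above.
    captured = """Promoters ranked by custom score:
nisA_op: 13.28
P48: 12.50
P11: 11.00"""
    for name, score in parse_ranking(captured):
        print(name, score)
```

The order of the returned list matches the order of the printed lines, so the first element is the top-ranked promoter.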