Software | UTokyo - iGEM 2024

Abstract

Developing and releasing software to make the technologies created in iGEM projects more user-friendly is highly beneficial for promoting future research that involves the utilization, expansion, and improvement of those technologies. Additionally, it can inspire future iGEM projects. UTokyo developed software based on the code used in Dry Lab modeling, which incorporates functions for homologous sequence search and automated sequence design. This software is available on GitLab, and it is expected to facilitate the broader application of the POIROT system to a wider range of targets.

Introduction

In this project, Dry Lab worked on homology search and optimization of sequence design. These efforts aimed to improve the reliability of the quantification system targeting tear fluid miRNA, a biomarker for glaucoma. However, POIROT is a versatile system that can quantify miRNA from various samples and is not limited to glaucoma. Homology search and sequence design are also essential when designing systems that target different miRNAs or samples other than tear fluid. By making the programs used by the Dry Lab available as software, future iGEM participants and researchers will be able to use POIROT more easily. To this end, we developed a software tool named "miRNA Detective," which implements both homology search and automated sequence design.

Description

In Dry Lab, homology search and sequence design were primarily conducted using Python, due to the ease of setting up the runtime environment and sharing code. By directly using the Python code developed during the modeling phase when implementing these algorithms into software, the development period can be shortened and the likelihood of bugs reduced. Therefore, miRNA Detective was developed using Flask, a Python-based web application framework. By making it a web application, miRNA Detective can be displayed in a standard web browser, and the environment setup can be completed simply by installing Python and a few libraries. Additionally, the Python code used for data processing is modularized by function, allowing for easy maintenance and expansion.

In miRNA Detective, Bootstrap was used as the frontend framework. This enabled the creation of a clean and visually appealing design with minimal code, while also supporting responsive design that adapts to different screen sizes. The separation between frontend information display and input, and backend processing using Python, was achieved using Ajax. This allows for dynamic updates of the on-screen elements after user input is processed by the Python program, displaying the results seamlessly without page transitions.
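The backend half of such an Ajax exchange can be sketched without any framework: the browser sends a small JSON body, a Python function processes it, and a JSON body comes back for the page to render in place. The sketch below is only an illustration of that contract, not code from miRNA Detective. The function name handle_search_request and the "sequence" field are hypothetical, and in the actual application a Flask route would sit around logic of this kind.

```python
import json


def handle_search_request(body: str) -> tuple[int, str]:
    """Turn a JSON request body into a (status code, JSON response body) pair.

    Hypothetical helper: the field name "sequence" is an assumption made
    for this sketch, not the application's real request format.
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        return 400, json.dumps({"error": "request body is not valid JSON"})
    if not isinstance(payload, dict):
        return 400, json.dumps({"error": "request body must be a JSON object"})

    sequence = str(payload.get("sequence", "")).strip().upper()
    if not sequence:
        return 400, json.dumps({"error": "missing 'sequence' field"})

    # Placeholder for the real data-processing module (search or design).
    result = {"sequence": sequence, "length": len(sequence)}
    return 200, json.dumps(result)
```

Because the function only maps one JSON document to another, the frontend can update the result area from the response without reloading the page, and the processing logic can be tested without a running server.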

How to Use It

Install

The source code for miRNA Detective is available on GitLab, and to use the software, users must follow the steps below to run it on their own computers.

First, Python needs to be installed on your computer. If it is not installed, refer to Python.org and obtain version 3.12 or later.
Open PowerShell on Windows, or Terminal on Linux or macOS, and use the cd command to navigate to the directory where you want to place the program.

Run the following command to copy the repository to your computer:

```
git clone https://gitlab.igem.org/2024/software-tools/utokyo.git software_utokyo2024
```

Navigate to the software_utokyo2024 directory where the program is stored using the cd command:
```
cd software_utokyo2024
```
After updating pip to the latest version, install the required Python libraries to run the software:
```
pip install --upgrade pip
pip install -r dependencies.txt
```
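Run main.py with the following command. The py launcher is installed with Python on Windows; on Linux or macOS, use python3 main.py instead: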
```
py main.py
```
Open http://127.0.0.1:8080 in your browser.

If the startup menu described in the next section appears, the installation was successful.

To stop the software, press Ctrl+C in PowerShell or Terminal. This ends the server process and releases the port.

Startup Menu

When miRNA Detective starts, the startup menu opens and displays two icons: "Similar Sequence Search" and "Sequence Design." Clicking an icon opens the screen for that function. From either screen, you can return to the startup menu by clicking the POIROT logo in the header.

*Figure 1. Startup menu of miRNA Detective.* Clicking on the icons will take you to each function.

Similar Sequence Search

In "Similar Sequence Search," you can search for homologous sequences in various samples. Fill in the four fields ("Target miRNA sequence," "Sample Type," "Minimum Subsequence Length," and "Similarity Score") and press the "Search" button. The miRNA homology search then begins, and the screen shows the input information and a loading spinner. Once the search is complete, the loading spinner disappears and the results are displayed in descending order of similarity. For each similar miRNA found, the results list its name, its nucleotide sequence, its similarity score, and the matching subsequences with their nucleotide positions in both the target and the similar sequence. For more details on how the sequence search program works and the role of each input field, refer to MiRNA Selection in the Model section.

In the "Target miRNA sequence" field, enter the sequence of the miRNA for which you want to find homologous sequences. This field is designed for miRNA sequences of approximately 20 to 25 nucleotides consisting of A, U, G, and C. If a longer nucleotide sequence or a DNA sequence is entered, the search may not function correctly.
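A check along the following lines could catch such inputs before a search is started. It is a minimal sketch, not the validation used by miRNA Detective: the function name check_target_sequence is hypothetical, and the 20 to 25 nucleotide bounds are taken from the approximate range stated above and treated here as hard limits for illustration.

```python
import re

RNA_PATTERN = re.compile(r"[AUGC]+")
DNA_PATTERN = re.compile(r"[ATGC]+")


def check_target_sequence(raw: str, min_len: int = 20, max_len: int = 25) -> list[str]:
    """Return a list of problems found in the input; an empty list means it looks fine.

    Hypothetical helper; the length bounds are assumptions based on the
    "approximately 20 to 25 nucleotides" guidance.
    """
    seq = raw.strip().upper()
    if not seq:
        return ["sequence is empty"]

    problems = []
    if "T" in seq and DNA_PATTERN.fullmatch(seq):
        # Only A, T, G, C present: most likely a DNA sequence.
        problems.append("sequence looks like DNA (contains T instead of U)")
    elif not RNA_PATTERN.fullmatch(seq):
        invalid = sorted(set(seq) - set("AUGC"))
        problems.append(f"invalid characters: {', '.join(invalid)}")

    if not min_len <= len(seq) <= max_len:
        problems.append(
            f"length {len(seq)} is outside the expected {min_len}-{max_len} nt range"
        )
    return problems


# A 22-nt RNA sequence passes; its DNA spelling is flagged.
print(check_target_sequence("UAGCUUAUCAGACUGAUGUUGA"))
print(check_target_sequence("TAGCTTATCAGACTGATGTTGA"))
```

Returning a list of messages rather than raising an exception lets a web frontend show every problem to the user at once.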

In the "Sample Type" field, you specify the database to be searched for homologous sequences. If "All" is selected, the search targets a database of human miRNAs extracted from miRBase Release 22.1 ¹, which contains a large number of reported miRNA sequences. If "Aqueous Humor," "Blood Plasma," or "Leukocyte" is selected, the search targets a database built from a paper that comprehensively profiled the miRNAs present in that sample. Each of these databases is constructed by extracting from miRBase the human miRNAs, together with their sequences, whose names match those in the paper's data. Details of each database are as follows:

Aqueous Humor: Of the 1,623 miRNAs reported to be present in the aqueous humor in the paper ², this database contains the 1,493 that are listed in miRBase under the same names.
Blood Plasma: Of the 2,576 miRNAs reported to be present in blood plasma in the paper ³, this database contains the 2,523 that are listed in miRBase under the same names.
Leukocyte: Of the 2,550 miRNAs reported to be present in leukocytes in the paper ⁴, this database contains the 2,510 that are listed in miRBase under the same names.
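The name-matching step used to build these sample-specific databases can be sketched as follows. This is an illustration under stated assumptions rather than the team's actual script: the function names parse_fasta and build_sample_database are hypothetical, human entries are assumed to be recognizable by the hsa- name prefix, and the FASTA text is a toy example with simplified headers.

```python
def parse_fasta(text: str) -> dict[str, str]:
    """Parse FASTA text into {name: sequence}; the name is the first word of each header."""
    records: dict[str, str] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            name = header[0] if header else None
            if name:
                records[name] = ""
        elif name:
            records[name] += line.upper()
    return records


def build_sample_database(mirbase: dict[str, str], reported_names: set[str]) -> dict[str, str]:
    """Keep human miRNAs (assumed 'hsa-' prefix) whose names appear in the paper's list."""
    return {
        name: seq
        for name, seq in mirbase.items()
        if name.startswith("hsa-") and name in reported_names
    }


# Toy data for illustration only; headers are simplified.
TOY_FASTA = """\
>hsa-let-7a-5p Homo sapiens let-7a-5p
UGAGGUAGUAGGUUGUAUAGUU
>hsa-miR-21-5p Homo sapiens miR-21-5p
UAGCUUAUCAGACUGAUGUUGA
>mmu-miR-21a-5p Mus musculus miR-21a-5p
UAGCUUAUCAGACUGAUGUUGA
"""

# Names as they might appear in a profiling paper; one has no match in the FASTA.
reported = {"hsa-miR-21-5p", "hsa-miR-0000-5p"}
sample_db = build_sample_database(parse_fasta(TOY_FASTA), reported)
print(sample_db)
```

Names in the paper's list that have no entry of the same name in the FASTA data are simply dropped, which is why each sample database is smaller than the number of miRNAs reported in its source paper.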

The "Minimum Subsequence Length" field specifies the minimum length of the subsequence used for similarity evaluation and can be set to a value from 1 to 30. If the specified length exceeds the length of the input target sequence, the similarity search will not run correctly.

In the "Similarity Score" field, you use radio buttons to choose how sequence similarity is evaluated: by the percentage of sequence matches or by the number of mismatches. You then use a slider to set the threshold for the selected method, either a minimum percentage of matches or a maximum number of mismatches. Only sequences whose similarity values meet this threshold are displayed as similar sequences.
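How the three search settings could interact is sketched below. The actual algorithm is described in the Model section; this sketch swaps in a deliberately simple technique, an ungapped sliding comparison that scores the whole overlap at each alignment offset, so it should be read as one possible interpretation. All names (Hit, ungapped_hits, search, mode, threshold) are hypothetical, and the database is toy data.

```python
from collections.abc import Iterator
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    target_start: int     # 1-based position in the target
    candidate_start: int  # 1-based position in the candidate
    length: int           # length of the compared subsequence
    mismatches: int

    @property
    def identity(self) -> float:
        """Percentage of matching positions in the compared subsequence."""
        return 100.0 * (self.length - self.mismatches) / self.length


def ungapped_hits(target: str, name: str, candidate: str, min_len: int) -> Iterator[Hit]:
    """Yield one Hit per alignment offset whose overlap is at least min_len nt."""
    for offset in range(-(len(target) - min_len), len(candidate) - min_len + 1):
        t_start, c_start = max(0, -offset), max(0, offset)
        length = min(len(target) - t_start, len(candidate) - c_start)
        if length < min_len:
            continue
        pairs = zip(target[t_start:t_start + length], candidate[c_start:c_start + length])
        mismatches = sum(a != b for a, b in pairs)
        yield Hit(name, t_start + 1, c_start + 1, length, mismatches)


def search(target: str, database: dict[str, str], min_len: int,
           mode: str = "identity", threshold: float = 80.0) -> list[Hit]:
    """Return the best hit per candidate that passes the threshold, most similar first.

    mode="identity": threshold is a minimum percentage of matches.
    mode="mismatches": threshold is a maximum number of mismatches.
    """
    if mode not in ("identity", "mismatches"):
        raise ValueError("mode must be 'identity' or 'mismatches'")
    if not 1 <= min_len <= len(target):
        raise ValueError("min_len must be between 1 and the target length")

    results = []
    for name, candidate in database.items():
        hits = list(ungapped_hits(target, name, candidate, min_len))
        if not hits:
            continue
        if mode == "identity":
            best = max(hits, key=lambda h: (h.identity, h.length))
            keep = best.identity >= threshold
        else:
            best = min(hits, key=lambda h: (h.mismatches, -h.length))
            keep = best.mismatches <= threshold
        if keep:
            results.append(best)

    if mode == "identity":
        results.sort(key=lambda h: h.identity, reverse=True)
    else:
        results.sort(key=lambda h: h.mismatches)
    return results


# Toy database for illustration only.
TARGET = "UAGCUUAUCAGACUGAUGUUGA"
TOY_DB = {
    "identical": "UAGCUUAUCAGACUGAUGUUGA",
    "one_mismatch": "UAGCUUAUCAGACUGAUGUUGG",
    "unrelated": "CCCCCCCCCCCCCCCCCCCCCC",
}
for hit in search(TARGET, TOY_DB, min_len=15, mode="identity", threshold=90.0):
    print(hit.name, hit.target_start, hit.candidate_start, hit.length, hit.mismatches)
```

Rejecting a minimum length longer than the target up front mirrors the restriction noted above, and selecting the best hit by the same measure the user filters on keeps the two similarity modes consistent.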

Sequence Design

In "Sequence Design," users can automatically design the nucleic acid sequences for signal amplification that are incorporated into the miRNA quantification system. When you enter the nucleotide sequence of the target miRNA in the "Target miRNA sequence" field and click the "Design" button, the program outputs the helper and template sequences for TWJ-SDA. Like the similar sequence search, the sequence design program expects miRNA sequences of approximately 20 to 25 nucleotides. It may therefore return an error for extremely short or long sequences, or for sequences that are not RNA. For more details on the sequence design program, refer to Sequence Design in the Model section.

Conclusion & Future Prospect

The Dry Lab has developed "miRNA Detective," software designed specifically for POIROT. It incorporates functions for similar miRNA search and automated sequence design, and it is expected to facilitate the application of POIROT to various diseases and specimens.

In the similar sequence search, three types of specimens can be specified: aqueous humor, blood plasma, and leukocytes. However, miRNAs are expressed in various parts of the human body, and the specimens that may contain miRNAs usable as disease biomarkers are not limited to those currently implemented in the software. The similar sequence search program can be extended to other specimens simply by preparing a list of miRNA sequences in FASTA format. It is therefore expected that additional specimens will be incorporated in the future, that the existing data will be updated, and that users will be able to add custom databases tailored to their specific needs.

Currently, the code is publicly available only on GitLab, and users need to download it and run it locally on their own computers. This requires some effort, such as working from the command line, and it may be difficult depending on the individual local environment. However, because the software separates the frontend from the backend and uses Ajax for communication, it could also be hosted on a network server so that users can access it online. If this is realized, a wider variety of users will be able to use the software more conveniently, which should further expand POIROT's applicability.

References

1. Faculty of Biology, Medicine and Health, The University of Manchester. miRBase. https://www.mirbase.org
2. Tanaka, Y., Tsuda, S., Kunikata, H. et al. (2014). Profiles of extracellular miRNAs in the aqueous humor of glaucoma patients assessed with a microarray system. Sci Rep, 4, 5089. https://doi.org/10.1038/srep05089
3. Suzuki, K., Yamaguchi, T., Kohda, M. et al. (2022). Establishment of preanalytical conditions for microRNA profile analysis of clinical plasma samples. PLoS One, 17(12), e0278927. https://doi.org/10.1371/journal.pone.0278927
4. Kang, K., Shen, Y., Zhang, Q. et al. (2022). MicroRNA Expression in Circulating Leukocytes and Bioinformatic Analysis of Patients With Moyamoya Disease. Frontiers in Genetics, 13, 816919. https://doi.org/10.3389/fgene.2022.816919