The primary objective is to identify relationships between diseases and specific proteins and to develop a receptor capable of binding to these proteins. Initially, this involves collecting and analyzing existing associations between diseases and proteins to construct a preliminary disease-protein interaction network. Subsequently, the scope of diseases of interest and the types of proteins to be explored are defined, with particular emphasis on establishing the desired characteristics of the target receptor, such as binding affinity, specificity, and stability. At this stage, a comprehensive model architecture is devised, and appropriate feature engineering techniques are selected to establish a data-driven analytical framework.
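As a rough illustration of what a preliminary disease-protein interaction network could look like in code, the sketch below stores the associations as a bipartite mapping. The function name `build_network` and the entries `DISEASE_X` and `PROTEIN_Y` are hypothetical placeholders; only the Acute Myeloid Leukemia and CD97 pairing comes from this project.

```python
from collections import defaultdict

def build_network(associations):
    """Build a bipartite disease-protein network from (disease, protein) pairs."""
    disease_to_proteins = defaultdict(set)
    protein_to_diseases = defaultdict(set)
    for disease, protein in associations:
        disease_to_proteins[disease].add(protein)
        protein_to_diseases[protein].add(disease)
    return disease_to_proteins, protein_to_diseases

# Illustrative pairs: DISEASE_X and PROTEIN_Y are placeholders, not real findings.
pairs = [
    ("acute myeloid leukemia", "CD97"),
    ("acute myeloid leukemia", "PROTEIN_Y"),
    ("DISEASE_X", "CD97"),
]
by_disease, by_protein = build_network(pairs)
print(sorted(by_disease["acute myeloid leukemia"]))
print(sorted(by_protein["CD97"]))
```

Keeping both directions of the mapping makes it cheap to ask either "which proteins are linked to this disease" or "which diseases share this protein".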
Build
During the build phase, network biology approaches and natural language processing (NLP) are employed to perform text mining on articles from prestigious journals such as Nature and Cell to uncover potential relationships between diseases and proteins. A data processing pipeline is developed to extract disease-protein associations using NLP techniques. Concurrently, pretrained machine learning models are utilized to generate initial protein sequences, which serve as candidate receptors for binding to the target proteins.
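One simple way to approximate the association-extraction step is sentence-level co-occurrence counting. The sketch below is a minimal stand-in, not the pipeline actually used: the vocabularies `DISEASES` and `PROTEINS`, the function `cooccurrences`, and the sample text are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical vocabularies; a real pipeline would load curated term lists.
DISEASES = {"acute myeloid leukemia"}
PROTEINS = {"CD97", "CD55"}

def cooccurrences(text, diseases=DISEASES, proteins=PROTEINS):
    """Count (disease, protein) pairs that are mentioned in the same sentence."""
    counts = Counter()
    # Naive sentence split: break after '.', '!' or '?' followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        lowered = sentence.lower()
        found_diseases = [d for d in diseases if d.lower() in lowered]
        # Protein symbols are matched as whole words, case-sensitively.
        found_proteins = [
            p for p in proteins
            if re.search(r"\b" + re.escape(p) + r"\b", sentence)
        ]
        for d in found_diseases:
            for p in found_proteins:
                counts[(d, p)] += 1
    return counts

# Made-up sample text for illustration, not a quotation from any article.
sample = ("CD97 is highly expressed in acute myeloid leukemia. "
          "CD55 is a ligand of CD97.")
print(cooccurrences(sample))
```

Only the first sentence mentions both a disease and a protein, so only that pair is counted. Co-occurrence is a weak signal on its own, which is why the associations are validated again in the test phase.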
Test
The discovered relationships between diseases and proteins are rigorously validated to ensure that the identified proteins are credible disease biomarkers. This verification process involves secondary literature reviews to confirm the association between the identified proteins and the diseases in question. In terms of receptor validation, molecular docking simulations are conducted to assess the binding affinity of the generated receptor sequences with the target proteins. Additionally, sequence alignment is performed to evaluate the similarity of the generated sequences with known functional receptors, ensuring structural and functional compatibility with the target proteins.
Learn
The outcomes from the testing phase are analyzed to identify and summarize the strengths and weaknesses of the model. If inaccuracies are detected in the disease-protein associations, the text mining algorithms are adjusted or the literature dataset is expanded. If the generated receptor performs poorly in binding assays, the receptor generation model is refined or the sequence is optimized. Insights gained during this phase also guide the refinement of the disease database so that it contains only disease names and excludes symptoms, which keeps the data clean and the analysis accurate.
Cycle 2 - Pretest
Design
From the 2015 NCTU Formosa team, we learned that Lpp-OmpA can be used to display a desired protein on the cell surface.[1]
To verify this system, we designed a biobrick comprising Lpp-OmpA-GS Linker (we will refer to it as the “BELO system” from here on) fused to GFP-His. GFP was included to confirm surface expression via fluorescence, and the His tag was for later use with Anti-His antibodies in ELISA.
The gene encoding Protein G was fused downstream of the BELO system to enable its surface expression on the bacterial outer membrane. Because Protein G has an affinity for antibodies, its function was verified by ELISA: antibodies were added, and the resulting changes in color and absorbance were observed.
Build
We cloned the insert, BELO-GFP, into the pSB1C3 vector and then transformed it into E. coli BL21 C41 cells. We chose pSB1C3 because it is a well-known high-copy-number plasmid carrying a chloramphenicol resistance gene, which allows us to select for cells containing our insert.[2] E. coli expressing BELO-Protein G was obtained from the laboratory of the team’s primary PI and used in the ELISA experiments.
Test
ELISA
To test whether our devices can capture antibodies or related substances, we used two types of enzyme-linked immunosorbent assay (ELISA). The first was a direct ELISA using Protein G, in which different concentrations of HRP-conjugated Goat Anti-Rabbit antibody were added and allowed to react with 3,3',5,5'-tetramethylbenzidine (TMB). The second was a sandwich ELISA with a slight modification: since the His-tag was expected to be displayed on the cell surface, we coated the plate with E. coli instead of capture antibodies. The secondary antibody was HRP-conjugated Anti-His, and TMB was again used as the chromogenic substrate.
Figure 3 shows that the OD630 values increased proportionally with the concentration of Goat Anti-Rabbit.
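A proportional relationship like this can be checked with an ordinary least-squares fit of OD630 against antibody concentration. The sketch below shows the calculation only. The function `linear_fit` is our own helper, and the numbers are placeholder values in arbitrary units, not the measurements behind Figure 3.

```python
def linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept; also returns Pearson's r."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r

# Placeholder values for illustration only; NOT measured data.
conc = [1.0, 2.0, 3.0, 4.0]      # antibody concentration, arbitrary units
od630 = [0.2, 0.4, 0.6, 0.8]     # absorbance at 630 nm

slope, intercept, r = linear_fit(conc, od630)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, r={r:.3f}")
```

A positive slope with r close to 1 would support a linear dose-response over the tested range. Real ELISA curves often saturate at high concentrations, in which case a linear fit should be restricted to the lower part of the range.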
After confirming the temperature effects from the literature[3], we conducted the test at 16°C for 18 hours. We evaluated the expression of BELO-GFP with a 6X His tag on the outer membrane by using a sandwich ELISA to compare E. coli carrying pSB1C3 alone (containing GFP) with E. coli carrying BELO-GFP-His.
We hypothesized that successful expression and functional integration of BELO-GFP with 6X His on the outer membrane would give ELISA signals that correlate positively with the Anti-His antibody concentration and are stronger than those of the control bacteria containing only the vector.
Figure 4 illustrates the significant differences in ELISA results at OD630 between E. coli strains with and without the BELO system. At each concentration of Anti-His antibody, E. coli expressing the BELO system exhibited higher OD630 values compared to the control strain. This indicates that the BELO system effectively facilitates the expression of the target protein on the outer membrane, enabling specific molecular binding.
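The comparison described above amounts to checking, at each Anti-His concentration, how far the BELO signal rises over the vector-only control. A minimal sketch of that calculation follows. The helper `signal_over_control` and all numbers are hypothetical and do not reproduce the data in Figure 4.

```python
def signal_over_control(sample_od, control_od):
    """Return a (difference, ratio) tuple of sample vs control OD630 per concentration."""
    if len(sample_od) != len(control_od):
        raise ValueError("sample and control must cover the same concentrations")
    return [(s - c, s / c) for s, c in zip(sample_od, control_od)]

# Placeholder OD630 readings for illustration only; NOT measured data.
belo_od = [0.30, 0.60, 0.90]      # E. coli carrying BELO-GFP-His
control_od = [0.10, 0.20, 0.30]   # E. coli carrying the vector only

for diff, ratio in signal_over_control(belo_od, control_od):
    print(f"difference={diff:.2f}, ratio={ratio:.1f}")
```

Reporting both the difference and the ratio helps separate a genuine binding signal from a uniformly raised background.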
Learn
In this round of experiments, we confirmed that the BELO system successfully anchored proteins, such as protein G, to the outer membrane, with only minor limitations regarding protein solubility.[4]
Cycle 3 - Predicted Protein
Design
After verifying that the BELO system can display proteins on the cell surface, we moved on to the adhesion G protein-coupled receptor CD97[5][6], which our model selected for its potential in rapid detection. We designed a biobrick to express its ligand, complement decay-accelerating factor CD55[7], with the aim of applying it to the detection of Acute Myeloid Leukemia. This construct also serves as our positive control to ensure the reliability of our detection system in this round of experiments.
Additionally, we designed a biobrick for CD97-His to be used in the modified sandwich ELISA. In this setup, CD97-His will bind to CD55 expressed on E. coli via its CD97 domain, while Anti-His conjugated with HRP will bind to the His-tag for detection.
However, not all biomarkers have receptors that can bind to them. To address this, we used PeptiMap to predict short peptide sequences that could potentially bind to CD97, followed by local alignment with the known CD55 receptor. We then utilized iGEMDOCK to simulate the binding positions and modes between CD97 and the peptides. Through this process, we identified two peptide sequences, 2BOU-2 and 2BOU-3. These were designed to be fused with Lpp-OmpA using a GS linker, resulting in BELO-2BOU-2 and BELO-2BOU-3. The constructs were cloned into the pSB1C3 plasmid and expressed on the outer membrane of E. coli. An ELISA assay was then performed to validate the binding functionality of these constructs with CD97-His.
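The local alignment step can be illustrated with a compact Smith-Waterman implementation. This is a sketch under simplifying assumptions: it uses a flat match/mismatch/gap scoring scheme rather than a substitution matrix such as BLOSUM62, and the two input strings are arbitrary toy sequences, not the real 2BOU peptides or CD55.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best local score, aligned part of a, aligned part of b)."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    best, best_pos = 0, (0, 0)

    # Fill the scoring matrix; cells never drop below zero (local alignment).
    for i in range(1, rows):
        for j in range(1, cols):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(
                0,
                score[i - 1][j - 1] + sub,  # align a[i-1] with b[j-1]
                score[i - 1][j] + gap,      # gap in b
                score[i][j - 1] + gap,      # gap in a
            )
            if score[i][j] > best:
                best, best_pos = score[i][j], (i, j)

    # Trace back from the best cell until a zero cell is reached.
    i, j = best_pos
    out_a, out_b = [], []
    while i > 0 and j > 0 and score[i][j] > 0:
        sub = match if a[i - 1] == b[j - 1] else mismatch
        if score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return best, "".join(reversed(out_a)), "".join(reversed(out_b))

# Toy sequences for illustration only.
print(smith_waterman("ACDEFGHIK", "WWDEFGWW"))
```

For the toy input, the best local alignment is the shared stretch `DEFG`, with a score of 8 under these parameters. For real peptide work, an established alignment tool with a proper substitution matrix would be the better choice.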
Build
We cloned four inserts, BELO-CD55, CD97-His, BELO-2BOU-2, and BELO-2BOU-3 into the pSB1C3 plasmid and transformed them into E. coli BL21 C41.
Test
The goal of our ELISA experiment is to assess the receptor's ability to bind to the target protein. After coating the target receptor and performing blocking, different concentrations of CD97-His are added. This is followed by the addition of an enzyme-labeled secondary antibody, Anti-His-HRP, and the enzyme substrate TMB for color development. Finally, OD630 is measured to verify the binding capacity of the target receptor to CD97-His.
We performed a sandwich ELISA, at 16°C for 18 hours, to test whether CD55, 2BOU-2, and 2BOU-3 were successfully anchored to the cell membrane.
The bar chart illustrates that the positive control exhibits the highest OD630 value, while the negative control shows the lowest OD630 value. This indicates that both 2BOU-2 and 2BOU-3 effectively bind to CD97-His.
Comparing the ELISA results of 2BOU-2 and 2BOU-3 suggests that 2BOU-3 binds CD97-His with higher affinity. Consequently, we chose BELO-2BOU-3 to test the binding capacity at different CD97-His dilution ratios.
Figure 13 illustrates a positive correlation between the concentration of HRP-conjugated anti-His antibody and OD630. This indicates that the BELO system can display the predicted peptide 2BOU-3 in a functional form and that the resulting signal is concentration-dependent, supporting its potential as a biosensing tool for this biomarker.
Learn
In this cycle, we verified that our devices effectively capture CD97-His, strengthening our confidence in their potential as a rapid detection method for Acute Myeloid Leukemia when applied on the chip. This may provide an additional assessment tool to support diagnosis.
"Production and fluorescence-activated cell sorting of Escherichia coli expressing a functional antibody fragment on the external surface". https://doi.org/10.1073/pnas.90.22.10444
Stacey M, Yona S (2011). Adhesion-GPCRs: Structure to Function (Advances in Experimental Medicine and Biology). Berlin: Springer. ISBN 978-1-4419-7912-4.
Langenhan T, Aust G, Hamann J (May 2013). "Sticky signaling--adhesion class G protein-coupled receptors take the stage". Science Signaling. 6 (276): re3. doi:10.1126/scisignal.2003825. PMID 23695165. S2CID 6958640.