Goal

Our project focuses on finding the best promoter for each molecule we were detecting, to create a starting point for future biosensors and give scientists a consistent and accurate way to measure the results of their research. Of the combinations we studied, three promoters (gadB, ybcK, and rof) responded with high levels of fluorescence when interacting with BHL, CND, and DEP, respectively. To explore the mechanisms behind the increase in fluorescence as concentrations increased, we looked at the transcription factors and proteins that affect these promoters. From these, we aimed to determine which factors cause certain promoters to work better with particular molecules than others.

Overview

To determine which factors cause certain promoters to work better, we used modeling to visualize the interactions between these factors and the promoters. We also used modeling to visualize the interactions between the transcription factors and the protein produced, looking at two proteins, gadX and gadE, that appear to affect the promoter gadB. The association of gadX with gadB increased transcription levels under stress, and gadE was found to play an important role in transcriptional regulation in E. coli. GadE and gadX both activate several genes, one of which is gadB. We therefore docked gadX and gadE to gadB in separate runs. In addition, we modeled the interactions between gadB, ybcK, and rof and the transcription factors found for them through EcoCyc [1]: YdeO, Nac, and YidZ, respectively. EcoCyc is a bioinformatics database that provides information about the genome of E. coli. We aimed to model the mechanisms between these transcription factors and the promoters to gain a deeper understanding of how they work.

Docking Procedure

We used two separate docking processes. The first visualized DNA-protein interactions, where the protein is a transcription factor that affects the promoter (DNA). The second visualized protein-protein interactions, to see how the resulting protein and the transcription factor interact after transcription and translation have taken place and the protein has been produced.

For the DNA-protein interactions, we focused on three promoters, ybcK, gadB, and rof, as those had high titration levels with BHL in our research. We first looked at how these promoters interact with other genes on EcoCyc to find transcription factors that bind to the promoters. Then, we entered each promoter's nucleotide sequence from EcoCyc and the transcription factor's protein sequence into AlphaFold [2]. AlphaFold produced predicted complexes, which we visualized in PyMOL [3]. The complete nucleotide sequences of the promoters were very long, so we focused on the region where the protein attaches to the DNA, which gave a shorter DNA fragment that illustrates the binding more clearly.
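The trimming step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names, the example sequence, the binding-site coordinates, and the flank size are all hypothetical placeholders, not the sequences or coordinates we actually used.

```python
# Hypothetical helpers for cutting a long promoter down to the region around
# a transcription factor binding site before structure prediction.

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (the opposite strand)."""
    return seq.translate(COMPLEMENT)[::-1]


def trim_to_site(promoter: str, site_start: int, site_end: int, flank: int = 10) -> str:
    """Return the binding site plus up to `flank` bases on each side.

    Coordinates are 0-based and end-exclusive, like Python slices.
    """
    if not 0 <= site_start < site_end <= len(promoter):
        raise ValueError("binding site must lie inside the promoter")
    left = max(0, site_start - flank)
    right = min(len(promoter), site_end + flank)
    return promoter[left:right]


# Placeholder sequence and coordinates (NOT a real gadB, ybcK, or rof promoter).
promoter = "TTGACAATTAATCATCGGCTCGTATAATGTGTGGAATTGTGAGCGG"
fragment = trim_to_site(promoter, site_start=20, site_end=30, flank=5)

print(fragment)                      # forward strand of the trimmed fragment
print(reverse_complement(fragment))  # complementary strand for a double-stranded model
```

Keeping a few flanking bases on each side of the site is a design choice that gives the predicted protein some surrounding DNA to contact; the right flank size depends on the transcription factor.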

For the protein-protein interactions, we first acquired PDB files of gadE, gadX, and gadB from the RCSB Protein Data Bank or UniProt [8-11]. Two independent docking jobs were then submitted to ClusPro [4-7], an online docking platform: the first with gadE as the ligand and gadB as the receptor, and the second with gadX as the ligand and gadB as the receptor. A prediction folder containing 10 possible structures was exported.

All docking results were visualized in PyMOL to show the sites where the ligand and the receptor bind together. We also collected some of the important binding residues, which are labeled in some of the images below.
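One common way to pick out binding residues from a docked model is a simple distance cutoff between the two chains. The sketch below shows that idea in plain Python; it is not the procedure we followed in PyMOL. The 4 Å cutoff, the chain IDs, the function names, and the toy coordinates are assumptions made for illustration only.

```python
import math


def parse_atoms(pdb_text):
    """Read ATOM records from PDB-format text using the fixed column layout."""
    atoms = []
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            residue = (line[21], line[17:20].strip(), int(line[22:26]))
            xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
            atoms.append((residue, xyz))
    return atoms


def interface_residues(pdb_text, chain_a, chain_b, cutoff=4.0):
    """Return residues of either chain with an atom within `cutoff` angstroms of the other chain."""
    atoms = parse_atoms(pdb_text)
    side_a = [atom for atom in atoms if atom[0][0] == chain_a]
    side_b = [atom for atom in atoms if atom[0][0] == chain_b]
    contacts = set()
    for res_a, xyz_a in side_a:
        for res_b, xyz_b in side_b:
            if math.dist(xyz_a, xyz_b) <= cutoff:
                contacts.update((res_a, res_b))
    return sorted(contacts)


def toy_atom(resname, chain, resnum, x, y, z):
    """Build one PDB-format ATOM line for the toy example (hypothetical data)."""
    return (f"ATOM  {1:5d} {'CA':<4s} {resname:>3s} {chain}{resnum:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}")


# Toy complex: chain A and chain B each contribute one residue to the interface,
# and a third residue on chain B sits far away.
toy_complex = "\n".join([
    toy_atom("GLU", "A", 10, 0.0, 0.0, 0.0),
    toy_atom("LYS", "B", 20, 3.0, 0.0, 0.0),
    toy_atom("ALA", "B", 50, 30.0, 0.0, 0.0),
])

print(interface_residues(toy_complex, "A", "B"))
```

The all-pairs loop is fine for a single docked pair but scales poorly on large complexes. Inside PyMOL itself, a selection such as `byres (chain A within 4 of chain B)` should give a comparable residue set interactively.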

Structural Observations

The figures show transcription factors forming strong interactions with promoters. The binding appears to occur in regions where the AlphaFold model has high prediction confidence, suggesting a functionally significant interface.
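AlphaFold reports its per-residue confidence as pLDDT, which its PDB-format output stores in the B-factor column. A minimal sketch of how the confidence at a binding interface could be checked numerically is given below. It assumes a PDB-format model (not mmCIF), uses the commonly quoted pLDDT threshold of 70 for "confident" regions, and runs on invented toy values; the function names and the residue list are hypothetical.

```python
from collections import defaultdict


def plddt_by_residue(pdb_text):
    """Map (chain, residue number) to the mean B-factor, which holds pLDDT in AlphaFold PDB files."""
    values = defaultdict(list)
    for line in pdb_text.splitlines():
        if line.startswith("ATOM"):
            key = (line[21], int(line[22:26]))
            values[key].append(float(line[60:66]))
    return {key: sum(v) / len(v) for key, v in values.items()}


def confident_fraction(pdb_text, residues, threshold=70.0):
    """Fraction of the given residues whose pLDDT is at or above `threshold`."""
    scores = plddt_by_residue(pdb_text)
    picked = [scores[r] for r in residues if r in scores]
    if not picked:
        raise ValueError("none of the requested residues are in the model")
    return sum(score >= threshold for score in picked) / len(picked)


def toy_atom(chain, resnum, plddt):
    """Build one PDB-format ATOM line carrying a made-up pLDDT value."""
    return (f"ATOM  {1:5d} {'CA':<4s} {'ALA':>3s} {chain}{resnum:4d}    "
            f"{0.0:8.3f}{0.0:8.3f}{0.0:8.3f}{1.0:6.2f}{plddt:6.2f}")


# Toy model with invented confidence values (not taken from our predictions).
toy_model = "\n".join([
    toy_atom("A", 10, 92.5),
    toy_atom("A", 11, 55.0),
    toy_atom("B", 3, 81.0),
])
interface = [("A", 10), ("A", 11), ("B", 3)]

print(confident_fraction(toy_model, interface))
```

A fraction near 1 would support the statement that binding occurs in high-confidence regions, while a low fraction would suggest treating the predicted interface with more caution.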

Figure 1. Docking of gadE to gadB, where gadE is green and gadB is cyan.


These molecules interact strongly, at points such as E298 and D86, showing how they function together. GadE and gadB are in close contact, and they bind in areas where AlphaFold had high prediction confidence. GadX and gadE are transcription factors that regulate the expression of gadB. These molecules are part of the glutamate-dependent acid resistance (GDAR) system in E. coli [12], which controls the response to acid stress. As pollution increases, it raises ocean acidification and the frequency of acid rain, making it even more crucial that we understand these processes. Transcription factors like gadE and gadX play a major role in binding and recruiting RNA polymerase to promoters, so understanding how this binding works can show us how these reactions happen.

Figure 2. Summary of transcription factors acting on gadB; of those, YdeO is a transcriptional regulator that activates transcription initiation for gadB.


Figure 3. Transcription factor YdeO (green) binding to gadB.


Enterobacteria like E. coli live in the gut of warm-blooded animals, so they experience various stresses and require complex stress response systems to survive in a continuously changing environment [13]. These stresses include acidic conditions like those found in the stomach and anaerobic conditions like those found in the intestines. The most effective acid resistance system is the GAD pathway, which includes gadB as a glutamate decarboxylase isozyme. YdeO is also part of this pathway, acting as an activator of gadB transcription within a signaling cascade: EvgA activates YdeO, which then activates gadB. By predicting the interaction between YdeO and gadB, scientists can gain a deeper understanding of how bacteria like E. coli react to acidic conditions and of the mechanisms in their GAD acid-resistance systems. This can also allow scientists to alter the molecules we studied to create an improved biosensor, predict other pathway interactions of the promoter, and identify other molecules that the promoter may work well with.

Figure 4. DNA-binding dual regulator Nac binding to ybcK. Nac activates transcription initiation for ybcK. It’s seen in the pathway of Regulatory Influences on ybcK, which were found using EcoCyc.


Figure 5. Transcription factor Nac binding to ybcK; Nac is shown in blue and the ybcK gene sequence in orange.


The ybcK promoter is found in the DLP12 (defective lambda prophage) family, which promotes bacterial survival and plays important roles in stress response and cell wall maintenance [14]. With the overexpression of ybcK, tolerance of sabinene stress increased significantly, implying that ybcK is associated with responding to stress. Sabinene is a chemical that can be produced in engineered microorganisms, but its production is limited by low tolerance in both eukaryotes and prokaryotes. Nac is a transcription factor that regulates the transcriptional reprogramming associated with plant stress response [15]. Nac controls biotic and abiotic stress tolerance, and overexpression of Nac through biotechnological approaches can improve stress tolerance. Nac regulates ybcK by activating it, increasing stress response and tolerance. Modeling Nac and ybcK can add to our knowledge of stress response and provide more insight into immune response in E. coli as well as plants, aiding scientists in the development of stronger, more disease-resistant produce. This can also be helpful when studying pollution and contaminants: by making biosensors with these molecules, scientists can monitor pollution levels and disease in plants.

Figure 6. DNA-binding regulator YidZ, which inhibits transcription initiation, binding to promoter rof.


The rof promoter regulates the transcription termination factor Rho, which is essential for preventing transcriptional errors by terminating transcription at the correct points [16]. Rof is important for gene regulation in bacteria like E. coli and regulates genes necessary for virulence and host invasion. It also plays a role in bacterial adaptation and pathogenicity. YidZ regulates genes associated with the response to anaerobic conditions and high nitric oxide levels [17]. It regulates environmental and stress responses, which are critical in bacterial stress survival mechanisms. YidZ is activated under stress conditions and regulates rof, which ensures that the genes for responding to environmental stress are transcribed correctly so the cell can respond properly. Because YidZ regulates stress responses, this pairing could be used as a biosensor to monitor stress levels in an environment. In addition, nitric oxide is released by human activity, such as the combustion of fossil fuels, and is a key molecule in producing acid rain; YidZ and rof could therefore be used to monitor pollution levels.

Conclusion

Overall, we identified two transcription factors that work together to increase promoter efficiency (gadE and gadX), three key promoters (gadB, ybcK, and rof), and transcription factors that affect those promoters. We chose promoters that exhibited significant fluorescence levels in response to the molecule BHL, providing valuable insight for the development of future biosensors. Using modeling and docking techniques, we visualized protein-protein interactions by docking gadE and gadX, which are crucial in the glutamate-dependent acid resistance system of E. coli, to gadB. The structural observations made during the docking process highlight how gadE and gadX interact with gadB. We then looked at YdeO and gadB, Nac and ybcK, and YidZ and rof to visualize promoter-transcription factor interactions, or DNA-protein interactions, and how the transcription factors activate or inhibit each promoter. These insights into the underlying mechanisms of promoter-protein interactions could pave the way for optimized biosensors with more accurate detection capabilities. Our modeling suggests that the binding of transcription factors to promoters plays a significant role in regulating gene expression. Understanding these processes not only advances synthetic biology but also contributes to the broader goal of addressing environmental challenges. Future research can build on this foundation, potentially identifying additional molecules and pathways to refine biosensor design and applications.

References

[1] Keseler et al., Nuc Acids Res, 39:D583–90 (2011);
[2] Jumper, J. et al. “Highly accurate protein structure prediction with AlphaFold.” Nature, 596, pages 583–589 (2021). DOI: 10.1038/s41586-021-03819-2;
[3] The PyMOL Molecular Graphics System, Version 1.2r3pre, Schrödinger, LLC;
[4] Desta IT, Porter KA, Xia B, Kozakov D, Vajda S. Performance and Its Limits in Rigid Body Protein-Protein Docking. Structure. 2020 Sep;28(9):1071-1081;
[5] Vajda S, Yueh C, Beglov D, Bohnuud T, Mottarella SE, Xia B, Hall DR, Kozakov D. New additions to the ClusPro server motivated by CAPRI. Proteins: Structure, Function, and Bioinformatics. 2017 Mar;85(3):435-444;
[6] Kozakov D, Hall DR, Xia B, Porter KA, Padhorny D, Yueh C, Beglov D, Vajda S. The ClusPro web server for protein-protein docking. Nature Protocols. 2017 Feb;12(2):255-278;
[7] Kozakov D, Beglov D, Bohnuud T, Mottarella S, Xia B, Hall DR, Vajda S. How good is automated protein docking? Proteins: Structure, Function, and Bioinformatics. 2013 Dec;81(12):2159-66;
[8] Chang, C., Mack, J., Clancy, S., Joachimiak, A., Midwest Center for Structural Genomics (MCSG). Crystal structure of DNA-binding transcriptional dual regulator from Escherichia coli K-12 (2010). 10.2210/pdb3MKL/pdb;
[9] Capitani, G., De Biase, D., Aurizi, C., Gut, H., Bossa, F., Grutter, M.G. Crystal structure of Escherichia coli GadB (neutral pH) (2004). 10.2210/pdb1PMO/pdb;
[10] Jumper, J et al. Highly accurate protein structure prediction with AlphaFold. Nature (2021). 10.1038/s41586-021-03819-2;
[11] Varadi, M et al. AlphaFold Protein Structure Database in 2024: providing structure coverage for over 214 million protein sequences. Nucleic Acids Research (2024). 10.1093/nar/gkad1011;
[12] Seo, S., Kim, D., O’Brien, E. et al. Decoding genome-wide GadEWX-transcriptional regulatory networks reveals multifaceted cellular responses to acid stress in Escherichia coli. Nat Commun 6, 7970 (2015). 10.1038/ncomms8970;
[13] Yamanaka Y, Oshima T, Ishihama A, Yamamoto K. Characterization of the YdeO regulon in Escherichia coli. PLoS One. 2014 Nov 6;9(11):e111962. DOI: 10.1371/journal.pone.0111962;
[14] Wu, T., Liu, J., Li, M. et al. Improvement of sabinene tolerance of Escherichia coli using adaptive laboratory evolution and omics technologies. Biotechnol Biofuels 13, 79 (2020). 10.1186/s13068-020-01715-x;
[15] Nuruzzaman M, Sharoni AM, and Kikuchi S (2013). Roles of NAC transcription factors in the regulation of biotic and abiotic stress responses in plants. Front. Microbiol. 4:248. DOI: 10.3389/fmicb.2013.00248;
[16] Villa, T.G., Abril, A.G. & Sánchez-Pérez, A. Mastering the control of the Rho transcription factor for biotechnological applications. Appl Microbiol Biotechnol 105, 4053–4071 (2021). 10.1007/s00253-021-11326-7;
[17] Ye Gao, Hyun Gyu Lim, Hans Verkler, et al. Unraveling the functions of uncharacterized transcription factors in Escherichia coli using ChIP-exo. Nucleic Acids Research, Volume 49, Issue 17, 27 September 2021, Pages 9696–9710. 10.1093/nar/gkab735.