Engineering


Overview

LuxR is a well-characterized transcriptional regulator, and pLux is the promoter it controls. In bacteria, once LuxR binds N-acyl homoserine lactone (AHL)[1], the resulting LuxR-AHL complex associates with the pLux promoter region and initiates expression of the downstream genes. Because the AHL concentration tracks bacterial population density, this activation occurs only once the density exceeds a certain threshold. In natural bacterial systems, pLux[2] regulates the luxICDABE gene cluster, which is responsible for bioluminescence.

LuxR-pLux is a classic model of quorum sensing: it regulates gene expression by detecting the concentration of AHL in the environment. Such systems modulate density-dependent behaviors such as bioluminescence, toxin production, and biofilm formation. This year, we aim to use this mature regulatory system to detect spoilage in milk products. As milk spoils, the bacterial concentration within it increases[3], and the AHL concentration rises with it; this is how bacteria gauge the presence of other bacteria nearby. Because AHL is a lipophilic molecule that diffuses freely between bacteria, the detectable amount of AHL is quantitatively related to the number of bacteria in a given space. We can therefore estimate the bacterial load by measuring AHL.

Based on this design, we conducted system validation and regulation.


Cycle 1: Standard Sample Testing

Design

Through literature research, we established how LuxR, pLux, and AHL interact. By placing a fluorescent protein-encoding gene downstream of pLux, we can read out the AHL concentration in the system as fluorescence. Traditionally, these genes are encoded on a single plasmid; we instead distributed them across two plasmids, as illustrated in the gene circuit diagram:

Plasmid diagram
Figure 1. (a) Schematic map of transcriptional unit 1, which uses a strong promoter, J23119, to drive the transcription and translation of LuxR. (b) LuxR bound to the signal molecule binds pLux (transcriptional unit 2), thereby initiating transcription and translation of the genes downstream of pLux.

Build

We obtained the plasmids by total synthesis and transformed them into DH5α. Three single colonies were sent to Sangon Biotech for sequencing, and the sequence-verified cultures were preserved in 20% glycerol at -80°C for future use.


Test

We picked three single colonies from the engineered strain plates and cultured them overnight. The next day, we diluted the overnight cultures to bring them into logarithmic phase, added AHL to a range of final concentrations, and incubated the cultures overnight in a TECAN plate reader while monitoring bacterial growth and fluorescence:

Figure 2. Time-course response curves of strains with the LuxR and pLux regulatory system to varying concentrations of AHL (3O-C6-HSL).

We plotted a standard curve correlating the stabilized fluorescence values with the corresponding AHL small molecule concentrations:

Figure 3. Response curves of strains with the LuxR and pLux regulatory system to varying concentrations of AHL (3O-C6-HSL).

Learn

The results showed that our designed plasmid system can respond to varying AHL molecule concentrations in Escherichia coli, and we successfully generated its standard curve. This experiment demonstrates that our system can effectively respond to AHL standard solutions. Next, we plan to test whether this plasmid system can respond to real milk samples.


Cycle 2: Milk Sample Testing

Design

We purchased two different brands of milk from the market and left them open at room temperature for a week. During this period, we monitored the degree of spoilage by plate counting. After confirming bacterial growth, we cultured Escherichia coli carrying the reporter system and added spoiled milk to the bacterial suspension to test its fluorescence response.

Figure 4. The two expired milk samples: Yue Xian Huo (left) and Wens Dairy (right).

Build

1. We transformed the above plasmids into DH5α, plated them on agar plates with the corresponding antibiotics, and obtained single clones.

2. Three team members recorded sensory observations (Table 1) on days 5 to 7 after purchase (3 to 5 days post-expiration). For safety reasons, we did not taste the spoiled milk, but it was interesting to note differences in the physical properties of spoiled milk from different sources.

Table 1. Sensory observations of the two expired milk samples, 3 to 5 days post-expiration.

The colony counts for the expired milk samples differed markedly between the two sources. For Wens Dairy (Milk2), the 10⁶-fold dilution plate yielded 15 colonies and the 10⁵-fold dilution plate yielded 124 colonies. In contrast, Yue Xian Huo (Milk1) showed only 5 colonies on the 10⁵-fold dilution plate and 20 colonies on the 10⁴-fold dilution plate. These results indicate a much higher bacterial load in the Wens Dairy sample, suggesting that it either spoiled more rapidly or had a higher initial contamination level. This also partly explains why the two samples smelled different: the higher bacterial load in the Wens Dairy sample likely produced more pronounced spoilage and stronger odor changes, whereas the lower colony count in the Yue Xian Huo sample corresponds to subtler changes in smell. Together, the plate counts and sensory observations suggest that microbial load is a key determinant of milk freshness and spoilage characteristics.
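The plate counts above can be converted to an estimated viable count with the usual formula, CFU/mL = colonies × dilution factor / plated volume. The sketch below is illustrative only: the plated volume is not stated in this write-up, so the 0.1 mL default is an assumption, and the function name `cfu_per_ml` is hypothetical.

```python
def cfu_per_ml(colonies, dilution_factor, plated_volume_ml=0.1):
    """Estimate CFU/mL of the undiluted sample from a single plate.

    plated_volume_ml is an ASSUMPTION (0.1 mL); the source does not state it.
    """
    if plated_volume_ml <= 0:
        raise ValueError("plated volume must be positive")
    return colonies * dilution_factor / plated_volume_ml


# Colony counts reported in the text: (colonies, fold dilution).
plates = {
    "Milk1 (Yue Xian Huo)": [(20, 1e4), (5, 1e5)],
    "Milk2 (Wens Dairy)": [(124, 1e5), (15, 1e6)],
}

for sample, counts in plates.items():
    for colonies, dilution in counts:
        estimate = cfu_per_ml(colonies, dilution)
        print(f"{sample}: {colonies} colonies at {dilution:.0e}-fold -> {estimate:.2e} CFU/mL")
```

If both samples were plated with the same volume, the ratio between them does not depend on that assumption: comparing the 124-colony plate of Milk2 with the 20-colony plate of Milk1 gives roughly a 62-fold higher load in Milk2.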

Figure 5. Plate of Milk2 at 5 days post-expiration.

Test

We picked three single colonies from the engineered strain plates and cultured them overnight. The next day, we diluted the overnight cultures to bring them into logarithmic phase and added the spoiled milk samples to the culture system. We then incubated the cultures overnight in a TECAN plate reader, monitoring bacterial growth and fluorescence.

We measured fluorescence and OD throughout the run:

Figure 6. Response curves and quantitative analysis of strains with the LuxR and pLux regulatory system exposed to milk samples. (a) Time-course response curves for the two spoiled milk samples and the control. (b) Quantitative comparison of the response to spoiled milk versus the control.

Learn

By measuring fluorescence, we obtained the expression level of the reporter system in cultures containing spoiled milk from the two brands. The spoiled-milk groups showed significantly higher fluorescence than the control group, indicating that our reporter system responds not only to standard AHL solutions but also to AHL in complex samples. By comparing the fluorescence intensity with the standard-AHL experiments, we estimated the AHL concentration in the spoiled milk at approximately 5-10 nM. This lets us relate fluorescence intensity to AHL concentration and, in turn, to milk safety.
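Reading an AHL concentration off the standard curve amounts to inverting a dose-response function. The sketch below assumes the curve is well described by a Hill function, which this write-up does not state; the names `hill` and `ahl_from_fluorescence` are hypothetical, and every parameter value except the half-maximal concentration of roughly 11 nM reported for the system is a placeholder rather than a fitted value.

```python
def hill(ahl_nm, basal, fmax, k_half, n):
    """Hill-type dose response: fluorescence as a function of AHL (nM)."""
    return basal + (fmax - basal) * ahl_nm**n / (k_half**n + ahl_nm**n)


def ahl_from_fluorescence(f, basal, fmax, k_half, n):
    """Invert the Hill curve; only defined for basal < f < fmax."""
    if not basal < f < fmax:
        raise ValueError("fluorescence is outside the invertible range of the curve")
    frac = (f - basal) / (fmax - basal)
    return k_half * (frac / (1.0 - frac)) ** (1.0 / n)


# PLACEHOLDER parameters: basal, fmax and n are assumptions for illustration;
# k_half = 11 nM is the approximate half-maximal concentration given in the text.
PARAMS = dict(basal=100.0, fmax=10000.0, k_half=11.0, n=1.5)

reading = hill(7.0, **PARAMS)                      # simulate a 7 nM sample
estimate = ahl_from_fluorescence(reading, **PARAMS)
print(f"fluorescence {reading:.1f} -> estimated AHL {estimate:.2f} nM")
```

The inversion is least reliable near the plateau of the curve, where a small change in fluorescence maps to a large change in concentration, so readings close to `fmax` should be treated as lower bounds.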


Cycle 3: High-Throughput Screening (Model Refinement)

Design

After validating the system, we decided to engineer it further. Reviewing earlier resources, we realized that iGEM provides a wealth of engineering data, including a substantial amount of promoter activity data; the promoter currently used in the system was originally chosen from these resources. We therefore planned to select multiple promoters from the iGEM Parts Registry to replace the existing J23119 promoter and explore how different promoters affect the system's performance[5].

To enhance the sensitivity of our spoiled milk detection system, we aim to lower the AHL detection threshold to around 5 nM. Currently, our system has a half-maximal induction concentration of approximately 11 nM, so the detection pathway needs optimization. Methods to modify an induction-based transcriptional detection system include adjusting the concentration of the sensor protein, altering the protein-promoter relationship, and mutating the sensor protein. Among these, adjusting the sensor protein concentration is the most straightforward. We referred to a study that specifically explored how varying the concentration of LuxR (the sensor protein) affects the final response curve. Using modeling and experiments, the study showed that increasing the LuxR concentration can significantly reduce the system's background signal and lower the half-maximal induction concentration, suggesting a route to higher sensitivity and accuracy for our detection system in practical applications.

Figure. Transfer function for five different φ according to the simplified model without leakiness. The upper dotted line corresponds to the maximum achievable expression. The lower dotted line shows the Gappₘ/2 of φ = 1000 and 100. The intersections with the transfer functions (small open circles) highlight the Gappₘ/2 for the four transfer functions. Their projections onto the [L] axis show their respective Kapp values (larger open circles). Parameters for the simulation were: gkmL[PT] = 5.0, K1 = 1.0, K3*r = 5.0.

Build

We used Golden Gate Assembly[6] to insert these different promoters into the system and tested their regulatory performance under various conditions. The iGEM promoter library with Golden Gate standard interfaces was purchased from Alilus Biology. We introduced the Golden Gate products into DH5α competent cells by chemical transformation, then plated the transformed bacteria on agar plates with the corresponding antibiotics to complete the high-throughput library construction.


Test

We plated the transformed strains on agar plates and obtained multiple single colonies after incubation. We then picked these colonies, cultured them overnight in liquid media, and measured the fluorescence signal of each promoter variant at varying AHL concentrations using the same method as before. From these experiments, we recorded the maximum response signal of each promoter at each AHL concentration, providing data to support the subsequent optimization of the system.

Figure 7. Maximum response signals of LuxR-pLux genetic circuits with different promoters at various AHL concentrations.

We extracted the maximum fluorescence signals and compared the experimental groups of 0 mM and 50 mM AHL. At a concentration of 50 mM, the induction strengths corresponding to different promoters varied, but all were significantly higher than the control group at 0 mM:

Figure 8. Significant Differences in Maximum Response Signals of the LuxR-pLux Gene Circuit with Different Promoters between 0 mM and 50 mM AHL Experimental Groups

Similarly, we also applied this system to the spoiled milk samples we purchased, and the results were consistent; different promoters exhibited varying responses.

Figure 9. Significant differences in the responses of the LuxR-pLux gene circuit with various promoters to spoiled milk versus the control group.

Learn

After sequencing, we obtained the promoter sequences for these experimental groups and explored the relationship between sequence and fold response. We also plan to use machine learning for modeling and prediction to guide further optimization. Because the promoters vary in length, we first extracted statistical features that can be computed for a sequence of any length: GC content, sequence entropy (a measure of sequence complexity), and 3-mer counts.

sequence GC_Content Sequence_Length Entropy AAA AAC AAG AAT ACA ACC ACG ACT AGA AGC AGG AGT ATA ATC ATG ATT CAA CAC CAG CAT CCA CCC CCG CCT CGA CGC CGG CGT CTA CTC CTG CTT GAA GAC GAG GAT GCA GCC GCG GCT GGA GGC GGG GGT GTA GTC GTG GTT TAA TAC TAG TAT TCA TCC TCG TCT TGA TGC TGG TGT TTA TTC TTG TTT
tttacagctagctcagtcctaggtattatgctagc 0.428571429 35 1.967253461 0 0 0 0 1 0 0 0 0 3 1 1 0 0 1 1 0 0 2 0 0 0 0 1 0 0 0 0 3 1 0 0 0 0 0 0 0 0 0 3 0 0 0 1 1 1 0 0 0 1 3 2 1 1 0 0 0 1 0 0 2 0 0 1
ttgacagctagctcagtcctaggtactgtgctagc 0.514285714 35 1.988442576 0 0 0 0 1 0 0 1 0 3 1 1 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 0 3 1 1 0 0 1 0 0 0 0 0 3 0 0 0 1 1 1 1 0 0 1 3 0 1 1 0 0 1 1 0 1 0 0 1 0
ctgatagctagctcagtcctagggattatgctagc 0.485714286 35 1.993608561 0 0 0 0 0 0 0 0 0 3 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0 3 1 1 0 0 0 0 2 0 0 0 3 1 0 1 0 0 1 0 0 0 0 4 1 1 1 0 0 1 1 0 0 1 0 0 0
ttgacagctagctcagtcctaggtattgtgctagc 0.485714286 35 1.979724238 0 0 0 0 1 0 0 0 0 3 1 1 0 0 0 1 0 0 2 0 0 0 0 1 0 0 0 0 3 1 0 0 0 1 0 0 0 0 0 3 0 0 0 1 1 1 1 0 0 0 3 1 1 1 0 0 1 1 0 1 0 0 2 0
tttacggctagctcagtcctaggtactatgctagc 0.485714286 35 1.979724238 0 0 0 0 0 0 1 1 0 2 1 1 0 0 1 0 0 0 1 0 0 0 0 1 0 0 1 0 4 1 0 0 0 0 0 0 0 0 0 3 0 1 0 1 1 1 0 0 0 2 3 1 1 1 0 0 0 1 0 0 1 0 0 1
ttgacggctagctcagtcctaggtattgtgctagc 0.514285714 35 1.964060052 0 0 0 0 0 0 1 0 0 2 1 1 0 0 0 1 0 0 1 0 0 0 0 1 0 0 1 0 3 1 0 0 0 1 0 0 0 0 0 3 0 1 0 1 1 1 1 0 0 0 3 1 1 1 0 0 1 1 0 1 0 0 2 0
tttacggctagctcagccctaggtattatgctagc 0.485714286 35 1.979724238 0 0 0 0 0 0 1 0 0 3 1 0 0 0 1 1 0 0 1 0 0 1 0 1 0 0 1 0 3 1 0 0 0 0 0 0 0 1 0 3 0 1 0 1 1 0 0 0 0 1 3 2 1 0 0 0 0 1 0 0 2 0 0 1
taatacgactcactatagggaga 0.391304348 23 1.925796479 0 0 0 1 0 0 1 2 1 0 1 0 2 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 1 1 0 0 0 1 1 0 0 0 0 0 1 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0 0 0 0 0 0 0
tttacagctagctcagtcctagggactgtgctagc 0.514285714 35 1.988442576 0 0 0 0 1 0 0 1 0 3 1 1 0 0 0 0 0 0 2 0 0 0 0 1 0 0 0 0 3 1 1 0 0 1 0 0 0 0 0 3 1 0 1 0 0 1 1 0 0 1 3 0 1 1 0 0 0 1 0 1 1 0 0 1
tttacggctagctcagtcctaggtacaatgctagc 0.485714286 35 1.993608561 0 0 0 1 1 0 1 0 0 2 1 1 0 0 1 0 1 0 1 0 0 0 0 1 0 0 1 0 3 1 0 0 0 0 0 0 0 0 0 3 0 1 0 1 1 1 0 0 0 2 3 0 1 1 0 0 0 1 0 0 1 0 0 1
ttgacggctagctcagtcctaggtatagtgctagc 0.514285714 35 1.983853121 0 0 0 0 0 0 1 0 0 2 1 2 1 0 0 0 0 0 1 0 0 0 0 1 0 0 1 0 3 1 0 0 0 1 0 0 0 0 0 3 0 1 0 1 1 1 1 0 0 0 4 1 1 1 0 0 1 1 0 0 0 0 1 0
ctgatagctagctcagtcctagggattatgctagc.1 0.485714286 35 1.993608561 0 0 0 0 0 0 0 0 0 3 1 1 1 0 1 1 0 0 1 0 0 0 0 1 0 0 0 0 3 1 1 0 0 0 0 2 0 0 0 3 1 0 1 0 0 1 0 0 0 0 4 1 1 1 0 0 1 1 0 0 1 0 0 0
ctgatggctagctcagtcctagggattatgctagc 0.514285714 35 1.983853121 0 0 0 0 0 0 0 0 0 2 1 1 0 0 2 1 0 0 1 0 0 0 0 1 0 0 0 0 3 1 1 0 0 0 0 2 0 0 0 3 1 1 1 0 0 1 0 0 0 0 3 1 1 1 0 0 1 1 1 0 1 0 0 0
tttatggctagctcagtcctaggtacaatgctagc 0.457142857 35 1.984890223 0 0 0 1 1 0 0 0 0 2 1 1 0 0 2 0 1 0 1 0 0 0 0 1 0 0 0 0 3 1 0 0 0 0 0 0 0 0 0 3 0 1 0 1 1 1 0 0 0 1 3 1 1 1 0 0 0 1 1 0 1 0 0 1
tttaattatatatatatatatataatggaagcgtttt 0.135135135 37 1.524004205 0 0 1 2 0 0 0 0 0 1 0 0 8 0 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 2 0 0 8 0 0 0 0 0 0 1 0 2 0 0 3
tataagatcatacgccgttatacgttgtttacgctttg 0.368421053 38 1.920673491 0 0 1 0 0 0 3 0 1 0 0 0 3 1 0 0 0 0 0 1 0 0 1 0 0 2 0 2 0 0 0 1 0 0 0 1 0 1 0 1 0 0 0 0 0 0 0 3 1 3 0 2 1 0 0 0 0 0 0 1 2 0 2 2
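The features in the table above can be reproduced with a few lines of code. This is a minimal sketch rather than the pipeline actually used: the function names are hypothetical, entropy is taken to be the base-2 Shannon entropy of the nucleotide composition, and k-mers are counted with an overlapping sliding window (both choices are assumptions, though they agree with the first row of the table).

```python
import math
from collections import Counter
from itertools import product


def gc_content(seq):
    """Fraction of G and C bases in the sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def shannon_entropy(seq):
    """Base-2 Shannon entropy of the nucleotide composition."""
    counts = Counter(seq.upper())
    total = len(seq)
    return -sum(c / total * math.log2(c / total) for c in counts.values())


def kmer_counts(seq, k=3):
    """Overlapping k-mer counts over the ACGT alphabet, in sorted order."""
    seq = seq.upper()
    counts = {"".join(p): 0 for p in product("ACGT", repeat=k)}
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing non-ACGT characters
            counts[kmer] += 1
    return counts


def featurize(seq, k=3):
    """Build one feature row with the same column names as the table."""
    row = {
        "GC_Content": gc_content(seq),
        "Sequence_Length": len(seq),
        "Entropy": shannon_entropy(seq),
    }
    row.update(kmer_counts(seq, k))
    return row


row = featurize("tttacagctagctcagtcctaggtattatgctagc")
print(row["GC_Content"], row["Sequence_Length"], row["Entropy"], row["AGC"])
```

Calling `featurize(seq, k=4)` yields 256 k-mer columns instead of 64, which is the kind of feature expansion that becomes relevant when 3-mers alone carry too little information.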


  1. We compared three methods, Random Forest (RF), Radial Basis Function Kernel SVM (SVMR), and Linear Kernel SVM (SVMl), using 10-fold cross-validation. RF had an average root mean square error (RMSE) of 4.759 and an average R-squared value of 0.383324. SVMR had an average RMSE of 4.456 and an average R-squared value of 0.3447. SVMl had an average RMSE of 6.048 and an average R-squared value of 0.3873. Across repeated sampling, all three models had relatively high RMSE values, indicating large prediction errors, and average R-squared values below 0.5, indicating poor explanatory power. No model was clearly best (SVMR had the lowest RMSE, while RF and SVMl had nearly identical R-squared values), and none provided meaningful predictions.
  2. We therefore speculated that too much information had been lost during feature extraction. To give the models more information, we extended the k-mer features to include 4-mer frequencies and retrained the three models. RF yielded an average RMSE of 3.535 and an average R-squared value of 0.7652. SVMR had an average RMSE of 2.824 and an average R-squared value of 0.44306. SVMl showed an average RMSE of 5.464 and an average R-squared value of 0.2687327. The RMSE fell for all three models, most markedly for RF and SVMR, whose average R-squared values also rose, whereas the R-squared value of SVMl fell. Adding 4-mer information therefore improved the prediction accuracy and explanatory power of the RF and SVMR models.
Figure 10. Boxplot of prediction performance with different feature sets. "With 4-mers" indicates that the training features included 4-mer motif frequencies. Error bars show the standard deviation.
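For readers who want to reproduce the evaluation, the sketch below shows how RMSE and R-squared can be computed under k-fold cross-validation. It is a generic, standard-library-only illustration: the helper names are hypothetical, the mean predictor is a placeholder baseline standing in for RF or SVM models, and R-squared is computed as the coefficient of determination over the pooled held-out predictions, which may differ from the per-fold averaging and the R-squared definition used for the numbers above.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and deal them into k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Return (mean per-fold RMSE, R-squared over pooled held-out predictions)."""
    fold_rmse, held_true, held_pred = [], [], []
    for test_idx in k_fold_indices(len(y), k, seed):
        if not test_idx:  # more folds than samples
            continue
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = predict(model, [X[i] for i in test_idx])
        truth = [y[i] for i in test_idx]
        fold_rmse.append(rmse(truth, preds))
        held_true.extend(truth)
        held_pred.extend(preds)
    return sum(fold_rmse) / len(fold_rmse), r_squared(held_true, held_pred)


# Placeholder baseline: always predict the training mean.
def fit_mean(X_train, y_train):
    return sum(y_train) / len(y_train)


def predict_mean(model, X_test):
    return [model] * len(X_test)


# Synthetic data for illustration only; these are not the promoter measurements.
X = [[i] for i in range(20)]
y = [2.0 * i for i in range(20)]
print(cross_validate(X, y, fit_mean, predict_mean))
```

A mean-only baseline like this gives an R-squared at or below zero on held-out data, which is a useful reference point: with a small promoter set, a model is only informative if it clearly beats that baseline.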

References
[1] Tsai, C.S. and Winans, S.C. (2010) 'LuxR-type quorum-sensing regulators that are detached from common scents', Molecular Microbiology, 77(5), pp. 1072-1082. doi: 10.1111/j.1365-2958.2010.07279.x.

[2] Scott, S.R. and Hasty, J. (2016) 'Quorum sensing communication modules for microbial consortia', ACS Synthetic Biology, 5(9), pp. 969-977. doi: 10.1021/acssynbio.5b00286.

[3] United States Food and Drug Administration (2008) 'Foodborne Illness-Causing Organisms in the U.S.—What You Need to Know'.

[4] He, H. and Gao, F. (2022) 'Freshness of milk should not become a health risk for the public', Procuratorial Daily, 26 May, p. 7. doi: 10.28407/n.cnki.njcrb.2022.002302.

[5] iGEM Berkeley Team (2006) Project Overview.

[6] Ghosh, S. et al. (2022) 'Synthetic microbial communities: quorum sensing modules for robust control', ACS Synthetic Biology.

[7] Liu, B., Zhang, H. and Wang, J. (2020) 'Radial basis function kernel optimization for support vector machines', arXiv preprint arXiv:2007.08233.