Model


Overview

Our project focuses on the development of an innovative Bio-Beacon Rescue Locator, aiming to significantly enhance the efficiency and safety of search and rescue operations in marine and space environments. To achieve this goal, we have constructed a series of interconnected model systems, comprising a disaster prediction model, a marine luminescent life-saving model, and a space application evaluation model. Together, these models visualize the dynamic expression of the genetic circuits and explore their practical application value. A brief overview of the model framework is shown in Figure 1.

Figure. 1 | Basic Framework of Model

Firstly, we systematically reviewed the data on marine and air disasters globally from 2021 to 2023. Utilizing time-series analysis techniques and density regression models, we predicted potential high-risk periods and regions for accidents in 2024. This forward-looking analysis not only enhanced the authenticity and urgency of subsequent simulation scenarios but also ensured that our solutions could flexibly respond to diverse rescue needs.

Subsequently, to capture the macro-level trends of the luminescent biological system in response to environmental changes, we constructed a population growth model. This model describes the growth characteristics and dynamic equilibria of the key organisms, Vibrio natriegens and Synechococcus, laying a foundation for understanding and optimizing their luminescent properties. On this basis, we built the gene circuit expression model, incorporating critical biological factors such as photosynthetic efficiency and promoter activation mechanisms. By consulting the literature, using predictions from the VeloVI framework, and conducting sensitivity analyses, we set reasonable parameters and validated the robustness of the model. We then used Monte Carlo simulation to model randomized helicopter maritime search and rescue scenarios over time, comparing the success rates of conventional rescue operations with those of operations equipped with the bioluminescent genetic system.

To fully exploit the luminescent potential of the Bio-Beacon Rescue Locator, we introduced genetic algorithms for optimizing model parameters. This process significantly improved the system's response speed and sensitivity to light changes, providing a theoretical basis and practical guidance for the formulation of subsequent control strategies.

Lastly, with an eye towards the future, we constructed a space application evaluation model. This model considers the ecological environments, lighting conditions, and potential challenges of different planets to assess the applicability of the Bio-Beacon Rescue Locator in space exploration and rescue missions. This forward-looking analysis not only broadens the product's application domains but also provides new directions for its future commercialization and international cooperation.

Model Hypothesis

a. Disaster Prediction Model

1. We assume that the collected data on shipwrecks and air crashes are representative of the real situation worldwide.

2. When analyzing the periods and regions with a high incidence of accidents, we focus on major factors such as seasonality, temporal distribution, and geographical location; the detailed effects of secondary factors (such as human error and mechanical failure) are neglected.

b. Luminous Life-Saving Model

1. We assume that the growth characteristics of Vibrio natriegens (such as the specific growth rate and the maximum cell number) are stable under given conditions and can be determined experimentally.

2. The expression of the gene circuits (e.g., promoter activation and gene expression) follows known biochemical principles and can be simulated by mathematical models.

3. The photosynthetic conversion efficiency of Synechococcus and the UV intensity both vary sinusoidally over the course of one day.

4. There is no interference from artificial light: we exclude light pollution or light superposition caused by external artificial lighting (such as ships, aircraft, lighthouses, or other artificial structures). This ensures a reasonable assessment of the luminous efficiency of the Bio-Beacon Rescue Locator and its actual effect in search and rescue operations under natural light conditions (such as the day-night cycle and ultraviolet intensity).

c. Space Feasibility Analysis

1. The planetary data collected (such as light conditions, temperature, and gravity) are accurate and reliable and represent the true conditions on each planet.

2. The selected evaluation indicators fully reflect the applicability of the Bio-Beacon Rescue Locator in space.

3. The luxAB system and its substrates remain stable and active under the temperature, light, and other conditions of the target planet.

4. Long-term operation of the system is not seriously affected by the special environment of the planet (such as strong radiation or extreme temperature differences).

Disaster Prediction Model

Any discussion of global shipping, aviation safety, and emergency response must take into account the unprecedented challenges and transformations that the shipping and aviation industries have undergone in recent years, particularly since the outbreak of the COVID-19 pandemic in 2020. This global public health crisis has not only reshaped the international trade landscape but also affected many aspects of maritime and air transportation, including the stability of logistics supply chains and maritime safety regulation. The HP group has conducted a preliminary assessment of the frequency of maritime disasters and the effectiveness of current rescue efforts, emphasizing the importance of enhancing rescue efficiency and reducing response time.

Against this backdrop, an in-depth analysis of major maritime and aviation accidents in the past three years (2021 to 2023) is particularly urgent. The International Maritime Organization (IMO), the specialized agency of the United Nations responsible for maritime safety, navigation efficiency, and the prevention and control of marine pollution by ships, provides invaluable data resources through its Global Integrated Shipping Information System (GISIS).

(Data Resources: https://gisis.imo.org/Public/MCIR/InvestigationReportsDashboard.aspx)

GISIS not only encompasses detailed information on the global fleet but also includes data on maritime accidents and incidents recorded in accordance with rigorous standards (MSC-MEPC.3/Circ.4/Rev.1 Circular). These data play a role in understanding the causes of accidents, identifying risk areas, and predicting future trends. We have collected data on maritime and aviation disasters and compiled partial results as shown in Figure 2.

Figure. 2 | Statistics of Some Accidents

To facilitate the presentation of the geographical density of maritime accidents, a global probability density map was plotted based on GPS coordinates.

Figure. 3 | Geographical Distribution of Accidents

A study of global maritime accident data from 2010 to 2019, using density analysis and cluster analysis, found that the North Sea, Baltic Sea, and Mediterranean Sea regions formed clusters of low-severity accidents, and that over 60% of accidents occurred within 30 nautical miles of the coastline. High-severity accident clusters were identified in the coastal waters surrounding China, Japan, South Korea, Vietnam, and the Philippines, as well as in the Singapore-Malacca Strait and the Bay of Biscay[1]. In our own dataset, distinct clusters of accidents emerged in the vicinity of the Koper seaport and within Río Negro province in Argentina. This underscores the vulnerability of regions characterized by narrow waterways and densely populated ports to a higher incidence of accidents. July was identified as the peak month for accidents, and temporal analysis revealed that accidents occurred most frequently at night (concentrated between midnight and sunrise). A statistical overview of monthly accident frequency throughout the year is presented in Figure 4. According to the investigation report of the Ningbo Maritime Bureau, 18.54% of accidents were attributed to poor visibility and limited field of view at night, 13.43% to operational errors, and 6.63% to decision-making errors based on misinterpretation. The relatively low traffic volume at night can contribute to distraction, visual fatigue, and reduced response capability.

Figure. 4 | Statistics of Accident Frequency in Each Month

Previous researchers have focused on the roles of various factors contributing to accidents and attempted to describe the correlations among them using diverse models. Zhou et al. established a novel database that captures the characteristics of all Risk Influencing Factors (RIFs) and developed a data-driven risk analysis model based on Bayesian Networks (BN) for analyzing casualty situations in maritime accidents[2]. Building upon the acquired data, we used the ARIMA model to analyze the temporal characteristics of accidents. The basic idea of ARIMA is to treat the sequence that the predicted quantity forms over time as a random sequence that can be approximately described by a model. Once this model is identified, it can predict future values from the past and present values of the time series.

The ARIMA model integrates the auto-regression (AR) and moving average (MA) models to fit and forecast time-series data, specifically aiming to predict the timing of the next five occurrences or events. The AR component characterizes the relationship between the current value and its lagged values, utilizing historical data to project future values. Meanwhile, the MA component leverages the linear combination of past residual terms to estimate future residuals. The comprehensive ARIMA prediction model can be formulated as follows:

(1.1)

where 𝑝 is the order of the autoregressive (AR) model, 𝑞 is the order of the moving average (MA) model, εt is the error term between times t and t-1, γj and θj are the fitted coefficients, and 𝑝0 is the constant term.
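As a minimal sketch of how the AR and MA parts combine in a one-step forecast, the function below adds the constant term, a weighted sum of past values, and a weighted sum of past residuals. The function name, the example series, and all coefficient values are hypothetical and chosen only for illustration; they are not the fitted values of our model, and the differencing step of ARIMA is assumed to have been applied already.

```python
def arma_one_step(history, residuals, p0, gammas, thetas):
    """One-step-ahead forecast: p0 + sum(gamma_j * y[t-j]) + sum(theta_j * e[t-j]).

    history   : past observations, oldest first
    residuals : past one-step forecast errors, oldest first
    p0        : constant term
    gammas    : AR coefficients gamma_1 .. gamma_p
    thetas    : MA coefficients theta_1 .. theta_q
    """
    if len(history) < len(gammas) or len(residuals) < len(thetas):
        raise ValueError("not enough past values for the requested orders")
    # history[-j] is the observation j steps in the past
    ar_part = sum(g * history[-j] for j, g in enumerate(gammas, start=1))
    ma_part = sum(th * residuals[-j] for j, th in enumerate(thetas, start=1))
    return p0 + ar_part + ma_part


# Hypothetical data: gaps (in days) between consecutive accidents.
gaps = [12.0, 9.0, 15.0, 11.0]
errors = [0.5, -1.0, 2.0, -0.5]
forecast = arma_one_step(gaps, errors, p0=4.0, gammas=[0.5, 0.2], thetas=[0.3])
```

Calling the function repeatedly, each time appending the new forecast to the history, is one way to obtain several future occurrences in sequence.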

For the prediction of geographical coordinates, the Random Forest Regression algorithm is employed, leveraging time differences as input features. This approach exploits the historical relationship between time differences and geographical coordinates to forecast future latitudes and longitudes, respectively. The Random Forest algorithm, a supervised machine learning method, is constructed through an ensemble of decision trees as base learners. By introducing randomness into the training process of decision trees, it effectively copes with data noise and complex relationships, thereby enhancing prediction accuracy[3]. In terms of data preprocessing, we perform scaling on timestamps and geographical coordinates to prevent the model from being overly sensitive to large-scale numerical variations.

According to the established disaster prediction model, subsequent accidents are predicted to occur mainly around midnight, with a relatively scattered distribution in longitude and latitude. Focusing on the high-risk period and locations predicted by the model, we therefore chose the early morning as the simulated accident time and set the simulated accident site at approximately 25°30.00' N, 62°30.00' E in the northern Arabian Sea (near the coast of Pakistan) for subsequent modeling. Based on this scenario, we analyze the impact of the environmental factors specific to this period on the development of accidents, so as to evaluate the efficiency of the constructed luminous system and improve the survival rate of people in maritime disasters. Because the scenario matches the predicted high-risk periods and locations, it makes the simulations more realistic and keeps our work relevant and broadly applicable.

Luminous Life-Saving Model

The main components of the light-emitting system are Synechococcus and Vibrio natriegens. Synechococcus provides Vibrio natriegens with the nutrients it needs for growth, such as sucrose. Because population growth is limited by factors such as environmental carrying capacity, we use the logistic model to simulate the growth of Synechococcus[4]:

(1.2)

In this equation, C1 represents the quantity of Synechococcus, μ1 denotes its specific growth rate, and N1 refers to the maximum number of cells the culture medium can sustain. During the immobilization process, Tian et al. determined that the maximum specific growth rate of Synechococcus in BG-11 medium was 0.74 d⁻¹, with a maximum algal density of 9.8×10⁸ cells/mL[5]. We use a differential equation to describe the ability of Synechococcus to produce sucrose:

(1.3)

Here, s represents the sucrose concentration and α is the photosynthetic conversion efficiency of Synechococcus. Assuming that sunlight is strongest at noon, we use a periodic sine function to simulate the variation of α:

(1.4)

α0: Basic photosynthetic efficiency, which is the efficiency when there is no sunlight.

α1: Amplitude of variation, which is the maximum change in efficiency caused by changes in sunlight intensity.
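The logistic growth of Synechococcus and the daily variation of α can be sketched as follows. The growth rate and carrying capacity are the literature values quoted above; the initial density passed to `logistic_density` and the exact phase of the sinusoid are assumptions made for this illustration. The sinusoid is written so that α equals α0 at midnight and α0 + α1 at noon, which matches the definitions of α0 and α1, although the project's own formula may use a different offset convention.

```python
import math

MU1 = 0.74   # specific growth rate of Synechococcus, 1/day (quoted above)
N1 = 9.8e8   # maximum algal density, cells/mL (quoted above)


def logistic_density(t_days, c0):
    """Closed-form solution of dC1/dt = MU1 * C1 * (1 - C1 / N1), C1(0) = c0."""
    return N1 / (1.0 + (N1 / c0 - 1.0) * math.exp(-MU1 * t_days))


def alpha(t_hours, alpha0, alpha1):
    """Sinusoidal efficiency with a 24 h period.

    Equals alpha0 at midnight (no sunlight) and alpha0 + alpha1 at noon.
    The phase convention is an assumption of this sketch.
    """
    return alpha0 + alpha1 * 0.5 * (1.0 - math.cos(2.0 * math.pi * t_hours / 24.0))


# Hypothetical starting density of 1e6 cells/mL, evaluated after 10 days.
density_day_10 = logistic_density(10.0, 1e6)
```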

Next, by introducing Vibrio natriegens and considering the flow of sucrose between the two organisms, the change in sucrose concentration described in Equation (1.3) can be refined as follows:

(1.5)

γ1 represents the rate at which Vibrio natriegens consumes sucrose, which it needs for growth, development, and promoter binding. The growth of Vibrio natriegens depends directly on the availability of sucrose. When the concentration of sucrose reaches a certain threshold Ks, the population begins to grow. We use the Michaelis-Menten kinetic model[6] to describe the utilization of sucrose by Vibrio natriegens:

(1.6)

where C2 represents the number of Vibrio natriegens cells, μ2 denotes its specific growth rate, and k1 is the inhibition coefficient of reactive oxygen species (ROS) on bacterial growth, which is elaborated later. The doubling time of Vibrio natriegens in LBv2 liquid medium was measured experimentally as 13.89 minutes[7], and the parameters were set on this basis.
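As a worked example of how a doubling time translates into a specific growth rate, the snippet below applies the standard relation μ = ln 2 / (doubling time) to the 13.89-minute value quoted above. The helper `sucrose_limited_rate` illustrates a saturating Michaelis-Menten-type dependence on sucrose; its name and the omission of the ROS inhibition term are simplifications for this sketch, not the full form of Equation (1.6).

```python
import math

DOUBLING_TIME_MIN = 13.89  # doubling time in LBv2 medium, minutes (quoted above)

# Exponential growth: N(t) = N0 * exp(mu * t), so doubling means mu * T = ln 2.
mu2_per_min = math.log(2) / DOUBLING_TIME_MIN
mu2_per_hour = mu2_per_min * 60.0


def sucrose_limited_rate(s, mu_max, ks):
    """Saturating dependence of the growth rate on sucrose concentration s.

    Returns mu_max / 2 when s equals the half-saturation constant ks.
    """
    return mu_max * s / (ks + s)
```

With these numbers the maximum specific growth rate is roughly 0.05 per minute, or about 3 per hour.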

To facilitate the construction of a model for the interactions within the entire system, we will disassemble the main pathways, analyze the internal changes, and then connect them with other pathways.


a. Cold Tolerance and Safety Module

Background: The low-temperature environment of seawater can increase the ROS content in Vibrio natriegens, which is detrimental to its survival and function. Vibrio natriegens expresses superoxide dismutase (SOD) and catalase to decompose ROS into water and oxygen, thereby endowing the engineered bacteria with cold tolerance.

1. The binding of sucrose to the PsacB promoter

The binding of sucrose to a promoter can be denoted as:

(1.7)

k1 is the binding rate constant, while k-1 represents the dissociation rate constant.

Change of Unbound Promoter Concentration:

(1.8)

Change of Bound Promoter Concentration:

(1.9)

Total Promoter Concentration:

(1.10)
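The binding scheme above can be sketched numerically under a mass-action assumption: bound promoter forms at a rate proportional to sucrose and free promoter, dissociates at a rate proportional to bound promoter, and the total promoter concentration is conserved. The function names, the forward-Euler integrator, the constant sucrose level, and all numerical values are hypothetical choices for illustration.

```python
def bound_fraction_equilibrium(sucrose, k_on, k_off):
    """Equilibrium fraction of promoter bound, from k_on * S * P_free = k_off * P_bound."""
    return k_on * sucrose / (k_on * sucrose + k_off)


def simulate_binding(sucrose, k_on, k_off, p_total, dt=0.001, steps=20000):
    """Forward-Euler integration of d[P_bound]/dt with sucrose held constant.

    Returns (free, bound) promoter concentrations at the end of the run.
    """
    p_bound = 0.0
    for _ in range(steps):
        p_free = p_total - p_bound  # conservation of total promoter
        p_bound += dt * (k_on * sucrose * p_free - k_off * p_bound)
    return p_total - p_bound, p_bound


# Hypothetical parameters: the simulation should settle at the equilibrium fraction.
free, bound = simulate_binding(sucrose=2.0, k_on=1.0, k_off=0.5, p_total=1.0)
```

Because the total is conserved, only one of the two promoter equations needs to be integrated; the other follows by subtraction.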

2. Transcription and Translation of SOD Gene and Catalase

Transcription of SOD Gene:

(1.11)

The translation process:

(1.12)

To facilitate subsequent expressions, the processes of transcription and translation can be conceptually integrated:

(1.13)

kSOD denotes the transcription rate constant, T(SOD) base represents the basal transcription level and γSOD stands for the degradation rate constant. In Vibrio natriegens and Escherichia coli[8], the average half-life of mRNA is approximately 3 minutes. The half-life of proteins was obtained through corresponding database queries.

Analogously, for Catalase enzyme expression:

(1.14)
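To illustrate how a half-life fixes a degradation rate constant, and how a lumped expression equation behaves, the sketch below assumes first-order decay and a generic production-degradation form dX/dt = k·P_bound + basal − γ·X. This form is an assumption made for illustration and may differ in detail from Equations (1.13) and (1.14); the function names and example values are hypothetical.

```python
import math


def degradation_rate(half_life):
    """First-order decay constant from a half-life (same time unit)."""
    return math.log(2) / half_life


def expression_level(t, k_expr, p_bound, basal, gamma, x0=0.0):
    """Analytic solution of dX/dt = k_expr * p_bound + basal - gamma * X, X(0) = x0."""
    steady = (k_expr * p_bound + basal) / gamma
    return steady + (x0 - steady) * math.exp(-gamma * t)


# A 3-minute mRNA half-life corresponds to a decay constant of about 0.231 per minute.
gamma_mrna = degradation_rate(3.0)
```

Starting from zero, the level reaches half of its steady-state value after exactly one half-life, which is a quick sanity check on any chosen γ.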

3. The generation and degradation of ROS

The reaction catalyzed by SOD can be expressed as:

(1.15)

Change of ROS:

(1.16)

λ represents the rate of ROS production by a single Vibrio natriegens cell, and kSOD is the rate constant of the reaction catalyzed by SOD. Generation of the reaction product H2O2:

(1.17)

Then the catalase breaks down the previous reaction product, hydrogen peroxide, into water and oxygen:

(1.18)

Changes of H2O2:

(1.19)


b. The Substrate Accumulation Module

Background: The EnvZ-OmpR two-component system senses changes in osmotic pressure. The high osmotic pressure of seawater increases the phosphorylation level of OmpR, which activates luxCDEFG expression downstream of the pOmpC promoter. The luminescent substrate therefore begins to accumulate continuously once the system enters seawater.

1. Action of the pOmpC Promoter

Let the pOmpC promoter activity be σOmpC(ε), where ε is the osmotic pressure. The process can be regarded as a step-like function, in which the promoter is activated when the osmotic pressure exceeds a certain threshold:

(1.20)

β is the slope parameter and εthresh is the osmotic pressure threshold (the osmotic pressure of seawater is about 2.753 MPa). The changes in pOmpC promoter activity are shown in Figure 5.

Figure. 5 | Change of pOmpC Promoter Activity with Osmolarity
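A step-like activation with a slope parameter is commonly written as a logistic (sigmoid) function, and the sketch below assumes that form. The function name and the numerical values in the example are hypothetical, and the pressure unit is whatever unit the threshold is expressed in.

```python
import math


def pompc_activity(eps, eps_thresh, beta):
    """Logistic activation: 0.5 at the threshold, approaching 1 well above it.

    Written in a numerically stable way so that very large |beta * (eps - eps_thresh)|
    does not overflow.
    """
    z = beta * (eps - eps_thresh)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical threshold of 1.0 (arbitrary units) and slope of 10.
activity_above = pompc_activity(2.0, 1.0, 10.0)
activity_below = pompc_activity(0.0, 1.0, 10.0)
```

A larger β makes the transition sharper, so the function approaches an ideal step as β grows.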

2. Expression of LuxCDEFG

(1.21)

LuxCDE provides the substrate for the luminescent oxidation reaction catalyzed by luxAB, and the substrate accumulates according to:

(1.22)

ksub is the rate constant for substrate synthesis, and γsub is the substrate degradation rate constant.


c. Luminescence Module

Background: The Puv promoter is induced by UV light. When UV is strong during the daytime, it drives expression of the downstream CI repressor protein, which represses luxAB expression, so the system only accumulates substrate without luminescing. When UV weakens at night, the repression weakens and luxAB is expressed, producing light.

1. Changes of the UV Intensity

Assuming that UV intensity is sinusoidal throughout the day, reaching the maximum at 12 noon and the minimum at night:

(1.23)

Umax is the maximum UV intensity, Umin is the minimum (close to 0), and t is the time in hours.

2. Activation of the Puv Promoter

(1.24)

The degree of activation of the promoter is a function of UV intensity, simplified here to a saturation function. ω is a small positive constant that keeps the denominator from being 0. The promoter activity varies with UV as shown in Figure 6.

Figure. 6 | Changes of puv promoter activity with UV light

3. Expression of the CI Gene

(1.25)

4. Changes of the Lam Promoter Activity

The activity of the lam promoter is affected by the repression exerted by the CI protein:

(1.26)

kdep is the repression constant, and n is the Hill coefficient describing the strength of the repressive effect.
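Repression with a repression constant and a Hill coefficient is usually modeled with a decreasing Hill function, and the sketch below assumes that standard form. The function name and example values are hypothetical, and the expression may differ in scaling from Equation (1.26).

```python
def lam_promoter_activity(ci, k_dep, n):
    """Hill-type repression by CI.

    Activity is 1 when CI is absent, 0.5 when CI equals k_dep,
    and falls towards 0 as CI increases. A larger n gives a sharper switch.
    """
    return 1.0 / (1.0 + (ci / k_dep) ** n)


# Hypothetical values: repression constant 2.0 and Hill coefficient 2.
activities = [lam_promoter_activity(ci, 2.0, 2) for ci in (0.0, 2.0, 4.0)]
```

This captures the day-night logic of the module: high CI during the day pushes the activity towards 0, and falling CI at night lets it recover.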

5. Expression of LuxAB

(1.27)

6. Activation Effect of LuxFG on LuxAB

LuxCDE provides the substrate for the luminescent oxidation catalyzed by luxAB. LuxFG increases the luminescence intensity by further activating luxAB. To quantify the degree of activation, we introduce a variable defined as:

(1.28)

Fmax is the maximum activation constant, and Kf is the half-saturation constant.

Taking its derivative shows how the variable changes over time:

(1.29)

Substituting it into Equation (1.21) gives:

(1.30)

(1.30)

7. Changes of Illumination

(1.31)

η represents the luminous efficiency coefficient, with the potential influence of other regulatory factors (such as pH, temperature, etc.) on the illuminance being temporarily overlooked. By measuring the luminescence of a fixed number of bacteria, an approximate value of η can be fitted.

The diffusion of light at the sea surface can be characterized by a variant of the Gaussian distribution in two-dimensional space, which describes the variation in intensity emanating from a point source (or central point) at a given time t as a function of the distance from the central point (x0, y0). This model exhibits robust fitting capabilities in depicting numerous natural phenomena (e.g. light propagation, heat conduction) and engineering systems (e.g. signal processing, blurring, and denoising in image processing)[9]. The formula can be expressed as:

(1.32)

I(x,y,t) represents the illuminance distribution at the spatial point (x,y) at time point t. σ denotes the standard deviation of the Gaussian distribution, which determines the rate of attenuation of illuminance with respect to distance.
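A peak-normalized two-dimensional Gaussian profile can be sketched as below, together with the distance at which the illuminance falls to a given detection threshold. Whether Equation (1.32) is normalized by its peak or by its integral is not specified here, so the peak-normalized form is an assumption, and the function names and example values are hypothetical.

```python
import math


def illuminance(x, y, i_center, sigma, x0=0.0, y0=0.0):
    """Gaussian fall-off of illuminance around the source at (x0, y0).

    i_center is the illuminance at the source at the time of interest.
    """
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return i_center * math.exp(-r2 / (2.0 * sigma ** 2))


def visible_radius(i_center, sigma, threshold):
    """Distance at which the profile drops to `threshold` (0 if never above it)."""
    if i_center <= threshold:
        return 0.0
    return sigma * math.sqrt(2.0 * math.log(i_center / threshold))


# Hypothetical example: 1000 lx at the source, sigma = 10 m, threshold = 1 lx.
radius = visible_radius(1000.0, 10.0, 1.0)
```

Because the radius grows only with the square root of the logarithm of the source brightness, σ has a much stronger effect on coverage than the peak illuminance does.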


d. Parameter Estimation

Quantifying the transcription and translation rate constants of individual genes within a model is often challenging, as traditional experimental techniques can neither measure general cellular properties across different time points nor assess the speed at which these cellular changes occur. Several methodologies are currently used for estimating RNA velocities, including steady-state models, Expectation-Maximization (EM) models, and deep learning models. Among them, VeloVI (velocity variational inference) is a deep generative model designed specifically for estimating RNA velocities. It infers RNA velocities through a model that shares information across all cells and genes while learning the same quantities as EM models, namely kinetic parameters and latent times. As its output, VeloVI returns an empirical posterior distribution of RNA velocities (posterior samples of gene-cell matrices), which can be incorporated into downstream analyses[10]. It also offers a layer of interpretability and model criticism lacking in previous methods, and it is considerably more flexible to extend.

Figure. 7 | Overview of the VeloVI model

(From: Deep generative modeling of transcriptional dynamics for RNA velocity analysis in single cells)

Figure. 8 | Correlated Gene Sequence Data

We have collected relevant gene sequence data (Figure 8) and employed the VeloVI framework integrated within a Python environment to conduct gene-level velocity analysis. The preprocessed and extracted specific gene sequence data were organized according to the format required by the VeloVI framework and served as input data to initiate the model training process. During training, the model learns the intrinsic dynamic properties of the data, ultimately outputting kinetic parameters such as transcription rates for the corresponding genes.


e. Solution and Sensitivity Analysis

Based on the constructed differential equations, parameter values in the model were determined through literature review and other relevant sources. The system was solved with the LSODA algorithm, a robust and versatile solver that handles both stiff and non-stiff problems by switching automatically between methods.

The results indicated that sucrose concentration peaked during the night, reflecting a diurnal rhythm. The expression levels of luxAB and CI exhibited an antagonistic relationship, with CI expression decreasing at night. Meanwhile, luxCDEFG and substrate concentrations rapidly increased in the initial stages before stabilizing at a constant rate.

Figure. 9 | Changes of the Key Parameters in the Model

Figure 10 depicts the changes in the activity of the lam promoter (σlam), which exhibited a substantial decline around 24 hours, indicative of an enhanced inhibitory effect by CI protein, suppressing downstream gene expression. Approximately 48 hours later, as luxAB expression peaked, promoter activity significantly intensified.

Figure. 10 | Changes of the Lam Promoter Activity

Finally, the variations in bioluminescence intensity are presented in Figure 11. The bioluminescence intensity fluctuated only slightly within the first 20 hours. It then intensified rapidly, reaching a peak of 1567 lx at 48.87 hours. As the populations of Vibrio natriegens and Synechococcus stabilized, the fluctuations in bioluminescence intensity gradually diminished, and the intensity settled at approximately 1000 lx.

Figure. 11 | Changes of the Illumination Degree

The table below outlines the illuminance values under various environmental conditions[11], highlighting the superiority of the gene circuit integrated into the Bio-Beacon Rescue Locator over the traditional lighting technology employed in TV Studios. This comparison not only visually demonstrates its advantages in brightness, durability, and energy efficiency but also underscores the potential of biotechnology in the field of emergency rescue localization. By harnessing the gene circuit, the Bio-Beacon is capable of emitting a brighter and more stable light, ensuring that rescue signals remain clearly visible even under complex and variable environmental conditions. This significantly enhances the efficiency and success rate of search and rescue operations.

To present the change in sea surface illumination, the light distribution was simulated with a Gaussian distribution and visualized with the MATLAB heat map tool. The maximum coverage radius is estimated to reach about 90 meters on the sea.

In the previous model, the parameter η represents the luminescence efficiency coefficient in the expression for the illuminance I. In practice, various regulatory factors (such as pH and temperature) can influence the luminous effect by changing the luminous efficiency of the gene circuit. To explore how changes in η influence the model results, we conducted a sensitivity analysis, adjusting η up and down by 20% relative to its original set value of 30% to simulate changes in external conditions.

Figure. 12 | Results of the Model Sensitivity Analysis

Figure 12 shows that adjusting the key parameter η affects the model results, but the overall trend does not change. Increasing η means that the luminous efficiency of the gene circuit improves, converting more substrate into light per unit time. However, while the peak of I increases significantly, more of the accumulated substrate is consumed, which leads to a decline in the luminous effect after 50 hours. When η is decreased, the maximum illuminance is lower than before the adjustment, but the output is more stable and fluctuates less after the peak, which is advantageous in longer rescues. In short, the sensitivity analysis shows that the model is robust: small changes in parameters under real conditions do not change its qualitative behavior.


f. Monte Carlo Simulation

The probability of being rescued or surviving after a person falls into the sea, as a function of time, can be simplified and described using a survival function[12], which is expressed as:

(1.33)

Here, R0 represents the initial rescue success rate (the success rate when rescue is initiated immediately after a person falls into the water), which depends on factors such as the readiness of the rescue team and the availability of equipment. With the development of related technologies, studies have shown that its value can reach approximately 85%[13]. λ is a decay coefficient indicating the rate at which the rescue success rate declines over time; it is the weighted sum of multiple influencing factors. Huang et al. concluded that the golden rescue time for a person falling into the water is 12 hours during summer[14]. We therefore describe λ using a piecewise function:

(1.34)
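A decaying survival function with a piecewise decay coefficient can be sketched as below. The exponential form, the accumulation of the decay over time (which keeps the curve continuous at the 12-hour boundary), and the two λ values are assumptions made for illustration; only R0 = 0.85 and the 12-hour golden rescue time come from the text above.

```python
import math

R0 = 0.85            # initial rescue success rate (quoted above)
GOLDEN_HOURS = 12.0  # golden rescue time in summer, hours (quoted above)


def rescue_probability(t, lam_early=0.05, lam_late=0.4):
    """R(t) = R0 * exp(-accumulated decay), with lambda piecewise constant.

    lam_early applies within the golden rescue time, lam_late afterwards.
    Both values are hypothetical placeholders, in units of 1/hour.
    """
    early = min(t, GOLDEN_HOURS)
    late = max(t - GOLDEN_HOURS, 0.0)
    return R0 * math.exp(-(lam_early * early + lam_late * late))


# Success probability sampled every 5 hours over two days.
curve = [rescue_probability(float(h)) for h in range(0, 49, 5)]
```

With a small λ inside the golden window and a larger one afterwards, the curve declines slowly at first and then drops sharply, which is the qualitative behavior the piecewise definition is meant to capture.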

Previously, the spreading of light on the sea surface was simulated using a Gaussian distribution. To explore the impact of the constructed bioluminescent genetic system on rescue success rates in practical rescue situations, based on the previous model, a Monte Carlo simulation algorithm was employed to simulate helicopter maritime search and rescue scenarios using a computer. Monte Carlo simulation is a mathematical technique based on random numbers, with its core idea being to approximate the solution to a complex problem through random sampling, which is used to simulate complex systems and solve computational problems, especially those involving multiple variables and a large amount of uncertainty[15]. According to research, helicopters typically conduct search and rescue operations at a relatively low altitude of around 50 meters during maritime search and rescue missions to ensure accurate target detection and effective rescue[16]. Within a 50-meter range, the minimum illuminance that can be recognized by the human eye is approximately 0.2 microlux (μlx)[17]. Therefore, we use this as a threshold and set the search area to 10,000 square meters. The computer will randomly select a sufficient number of sample points at each time interval. Without considering external light sources interfering with the rescue, if the light intensity at a point is greater than 0.2 μlx, it can be considered that the person overboard has the opportunity to be identified by the search and rescue personnel on the helicopter through the bioluminescent system and thus be rescued. Otherwise, they cannot be identified. Based on this idea, we wrote a Python program to compare the rescue success rates under original rescue conditions and those with the bioluminescent genetic system device installed. The results are shown in Figure 13.

Figure. 13 | Comparison of rescue success rates between the two scenarios
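The detection step of the simulation described above can be sketched as follows: sample points uniformly over a 100 m by 100 m search area (10,000 square meters), evaluate a Gaussian illuminance profile centered on the person overboard, and count the fraction of points above the 0.2 μlx threshold. This is a simplified illustration and not the program used to produce Figure 13; the function name, the source position at the center of the area, the value of σ, and the sample count are hypothetical.

```python
import math
import random

AREA_SIDE = 100.0   # side of the square search area, metres (100 m x 100 m)
THRESHOLD = 0.2e-6  # 0.2 microlux expressed in lux (threshold quoted above)


def detection_fraction(i_center, sigma, n_samples=20000, seed=0):
    """Fraction of random sample points whose illuminance exceeds the threshold.

    i_center : illuminance at the source (lux) at the time step of interest
    sigma    : spread of the Gaussian profile (metres), a hypothetical input
    """
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    half = AREA_SIDE / 2.0
    hits = 0
    for _ in range(n_samples):
        x = rng.uniform(-half, half)
        y = rng.uniform(-half, half)
        lux = i_center * math.exp(-(x * x + y * y) / (2.0 * sigma ** 2))
        if lux > THRESHOLD:
            hits += 1
    return hits / n_samples


# Hypothetical example: 1000 lx at the source with sigma = 5 m.
fraction = detection_fraction(1000.0, 5.0)
```

Repeating the estimate at each time step with the illuminance from the luminescence model, and combining it with the time-dependent survival function, gives a success-rate curve that can be compared with the baseline.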

It can be observed that under ordinary rescue conditions, the probability of successful rescue approaches almost zero around the 25-hour mark. Maritime search and rescue missions are characterized by long durations and large search areas, and beyond the golden rescue period, the physical regulation capabilities of individuals are largely lost, while search and rescue forces also face mental and material challenges. However, when the Bio-Beacon Rescue Locator is applied in this scenario, the rescue success rate stabilizes above 60% after 50 hours. Considering the growth characteristics of bacterial colonies, the probability of locating survivors through the bioluminescent system continues to increase over time. Therefore, the Bio-Beacon Rescue Locator can facilitate real-world rescue scenarios by synergizing with various technical means to leverage its advantages in mid-to-late stage rescues.

g. Optimization of Results

To achieve more efficient luminescence and improve the application potential of the product, the parameters proposed during the model solution need to be optimized. Considering the complexity of the differential equations and of the actual scenarios, a genetic algorithm (GA) was selected to optimize the preliminary results.

GA, an optimization algorithm based on the principles of biological evolution, seeks the optimal solution in the search space by simulating natural selection and genetic mechanisms. It consists of three basic operations: selection, crossover, and mutation. In the selection operation, the fitness of each individual is evaluated according to the fitness function, and individuals with higher fitness are selected to enter the next generation. The crossover operation simulates genetic recombination, combining the genomes of two individuals into a new individual. The mutation operation simulates the stochastic process of gene mutation by randomly changing an individual's genes. GA is adaptive, parallelizable, and robust, and is widely used in machine learning, artificial intelligence, optimization problems, and other fields[18]. The framework of the genetic algorithm is shown in Figure 14.

Figure. 14 | Flow Chart of the GA Implementation
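The three operations described above can be illustrated with a minimal, library-free sketch. It is not the DEAP implementation referred to in this section: the objective function, population size, mutation settings, and the use of tournament selection with elitism are all assumptions chosen only to make the loop runnable.

```python
import random

def fitness(individual):
    # Hypothetical objective (NOT the luminescence model): best when every gene equals 3
    return -sum((gene - 3.0) ** 2 for gene in individual)

def tournament(population, k=3):
    # Selection: pick k random individuals and keep the fittest one
    return max(random.sample(population, k), key=fitness)

def crossover(parent_a, parent_b):
    # One-point crossover: head of one parent joined to the tail of the other
    point = random.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]

def mutate(individual, rate=0.2, sigma=0.3):
    # Mutation: each gene is perturbed by Gaussian noise with probability `rate`
    return [g + random.gauss(0.0, sigma) if random.random() < rate else g
            for g in individual]

def run_ga(n_genes=4, pop_size=40, generations=60, seed=0):
    random.seed(seed)
    population = [[random.uniform(-10.0, 10.0) for _ in range(n_genes)]
                  for _ in range(pop_size)]
    history = []  # best fitness per generation, i.e. a convergence curve
    for _ in range(generations):
        best = max(population, key=fitness)
        history.append(fitness(best))
        children = [best]  # elitism: the current best survives unchanged
        while len(children) < pop_size:
            child = crossover(tournament(population), tournament(population))
            children.append(mutate(child))
        population = children
    return max(population, key=fitness), history

if __name__ == "__main__":
    best, history = run_ga()
    print("best individual:", best)
    print("first and last best fitness:", history[0], history[-1])
```

Because the best individual is carried over unchanged, the recorded best fitness never decreases, which is the kind of flattening curve a convergence plot displays.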

We implemented the GA with DEAP (Distributed Evolutionary Algorithms in Python); the specific code is given in the appendix. Figure 15 shows the iteration behavior: the convergence curve gradually stabilizes at a value close to 4600. This trend indicates that the algorithm approaches the optimal solution during iteration and reaches a steady state after a relatively small number of iterations.

Figure. 15 | Convergence Curves Optimized by Genetic Algorithm

Figure 16 shows the relationship between illumination and other substances after genetic algorithm optimization. The optimal I value found was 4582.49, an improvement of 192.4% over the initial state, which indicates that the configuration of the gene circuit can be optimized to significantly improve the system's response efficiency and regulation ability with respect to illumination. This performance improvement demonstrates the potential of genetic algorithms for optimizing complex systems, and shows that fine-tuning the internal parameters and configuration of the system can outperform the traditional design framework.

Figure. 16 | Relationships among illuminance, Fact and expression of CI

Space Feasibility Analysis

With the development of the aerospace industry, we hope that the Bio-Beacon Rescue Locator will no longer be limited to lifesaving activities on Earth but can also be applied in space, opening up a broader range of applications. The HP group has made an initial comparison of the marine and space environments to understand how these two extreme environments differ along multiple dimensions, including radiation intensity, gravity, gas composition, salinity (for oceans), humidity, and potential microbial infection risks. This comparison informs the adaptation and optimization of the two product designs: life jackets for marine environments and luminous space suits tailored to space exploration.

To further advance the application of the Bio-Beacon Rescue Locator in space, we will proceed to systematically collect and analyze data from various planets within the Solar System from multiple angles. This process will focus on evaluating the product's suitability and performance in different planetary environments, aiming to identify its potential application values. Based on recommendations from relevant literature[19-20], we have compiled a series of primary evaluation indicators:

Figure. 17 | Various evaluation metrics within the model

① Lighting Conditions:

External Light Intensity: As luxAB luminescence is regulated by ultraviolet (UV) light, assessing the planet's diurnal cycle and light intensity is crucial to understanding the overall environmental impact on the device's visibility. We collected data on each planet's rotation and revolution periods. The rotation period determines the day-night cycle, which influences the rescue locator's luminescence mode during day and night. For instance, on planets with shorter rotation periods, the luxAB system may need to switch from UV inhibition to luminescence within a shorter timeframe. The revolution period, in turn, reflects the planet's distance from the Sun and therefore the light intensity and UV radiation levels it receives.

② Atmospheric Composition and Pressure:

Oxygen Content: The luxAB system requires oxygen for oxidation reactions, making the target planet's oxygen concentration a pivotal factor.

Atmosphere and Atmospheric Pressure: While not directly impacting the luxAB's luminescence mechanism, extreme atmospheric pressures can potentially affect the device's physical integrity and functionality.

③ Temperature:

Average Temperature: Given the vast differences in temperatures across planets, it is necessary to evaluate the stability and activity of the luxAB system and its substrates within the temperature range of the target planet.

Temperature Fluctuations: Diurnal or seasonal temperature variations may impact the long-term operation of the device.

④ Gravity:

Gravity can potentially influence the distribution of internal fluids and reaction rates within the device. Although such effects may be relatively minor in the luxAB system, they still need to be considered.

⑤ Supply Conditions:

Closest Distance to Earth: For astronauts executing missions, knowing the closest distance to Earth facilitates optimal mission design, ensuring timely communication with Earth in case of emergencies (e.g. energy shortages), and also reveals interplanetary environmental differences.

Water Resources: The EnvZ-OmpR two-component system is sensitive to osmolarity, affecting the expression of luxCDEFG. Additionally, when astronauts lose contact with the outside world, water resources on the planet become crucial for survival. Therefore, it is essential to understand factors that may affect osmolarity, such as water content on the planet's surface.
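The link between revolution period and light intensity noted under Lighting Conditions can be made concrete with a short sketch. It assumes circular orbits around the Sun, applies Kepler's third law (a³ = T² with a in AU and T in years) and the inverse-square law, and ignores atmospheric absorption. The orbital periods are approximate textbook values, not the data set compiled for this model.

```python
def semi_major_axis_au(period_years):
    # Kepler's third law for bodies orbiting the Sun: a^3 = T^2 (a in AU, T in years)
    return period_years ** (2.0 / 3.0)

def relative_sunlight(period_years):
    # Inverse-square law: sunlight intensity relative to that at Earth's orbit
    distance_au = semi_major_axis_au(period_years)
    return 1.0 / distance_au ** 2

if __name__ == "__main__":
    # Approximate orbital periods in Earth years (illustrative values)
    periods = {"Earth": 1.0, "Mars": 1.881, "Neptune": 164.8}
    for planet, period in periods.items():
        print(planet,
              round(semi_major_axis_au(period), 2), "AU,",
              round(relative_sunlight(period), 4), "x Earth sunlight")
```

Under these assumptions Mars receives somewhat less than half of Earth's sunlight, while Neptune receives roughly a thousandth of it.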

Upon reviewing relevant materials, we have collected the corresponding data for these evaluation indicators, as shown in Figure 18. To assess the suitability of the Bio-Beacon Rescue Locator on various planets, we use the Rank-Sum Ratio (RSR) comprehensive evaluation method, which ranks the evaluation indicators and uses the average rank as the evaluation criterion. This method enables comprehensive evaluation of indicators with different units of measurement[21].

Figure. 18 | Data of Evaluation Indicators within Each Planet

First, the data were preprocessed. We treated factors such as atmospheric pressure and average surface temperature as intermediate indicators, for which values closer to Earth's are more suitable for human habitation. For other indicators, such as temperature difference, smaller values are better (negative indicators).

For the intermediate indicators, we applied a linear transformation so that the results fall in the interval [0,1]. When the original value equals the ideal value, the result is 1; the closer the original value is to the ideal value, the closer the result is to 1. The principle can be expressed as:

(1.35)

For the negative indicators, we converted them into positive indicators so that the results also fall in the interval [0,1]:

(1.36)
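The two preprocessing steps can be sketched as follows. The exact forms of equations (1.35) and (1.36) are not reproduced here; the code assumes the common choices 1 − |x − ideal| / max|x − ideal| for intermediate indicators and (max − x) / (max − min) for negative indicators, both of which satisfy the properties stated above. The function names and sample values are hypothetical.

```python
def normalize_intermediate(values, ideal):
    # Assumed form: 1 - |x - ideal| / max|x - ideal|; equals 1 at the ideal value
    deviations = [abs(v - ideal) for v in values]
    max_dev = max(deviations)
    if max_dev == 0:
        return [1.0 for _ in values]  # every value already equals the ideal
    return [1.0 - d / max_dev for d in deviations]

def normalize_negative(values):
    # Assumed form: (max - x) / (max - min); the smallest raw value maps to 1
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]  # no spread, so all objects score equally
    return [(hi - v) / (hi - lo) for v in values]

if __name__ == "__main__":
    # Placeholder numbers for illustration only, not the values in Figure 18
    print(normalize_intermediate([0.0, 5.0, 10.0], ideal=5.0))
    print(normalize_negative([1.0, 2.0, 3.0]))
```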

There are n=8 evaluation objects and m=12 evaluation indicators. On this basis, we constructed the 8×12 data matrix and assigned ranks to it. To overcome the shortcoming of the integer rank-sum ratio method, which discards the original indicator values when ranking, we adopted the non-integer rank-sum ratio method, in which the assigned rank has a quantitative linear correspondence with the original indicator value. This yields the rank matrix R=(Rij)8×12.
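A minimal sketch of non-integer ranking for a single indicator column is given below. It assumes the usual linear form R = 1 + (n − 1)(x − min) / (max − min) for a positive indicator, so the worst object receives rank 1 and the best receives rank n. The handling of a constant column (shared mean rank) is an added assumption.

```python
def non_integer_ranks(column):
    # Linear rank in [1, n]: keeps the spacing of the original indicator values
    n = len(column)
    lo, hi = min(column), max(column)
    if hi == lo:
        return [(n + 1) / 2.0] * n  # assumption: identical values share the mean rank
    return [1.0 + (n - 1) * (v - lo) / (hi - lo) for v in column]

if __name__ == "__main__":
    # Placeholder scores for three objects on one indicator
    print(non_integer_ranks([0.0, 0.25, 1.0]))
```

Unlike integer ranking, the middle object here lands closer to the worst object than to the best, mirroring its raw score.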

To determine the weights from the information content of the data itself and avoid the bias caused by subjective judgment, the entropy weight method is used to weight the rank-sum ratio, which can be expressed as follows:

(1.37)

Rij is the rank of the i-th object on the j-th indicator, and Wj is the weight of the j-th indicator; the weights sum to 1. The larger the WRSRi value, the better the evaluation object.

The entropy weight results show that the indicator with the largest weight is whether there is water on the planet (26.645%), and the indicator with the smallest weight is whether there is an atmosphere (5.133%), reflecting the small difference among planets on the latter indicator.

We then compiled the WRSR frequency distribution table, listing the frequency f of each group and calculating the cumulative frequency cf and the cumulative relative frequency p of each group. p was converted into the probability unit (probit), and a regression was fitted with the probit value as the independent variable and WRSR as the dependent variable.
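The weighting step can be sketched as below, assuming the standard entropy weight formulation (column proportions, Shannon entropy scaled by ln n, weights proportional to 1 − entropy) and the standard weighted rank-sum ratio WRSRi = (1/n) Σj Wj Rij. Equation (1.37) is not reproduced here, so these forms are assumptions; the function names are hypothetical.

```python
import math

def entropy_weights(matrix):
    # matrix[i][j]: normalized, non-negative score of object i on indicator j
    n, m = len(matrix), len(matrix[0])
    divergences = []
    for j in range(m):
        column = [matrix[i][j] for i in range(n)]
        total = sum(column)
        if total == 0:
            divergences.append(0.0)  # assumption: an all-zero column carries no information
            continue
        entropy = 0.0
        for value in column:
            if value > 0:  # 0 * ln(0) is treated as 0
                p = value / total
                entropy -= p * math.log(p)
        entropy /= math.log(n)  # scale so that entropy lies in [0, 1]
        divergences.append(1.0 - entropy)
    total_div = sum(divergences)
    if total_div == 0:
        return [1.0 / m] * m  # no indicator discriminates: fall back to equal weights
    return [d / total_div for d in divergences]

def wrsr(ranks, weights):
    # ranks[i][j]: rank of object i on indicator j; weights sum to 1
    n = len(ranks)
    return [sum(w * r for w, r in zip(weights, row)) / n for row in ranks]

if __name__ == "__main__":
    # Placeholder matrix: indicator 0 is identical for both objects, indicator 1 differs
    print(entropy_weights([[1.0, 0.0], [1.0, 1.0]]))
    print(wrsr([[1.0, 1.0], [2.0, 2.0]], [0.5, 0.5]))
```

An indicator on which all objects score alike receives almost no weight, which matches the interpretation of a small weight as a small difference between planets.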

The F test gives a significance P value below 0.001, so the regression is significant and the null hypothesis that the regression coefficient is 0 is rejected. The goodness of fit R² is 0.981, indicating that the model fits well. For collinearity, all VIF values are less than 10, so the model has no multicollinearity problem. The formula for the model is below:

(1.38)
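The probit conversion and regression can be sketched as follows. The code assumes that every object has a distinct WRSR value (so each group has frequency 1), uses the conventional probit = 5 + z, and replaces the last cumulative frequency by 1 − 1/(4n) so that it stays below 1. The fitted coefficients of equation (1.38) are not reproduced; `probits` and `fit_line` are hypothetical helper names.

```python
from statistics import NormalDist

def probits(wrsr_values):
    # Rank objects by WRSR, convert cumulative frequency p to probit = 5 + z(p)
    n = len(wrsr_values)
    order = sorted(range(n), key=lambda i: wrsr_values[i])
    result = [0.0] * n
    for rank, index in enumerate(order, start=1):
        p = rank / n
        if rank == n:
            p = 1.0 - 1.0 / (4 * n)  # conventional correction: keeps p below 1
        result[index] = 5.0 + NormalDist().inv_cdf(p)
    return result

def fit_line(x, y):
    # Ordinary least squares for y = slope * x + intercept
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

if __name__ == "__main__":
    scores = [0.2, 0.5, 0.9, 0.7]  # placeholder WRSR values, not the project results
    x = probits(scores)
    slope, intercept = fit_line(x, scores)
    fitted = [slope * v + intercept for v in x]
    print(sorted(zip(fitted, scores), reverse=True))
```

Sorting the objects by the fitted values gives the ordering used in the final classification step.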

Finally, the evaluation objects are sorted and classified according to the WRSR estimates calculated by the regression equation. The results are shown in Table 5.


Figure. 19 | WRSR fit values for each planet

The evaluation results indicate that, apart from Earth, Mars exhibits the highest suitability for the Bio-Beacon Rescue Locator due to its minimal differences in rotational period and water resources compared to Earth's environment. Conversely, the product is deemed unsuitable for application during voyages to Neptune, primarily attributed to the lack of sufficient lighting conditions necessary for the activation of the puv promoter and subsequent expression of downstream genes within the system. This finding aligns with information gathered by the HP group through interviews, particularly the exchanges with Liu Liwei, a student specializing in aeronautics, who further confirmed Mars as the preferred target for the application scenarios of our project.

In the future, we aim to incorporate the DNA repair mechanism of Deinococcus radiodurans into the design of engineered bacteria, with the goal of providing astronauts with more comprehensive and reliable life support during missions to Mars and other potential space exploration endeavors.

Molecular Docking Test of Repressor Protein

a.Background

In the anti-leakage module, we designed an active killing circuit that senses sucrose. The overall circuit is relatively complicated and, given the lack of materials for wet experiments, difficult to build in full in the wet lab. The effectiveness of the circuit depends mainly on two repressor proteins, ScrR and lacI, so we decided to use molecular docking to simulate their function. lacI is widely used and located downstream, while ScrR plays the more important role in the sucrose-sensing part of the circuit. We therefore tested the sucrose-sensing part, that is, we verified whether the presence of sucrose inhibits ScrR's repression of Pscr.

Rather than the conventional wet-lab approach, we used computer simulation to carry out a molecular docking test, which also broadened the range of methods in our project. The test is not only a good simulation of how the sucrose receptor works; it was also an exploratory test that inspired us to study the system at a more microscopic level and provides directions for future project optimization.

Figure. 20 | Data of evaluation indicators within each planet

b.Steps

1. Obtain sequences of ScrR and Pscr from Biobricks.

2. Download the 3D structure of the small molecule sucrose from PubChem to obtain the structure file Sucrose.sdf.

3. Translate the ScrR sequence into protein to obtain the protein file ScrR.pdb, then download 6NDI from RCSB for subsequent docking with sucrose.

4. Use 3dDNAfold to obtain the DNA structure file.

5. Use AutoDock Tools to process ScrR into a single-chain pdbqt file and carry out global docking with sucrose (because the protein is large, it was divided into two parts for docking, and the highest-ranked result, that is, the one with the highest affinity, was taken). Then prepare for molecular dynamics simulation.

6. Conduct molecular dynamics simulation for 240 ps (DS standard environment, default force field).

Figure. 21 | Raw output parameters of molecular dynamics simulation

7. Before and after molecular dynamics simulation, use HDock to dock the DNA with the protein; 100 models were generated for each case, and the 10 highest-ranked (highest-affinity) models were selected for comparison.

8. Visualize the docking results with PyMOL and other tools for comparative analysis.
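The comparison in step 7 can be sketched as a small helper that keeps the 10 best of 100 docking scores for each condition and compares their means. It assumes the scores follow the convention that a more negative docking score indicates stronger predicted binding. The score lists are placeholders, not the project's docking output, and the function names are hypothetical.

```python
from statistics import mean

def top_models(scores, k=10):
    # Assumption: more negative score = better model, so sort ascending and keep k
    return sorted(scores)[:k]

def compare_top_scores(before, after, k=10):
    # Mean score of the k best models before and after MD, and their difference
    mean_before = mean(top_models(before, k))
    mean_after = mean(top_models(after, k))
    return mean_before, mean_after, mean_after - mean_before

if __name__ == "__main__":
    # Placeholder scores for 100 models in each condition
    before_md = [float(s) for s in range(-100, 0)]
    after_md = [s + 5.0 for s in before_md]
    print(compare_top_scores(before_md, after_md))
```

A difference close to zero would correspond to an affinity that is essentially unchanged by the simulation.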

c.Outcome analysis

1. First, we examined ScrR after sucrose treatment. Comparing the structure before and after molecular dynamics simulation, the ScrR protein docked with sucrose did change, indicating that the presence of sucrose can induce structural changes in the ScrR repressor protein.

Figure. 22 | ScrR conformational change before and after sucrose docking with MD simulation

2. Next, we analyzed the docking results of ScrR and Pscr and found no significant difference in rank value before and after molecular dynamics simulation. This means that the affinity between ScrR and Pscr did not change significantly after sucrose treatment, which puzzled us, because repressor proteins are generally inactivated by being released from the DNA.

Figure. 23 | Affinity comparison between ScrR and Pscr before and after MD simulation

3. Reflecting on the affinity results and consulting reference material, we reasoned that a change in binding mode could also inhibit the function of a repressor protein, so we analyzed the relevant data and structures. Before MD simulation, the DNA docked neatly into the protein's cleft; after MD simulation, the cleft had narrowed, forcing the DNA away from the protein so that only a few sites remained bound. Transcription-related enzymes can then initiate transcription by binding to parts of the hairpin structure. In addition, according to the data, DNA dissociates during this process, so transcription can in theory start as soon as the dissociation equilibrium is reached.

Figure. 24 | Changes of ScrR and Pscr combination structure before and after MD simulation

4. Because the change in binding mode interested us, we enlarged and compared the docking sites of ScrR and Pscr before and after MD simulation and listed the changes in binding sites. We believe that the number and specific location of binding sites can affect the degree of DNA exposure and thus the activation of Pscr.

Figure. 25 | Binding sites of ScrR and Pscr before MD simulation
Figure. 26 | Binding sites of ScrR and Pscr after MD simulation
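Listing the changes in binding sites amounts to a set comparison, sketched below. The residue labels are hypothetical placeholders rather than the sites shown in Figures 25 and 26, and the function name is an assumption.

```python
def compare_binding_sites(before, after):
    # Classify contact residues as retained, lost, or gained across the MD simulation
    before, after = set(before), set(after)
    return {
        "retained": sorted(before & after),
        "lost": sorted(before - after),
        "gained": sorted(after - before),
    }

if __name__ == "__main__":
    # Hypothetical residue labels for illustration only
    sites_before = ["ARG12", "LYS45", "SER50", "THR78"]
    sites_after = ["LYS45", "GLN90"]
    changes = compare_binding_sites(sites_before, sites_after)
    for kind, residues in changes.items():
        print(kind, len(residues), residues)
```

The counts per category give a simple summary of how much of the protein-DNA interface survives the simulation.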

d. Summary

This test broadly verified the inhibitory effect of sucrose on ScrR and also served as an exploratory test. This verification method gave us a view of the system at the molecular level and showed that molecular docking can be used for more circuit tests in the future, which will help identify potential molecular targets for optimizing the project.

  1. Wang, H., Liu, Z., Liu, Z., Wang, X., & Wang, J. (2022). GIS-based analysis on the spatial patterns of global maritime accidents. Ocean Engineering, 245, 110569.
  2. Zhou, K., Xing, W., Wang, J., Li, H., & Yang, Z. (2024). A data-driven risk model for maritime casualty analysis: A global perspective. Reliability Engineering & System Safety, 244, 109925.
  3. Mantas, C. J., Castellano, J. G., Moral-García, S., & Abellán, J. (2019). A comparison of random forest based algorithms: random credal random forest versus oblique random forest. Soft Computing (Berlin, Germany), 23(21), 10739–10754.
  4. Yuan, P., Lv, Z., & Zhou, G. (2014). Growth kinetics of 3 species of microalgae treated with petroleum hydrocarbon. Hai Yang Ke Xue, 38(10), 46–51.
  5. Tian, T., Wang, K.-Y., Liang, G.-Y., Jin, Y.-B., Li, J., Shao, X.-X., & Xu, Y.-C. (2018). Effects of immobilization process on physiological and biochemical characteristics of Synechococcus. Wēishēngwùxué Tōngbào, 45(9), 1972.
  6. Hubert, A., Aquino, T., Tabuteau, H., Meheust, Y., & Le Borgne, T. (2020). Enhanced and non-monotonic effective kinetics of solute pulses under Michaelis-Menten reactions. Advances in Water Resources, 146, 103739.
  7. Wu, F., Liang, Y., Zhang, Y., Huo, Y., & Wang, Q. (2020). Construction of seamless genome editing system for fast-growing Vibrio natriegens. Sheng Wu Gong Cheng Xue Bao = Chinese Journal of Biotechnology, 36(11), 2387.
  8. Bernstein, J. A., Lin, P. H., Cohen, S. N., & Lin-Chao, S. (2004). Global analysis of Escherichia coli RNA degradosome function using DNA microarrays. Proceedings of the National Academy of Sciences of the United States of America, 101(9), 2758–2763.
  9. Zhang, W., & Hou, X. (2018). Estimation Algorithm of Atmospheric Light Based on Gaussian Distribution. Ji Suan Ji Ke Xue, 45(4).
  10. Gayoso, A., Weiler, P., Lotfollahi, M., Klein, D., Hong, J., Streets, A., … Yosef, N. (2024). Deep generative modeling of transcriptional dynamics for RNA velocity analysis in single cells. Nature Methods, 21(1), 50–59.
  11. Ailin, C., Zaizhou, L. I., Qipeng, H. E., Xiaoyang, H. E., Zhisheng, W., & Nianyu, Z. (2018). Study on Matching Value of Illumination in Night Lighting Environments. 2018 15th China International Forum on Solid State Lighting: International Forum on Wide Bandgap Semiconductors China, SSLChina: IFWS 2018, 123–126.
  12. Cope, S., Chan, K., & Jansen, J. P. (2020). Multivariate network meta-analysis of survival function parameters. Research Synthesis Methods, 11(3), 443–456.
  13. Zhang, J., & Zhu, Y. (2022). Maritime search and rescue probability of containment model design and simulation. Journal of Physics: Conference Series, 2258(1), 12040.
  14. Huang, M. D. (2014). On the Prime Time for Rescue at Sea in Distress. China Maritime Safety, 12, 40-42.
  15. Reiweger, I., Genswein, M., Paal, P., & Schweizer, J. (2017). A concept for optimizing avalanche rescue strategies using a Monte Carlo simulation approach. PloS One, 12(5), e0175877.
  16. Shen, Y. (2024). The Unique Advantages and Applications of Helicopter Rescue at Sea. Pearl River Water Transport, 4, 114-116.
  17. Zhang, J. B., Guo, W. P., Xia, M., Yang, K. C., & Li, W. (2018). Design and Implementation of an LED Navigation System to Assist Ship Entry into Port. Acta Optica Sinica, 38(10), 310-318.
  18. Selvam, R., Lim, I. H. Y., Lewis, J. C., Lim, C. H., Yap, M. K. K., & Tan, H. S. (2023). Selecting antibacterial aptamers against the BamA protein in Pseudomonas aeruginosa by incorporating genetic algorithm to optimise computational screening method. Scientific Reports, 13(1), 7582.
  19. Heliere, F., Lin, C.-C., Corr, H., & Vaughan, D. (2007). Radio echo sounding of Pine Island Glacier, West Antarctica; aperture synthesis processing and analysis of feasibility from space. IEEE Transactions on Geoscience and Remote Sensing, 45(8), 2573–2582.
  20. Periola, A., Alonge, A., & Ogudo, K. (2023). Space-Based Data Centers and Cooling: Feasibility Analysis via Multi-Criteria and Query Search for Water-Bearing Asteroids Showing Novel Underlying Regular and Symmetric Patterns. Symmetry (Basel), 15(7), 1326.
  21. Scientific Platform Serving for Statistics Professional 2021. SPSSPRO. (Version 1.0.11)[Online Application Software]. Retrieved from https://www.spsspro.com.