UZurich - iGEM 2024

Aim of our model

The dry lab played a crucial role in advancing our project by complementing wet lab experiments. We used computational models to simulate biological processes, predict potential outcomes, and provide deeper insights into complex systems.

Initially, we considered a variety of approaches, including diffusion models to study biofilm expansion and simulations of plant growth with and without our engineered bacteria. However, we realized that the most important question is how the metabolism of our bacteria changes once they are induced to produce a hyper-robust biofilm, and that this is where computational modeling can contribute most to the project. Specifically, our objective was to gain insight into the growth costs associated with biofilm production, its limits, and how the bacterial metabolic network behaves under different realistic conditions, including laboratory media, bulk soil, rhizosphere environments, and drought.

Flux Balance Analysis (FBA) proved to be a valuable tool to address these questions. It is a method of simulating the metabolism of mostly single-celled organisms and offers insights into how our genetically modified bacteria will perform in agricultural applications, which cannot be investigated in field experiments due to Swiss regulations.

We have made all our code available here, including all scripts, data, graphs, and tables so that future iGEM teams and other interested parties can utilize the scripts for their analyses or as a starting point on how to work with FBA.

What is Flux Balance Analysis?

Overview

FBA uses a reconstructed genome-scale metabolic network to study the biochemical processes within a cell and to predict its growth by maximizing a biologically significant objective function, usually biomass production [1]. A metabolic network maps the chemical reactions within an organism that convert nutrients into energy and synthesize the essential building blocks required for growth and maintenance. These models include all known metabolic reactions and the genes encoding the enzymes required to catalyze those reactions. A genome-scale metabolic model (GEM) includes all the metabolically relevant genes for the organism of interest. It contains a set of $m$ metabolites, a set of $n$ stoichiometric reactions between these metabolites, and the rules linking genes to their corresponding reactions. Each reaction carries a flux, meaning the rate of flow of molecules through that reaction. Defining and modeling a GEM is the first step in FBA.

Figure 1: Workflow used when working with FBA. Created using BioRender.com

The second step involves making some key assumptions about the model’s functions and structure. The most important assumption in FBA is the steady-state assumption. This means that we assume the concentrations of all metabolites to remain constant over time. This makes it possible to solve the optimization problem using linear programming. Next, we define an expected flux range for each reaction, considering important details like reaction directionality and efficiency.

FBA requires an objective function to optimize for. Usually, an artificial ‘biomass production’ reaction is used as the objective function, because it is a good approximation to the growth rate of the organism. All the flux calculations aim to maximize this process.

In a third step, we need to specify the so-called exchange reactions responsible for importing and exporting nutrients to and from the cell, to simulate the environment in which our organism grows. They vary significantly under different conditions and must be considered with great care to ensure the model produces accurate and reliable results.

Formal Definition

In this section we will look in more detail at the mathematical representation of the metabolism and the computational methods used in an FBA. This is mainly intended for readers who are particularly interested in the computational background.

Formally, the steady state assumption is described as follows:

Let $v_i \in \mathbb{R}$ be the flux through the $i$-th reaction, where $i \in \{1,...,n\} \subset \mathbb{N}$, usually given in $\frac{mmol}{gDW \cdot h}$, let $x$ be a metabolite with concentration $[x]$, and let $s_i \in \mathbb{R}$ be the stoichiometric coefficient of $x$ in the $i$-th reaction. Then $$\sum_{i=1}^n s_i \cdot v_i = \frac{d}{dt} [x] = 0.$$ Collecting the coefficients of all $m$ metabolites in a stoichiometric $m \times n$ matrix $\mathbf{S}$, with one row per metabolite and one column per reaction, and all fluxes in a vector $\mathbf{v} \in \mathbb{R}^n$, we get $$\mathbf{Sv} = 0.$$ The additional assumptions for the bounds of each reaction flux are defined in two vectors $\mathbf{a},\mathbf{b} \in \mathbb{R}^n$. The first one contains all lower bounds and the second all upper bounds. The objective function $f(\mathbf{v})$, which must be linear in $\mathbf{v}$ for the problem to be a linear program, is what the FBA optimizes by solving for the flux distribution in the network. The problem solved by the FBA is thus described by $$\max_{\mathbf{v} \in \mathbb{R}^n} f(\mathbf{v}), \text{ such that } \mathbf{Sv}=0,$$ $$a_i \leq v_i \leq b_i \ \forall i \in \{1,...,n\}$$

Let’s consider a simple example to make this concept easier to understand:

Figure 2: Simple example of a metabolic network. Created with BioRender.com

In this example we have five metabolites in the internal compartment $(A, B, C, D, E)$ and four metabolites in the external compartment $(A\_e, B\_e, C\_e, E\_e)$. There are seven reactions involving these metabolites: four exchange reactions $({EX}\_A, {EX}\_B, {EX}\_C, {EX}\_E)$, which transport metabolites between the internal and external compartments, and three internal reactions $(R_1, R_2, R_3)$, where metabolites are converted into other metabolites.

Instead of having this long list of reactions, we can very efficiently describe this network in a stoichiometric $m \times n$ matrix $\mathbf{S}$, where each row represents a metabolite, and each column represents a reaction. Often there are a few other reactions added to $\mathbf S$, one of which is the biomass reaction, which is not a real biochemical reaction but rather a summary of several reactions that describe the growth rate of the organism.

Figure 3: The metabolic network is converted into a stoichiometric matrix, where the columns represent the reactions present in the network and the rows represent the metabolites. Some artificial columns may be added (for example biomass). Created with BioRender.com

$EX\_A: \quad A\_e$ → $A$
$EX\_B: \quad B\_e$ → $B$
$EX\_C: \quad C\_e$ → $C$
$EX\_E: \quad E\_e$ → $E$

$R_1: \quad 3A$ → $D$
$R_2: \quad B+2C$ → $D$
$R_3: \quad D$ → $E$

This matrix defines a set of linear equations which can be solved by relatively simple algorithms optimizing for the maximal flux in the objective function, meaning the highest achievable rate of the desired metabolic process under the given constraints.
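As a minimal sketch, the stoichiometric matrix of the toy network can be written out from the seven reactions listed above. The row and column order is an arbitrary choice, the external metabolites are treated as lying outside the system boundary (so only the five internal metabolites get a row), and the flux vector is a hypothetical example picked to satisfy the steady state.

```python
# Rows: internal metabolites; columns: reactions (order is an arbitrary choice).
metabolites = ["A", "B", "C", "D", "E"]
reactions = ["EX_A", "EX_B", "EX_C", "EX_E", "R1", "R2", "R3"]

# Stoichiometric matrix S: negative = consumed, positive = produced.
S = [
    # EX_A EX_B EX_C EX_E  R1   R2   R3
    [  1,   0,   0,   0,  -3,   0,   0],  # A
    [  0,   1,   0,   0,   0,  -1,   0],  # B
    [  0,   0,   1,   0,   0,  -2,   0],  # C
    [  0,   0,   0,   0,   1,   1,  -1],  # D
    [  0,   0,   0,   1,   0,   0,   1],  # E
]


def net_production(S, v):
    """Return S·v, the net production rate of each metabolite."""
    return [sum(s_ij * v_j for s_ij, v_j in zip(row, v)) for row in S]


# Hypothetical flux vector; the negative EX_E flux means that E is exported.
v = [3, 1, 2, -2, 1, 1, 2]
balance = dict(zip(metabolites, net_production(S, v)))
print(balance)
```

Every entry of `balance` is zero for this flux vector, which is exactly the steady-state condition $\mathbf{Sv} = 0$.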

Earlier, we mentioned solving a linear programming problem. But what does that actually mean? In our project, we used the Python library COBRApy, which relies on a solver called the GNU Linear Programming Kit (GLPK). GLPK uses the Simplex algorithm at its core. This algorithm works by iterating along the edges of the feasible solution space—determined by the constraints of the metabolic network—to find the optimal solution, such as maximizing biomass production. The solver then returns the flux values we are interested in analyzing.
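To illustrate why the optimum lies at a corner of the feasible space, the sketch below solves the toy network by brute-force vertex enumeration. This is not the Simplex implementation used by GLPK, only a small stand-in for two variables. The steady state is used to reduce the network to the two free fluxes $v_{R_1}$ and $v_{R_2}$, the objective is the flux through $R_3$, and the uptake limits (6, 1 and 4) are hypothetical values chosen for illustration.

```python
from itertools import combinations


def solve_2d_lp(objective, constraints, tol=1e-9):
    """Maximize objective . (x, y) subject to a . (x, y) <= b for every (a, b).

    Brute force: intersect every pair of constraint boundaries, keep the
    feasible intersection points (the vertices) and return the best one.
    """
    best_value, best_point = None, None
    for (a1, b1), (a2, b2) in combinations(constraints, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < tol:
            continue  # parallel boundaries do not meet in a single point
        # Cramer's rule for the intersection of the two boundary lines.
        x = (b1 * a2[1] - a1[1] * b2) / det
        y = (a1[0] * b2 - a2[0] * b1) / det
        feasible = all(a[0] * x + a[1] * y <= b + tol for a, b in constraints)
        if feasible:
            value = objective[0] * x + objective[1] * y
            if best_value is None or value > best_value:
                best_value, best_point = value, (x, y)
    return best_value, best_point


# Free variables: (v_R1, v_R2). The steady state fixes the remaining fluxes:
# v_EX_A = 3*v_R1, v_EX_B = v_R2, v_EX_C = 2*v_R2, v_R3 = v_R1 + v_R2.
constraints = [
    ((3, 0), 6),   # uptake of A: 3*v_R1 <= 6  (hypothetical limit)
    ((0, 1), 1),   # uptake of B:   v_R2 <= 1  (hypothetical limit)
    ((0, 2), 4),   # uptake of C: 2*v_R2 <= 4  (hypothetical limit)
    ((-1, 0), 0),  # v_R1 >= 0
    ((0, -1), 0),  # v_R2 >= 0
]
best_value, best_point = solve_2d_lp(objective=(1, 1), constraints=constraints)
print(best_value, best_point)
```

With these limits the best vertex is $v_{R_1} = 2$, $v_{R_2} = 1$, giving a maximal $R_3$ flux of 3. The supply of A and B is exhausted, while C is left partly unused. Enumerating all vertices scales poorly, which is why real solvers walk along the edges instead.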

Figure 4: The Simplex algorithm efficiently maximizes the objective function within the allowable solution space. Created with BioRender.com

For more detailed information, consult the COBRApy documentation [2].

Project Structure

We divided our modeling process into different stages to keep a manageable overview of the project. Each stage increases in complexity, providing a deeper understanding of the metabolic behavior of our bacteria across various perspectives and conditions.

Stage 0: Preliminary tests with the iJN1463 model [3].
Stage 1: Initial analysis of growth rate under biofilm production.
Stage 2: Gene knockout assay to improve biofilm production.
Stage 3: Adapted experiments from stage 1 and 2 in different conditions, including M9 glucose minimal medium, Lysogeny Broth (LB) medium, soil, rhizosphere, and drought conditions.
Stage 4: Impact of nutrient variability on biomass production.

Stage 0: Exploring our GEM

The GEM iJN1463

Microbial metabolic networks show both shared pathways and significant species-specific variations in nutrient utilization, absorption, metabolism, and biomass production [4]. For Pseudomonas sp. IsoF, the bacterium employed in our RhyzUp project, no genome-scale metabolic model (GEM) is available, and only a limited number of its metabolic pathways have been studied. Since constructing a new metabolic network without extensive experimental validation is infeasible, we opted to use a GEM from a related species, Pseudomonas putida. P. putida, and in particular the well-characterized strain KT2440, is frequently used as a reference organism in experiments involving Pseudomonas sp. IsoF, including in the laboratory where our wet lab team works.

Our FBA is based on the publicly accessible iJN1463 model of P. putida KT2440 [3]. This comprehensive metabolic reconstruction incorporates 1462 genes, 2927 reactions, and 2153 metabolites, representing a significant expansion of the reactome compared to previous models [5]. The model's robustness is further enhanced by its validation through in vivo growth screens, making it an ideal foundation for our analysis. While our experimental design and analysis are based on P. putida KT2440, the computational scripts developed for this project could be readily applied to a Pseudomonas sp. IsoF GEM, should one become available in the future. This approach allows for flexibility and potential expansion of our research as more species-specific data becomes accessible.

How to test a biofilm?

The biofilm of Pseudomonas putida KT2440 is primarily composed of the structural proteins LapA and LapF, along with the exopolysaccharides cellulose, alginate, and P. putida exopolysaccharides A (Pea) and B (Peb) [6]. There is no specific biofilm-forming reaction in the model because these components are secreted outside the cell to form the biofilm structure externally. As a result, we do not analyze the biofilm as a whole, but individual biofilm components.

Although LapA is a crucial biofilm matrix component in P. putida, its regulation involves cyclic diguanylate (c-di-GMP) binding to LapD, which modulates LapG activity and controls LapA's presence on the cell surface, rather than direct metabolic pathways [6]. The metabolic pathways for Pea and Peb exopolysaccharides in P. putida KT2440 are not yet sufficiently known to incorporate them meaningfully into our model. Future research should aim to integrate Pea and Peb into the model, including their specific production ratios, to more accurately represent the biofilm composition of P. putida KT2440.

Therefore, we focus our analysis on cellulose and alginate, two components with well-defined metabolic pathways. For cellulose there is only one exchange reaction present in the model (ID=EX_cell4_e).

Alginate has several exchange reactions in the model, so we researched which target reaction would be most useful. Alginate is an important biofilm component, particularly under water-limiting conditions [7]. Alginate biosynthesis involves distinct metabolic pathways for the production and incorporation of differentially linked and acetylated subunits. We decided to use the reaction between the activated precursor GDP-mannuronic acid, a key regulatory step in the alginate pathway, and prealginate (Reaction ID=PALGSKT) [8], rather than the exchange reactions of the individual alginate subunits. This approach allowed us to consider different alginate variants that may be produced in response to varying environmental contexts. As acetyl-CoA is a substrate for the biomass reaction, it is a limiting factor. This suggests that under nutrient-limited conditions, our model may favor alginate pathways that are less dependent on acetylation. The extent to which P. putida KT2440 relies on acetylated alginate should be tested in vivo for further refinement of the model.

Figure 5: Metabolic pathway of cellulose of P. putida KT2440. Visualization was generated using the Escher tool.

Figure 6: Metabolic pathway of alginate of P. putida KT2440. Visualization was generated using the Escher tool.

Adding c-di-GMP

Figure 7: Metabolic pathway of c-di-GMP of P. putida KT2440. Visualization was generated using the Escher tool.

Second messengers are typically not included in FBA as they are not metabolized in the traditional sense and often serve regulatory or signaling functions rather than metabolic ones. However, in our project, we aim to increase c-di-GMP production. To understand how c-di-GMP fits into our metabolic network and what metabolic costs its production would incur for the bacteria, we implemented the following reactions:

Diguanylate cyclase: 2 GTP -> 1 c-di-GMP + 2 ppi
Phosphodiesterase: 1 c-di-GMP + 1 H2O -> 1 PGPG + 1 H+
Oligoribonuclease: 1 PGPG -> 2 GMP [9]
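A small sketch of what these three reactions imply when they run at the same rate: summing their stoichiometries gives the net conversion of one full synthesis and degradation cycle. The dictionary layout and the function name are illustrative choices, and the coefficients are copied from the list above.

```python
from collections import Counter

# Stoichiometry as listed above: negative = consumed, positive = produced.
reactions = {
    "diguanylate_cyclase": {"GTP": -2, "c-di-GMP": 1, "ppi": 2},
    "phosphodiesterase": {"c-di-GMP": -1, "H2O": -1, "PGPG": 1, "H+": 1},
    "oligoribonuclease": {"PGPG": -1, "GMP": 2},
}


def net_reaction(reactions, fluxes):
    """Sum the stoichiometries, each weighted by the flux of its reaction."""
    net = Counter()
    for name, stoichiometry in reactions.items():
        for metabolite, coefficient in stoichiometry.items():
            net[metabolite] += coefficient * fluxes[name]
    # Intermediates that cancel out are dropped from the result.
    return {m: c for m, c in net.items() if c != 0}


# All three reactions carrying one unit of flux, i.e. one full cycle.
net = net_reaction(reactions, {name: 1 for name in reactions})
print(net)
```

The intermediates c-di-GMP and PGPG cancel, so one turn of the cycle converts 2 GTP into 2 GMP and 2 ppi. Under the steady-state assumption, any c-di-GMP turnover therefore consumes GTP even though the c-di-GMP pool itself stays constant.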

Stage 1: Cost Analysis of Biofilm Components on Growth Rate

The first experimental step with our model was to evaluate the cost of biofilm production by assessing its impact on the growth rate of Pseudomonas putida KT2440. We used the model with the preset conditions it has when downloaded from BiGG Models [3]. We analyzed the biomass reaction (ID = BIOMASS_KT2440_WT3) and focused on two key biofilm components: cellulose and alginate (details in Stage 0). When the biomass reaction is optimized under the given conditions, no biofilm components are produced. Therefore, we set the lower bounds of the cellulose and alginate reactions to 0 and then increased them linearly, thereby forcing the bacteria in our model to secrete these biofilm components. For each increment, the model was reoptimized to maximize biomass production and the resulting flux value was saved.

Dependence of Biomass Production on Cellulose Extrusion

Figure 8: Cellulose production has an impact on biomass and the relationship is linear.

Dependence of Biomass Production on Alginate Extrusion

Figure 9: Alginate production has an impact on biomass and the relationship is linear.

Within the biologically significant range, from 0 mmol/gDW/h to 1.37 mmol/gDW/h for cellulose extrusion and up to 1.26 mmol/gDW/h for alginate extrusion, the relationship is linear. Biomass production should never be negative, as a decrease in biomass would indicate cell death, so values below 0 were not considered. The linearity may be caused by the simplicity of the available nutrients, as glucose is the main carbon source in the preset medium. In Stage 3, we repeated the experiments under different, more complex nutrient conditions (see Stage 3 for details). This first analysis shows that biomass production decreases in direct proportion to the forced biofilm production, indicating that producing biofilm components is quite costly for the bacteria. The linear relationship suggests a trade-off, as glucose is either used for biomass growth or diverted to produce biofilm components like cellulose and alginate.
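The scanning procedure and the resulting linear trade-off can be sketched with a deliberately simple one-substrate toy balance. This is not the iJN1463 model: the glucose uptake, the glucose cost per unit of product and the biomass yield are hypothetical numbers, and the function `max_biomass` stands in for a full re-optimization of the model at each step.

```python
def max_biomass(forced_product_flux, glucose_uptake=6.0,
                glucose_per_product=1.0, biomass_yield=0.1):
    """Toy optimum: all glucose not diverted to the product goes to biomass."""
    remaining = glucose_uptake - glucose_per_product * forced_product_flux
    return biomass_yield * remaining


def scan(max_flux, steps):
    """Raise the forced product flux linearly and record the optimum each time."""
    results = []
    for k in range(steps + 1):
        product_flux = max_flux * k / steps
        biomass = max_biomass(product_flux)
        if biomass < 0:
            break  # negative growth is not biologically meaningful
        results.append((product_flux, biomass))
    return results


results = scan(max_flux=8.0, steps=8)
# Slope between consecutive scan points; constant if the trade-off is linear.
slopes = [(b2 - b1) / (p2 - p1)
          for (p1, b1), (p2, b2) in zip(results, results[1:])]
print(results)
print(slopes)
```

With a single carbon source every unit of forced product removes the same amount of substrate from biomass synthesis, so the slope is constant. In a richer medium the cell can switch between substrates, and the slope no longer has to be constant.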

Dependence of Biomass Production on c-di-GMP Production

Figure 10: Cyclic di-GMP is an important second messenger involved in biofilm production. The production can be expensive, but usually c-di-GMP is not produced in large enough quantities to impact the biomass directly.

In the same context, we conducted this experiment for c-di-GMP and observed that its production decreased biomass in a linear manner, similar to cellulose and alginate. This is expected, as GTP, an energy molecule, is consumed to produce c-di-GMP. However, the impact of c-di-GMP on biomass is less pronounced than that of cellulose and alginate (the decline is less steep). It is important to note that, as a second messenger, c-di-GMP is produced in much lower quantities, typically in the nM to low µM range [10], so it is not a major factor in limiting biomass.

Stage 2: Gene Assays

In a collaborative project between wet and dry lab, one of the key responsibilities of the dry lab is to identify areas where the wet lab's processes can be optimized or advanced, as well as finding promising experimental approaches. For stage 2, we performed a gene assay to evaluate whether the metabolism of P. putida KT2440 could potentially be improved by knocking out genes, thus freeing up energy for biofilm production.

In the first step, we knocked out each gene individually and identified those whose deletion did not result in immediate death of the cell. We used this approach to reduce the number of genes to analyze, thereby minimizing computational time as much as possible. This resulted in a list of 1138 genes that are likely nonessential for the survival of P. putida KT2440.

knockout gene id	cellulose flux
PP_1612	7.77905e-18
PP_0338	-5.25294e-17
PP_5056	-6.16302e-18
PP_4193	-1.28124e-16
PP_4186	8.704747e-17

Table 1: Genes whose knockout produced a non-zero change in cellulose flux.

Next, we recorded the reference value for cellulose production (reaction ID: EX_cell4_e) and systematically knocked out each gene to analyze changes in flux relative to the reference. Five genes showed a non-zero change in cellulose production while the biomass flux remained unchanged. These five genes were then knocked out in every possible combination to see whether positive effects would be enhanced and whether significant biomass production was still maintained. Unfortunately, the effects we saw in the single-gene knockouts were very small (between -1.28e-16 mmol/gDW/h and 8.70e-17 mmol/gDW/h), on the order of floating-point rounding error, and did not significantly improve cellulose production.
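The two steps described above, judging whether a change is meaningful and enumerating all knockout combinations, can be sketched as follows. The flux changes are the values from Table 1. The tolerance of 1e-9 is an assumed threshold for illustration only.

```python
from itertools import combinations

# Single-knockout changes in cellulose flux from Table 1 (mmol/gDW/h).
flux_change = {
    "PP_1612": 7.77905e-18,
    "PP_0338": -5.25294e-17,
    "PP_5056": -6.16302e-18,
    "PP_4193": -1.28124e-16,
    "PP_4186": 8.704747e-17,
}

TOLERANCE = 1e-9  # assumed threshold for illustration

# Changes smaller than the tolerance are treated as numerical noise.
significant = {gene: delta for gene, delta in flux_change.items()
               if abs(delta) > TOLERANCE}

# Every non-empty combination of the five candidate genes.
genes = sorted(flux_change)
knockout_sets = [combo
                 for size in range(1, len(genes) + 1)
                 for combo in combinations(genes, size)]
print(len(significant), len(knockout_sets))
```

Five genes give 2^5 - 1 = 31 non-empty knockout combinations, and with this tolerance none of the single-knockout changes counts as significant.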

We conclude from the results that the metabolism of P. putida KT2440 is highly efficient and cannot be improved by simply knocking out a part of it. For future investigation, we would suggest adding knock-ins to the metabolic network. We revisited these experiments in different media during stage 3 to see if knockouts could improve the efficiency in more complex environments.

Stage 3: Experiments in Soil Conditions

Early on in the process of defining the dry lab project, we recognized the value of simulating soil conditions to bridge the gap between the wet lab experiments and real-world applications. Given that field trials with genetically engineered bacteria are prohibited in Switzerland, simulating realistic conditions became the only way to obtain "field data".

Our dry lab team simulated various application-oriented conditions and compared the performance of our organism as biofilm production was increased under each setting. We focused on the following conditions:

M9 glucose minimal medium: A simple medium with glucose as the only carbon source. We used the in silico M9 glucose minimal medium defined by J. Nogales et al. [5], but additionally removed CO2 and bicarbonate as available nutrients to ensure that glucose is the only carbon source the bacteria could metabolize. This single-carbon-source approach simplifies the environment, facilitating clearer insights into metabolic behavior and growth effects.
Lysogeny Broth (LB) medium: A nutrient-rich medium commonly used for growing bacteria. This was the medium used by the wet lab and therefore simulates our lab conditions. We used the in silico LB medium defined by J. Nogales et al. [5].
Soil medium: Simulates bulk soil conditions. The in silico soil is modeled after the Soil Defined Medium (SDM2) defined by S. Jenkins et al. [11]. We estimated the conversion between mM and mmol/gDW/h by comparing the nutrient composition of LB medium with the definition of the in silico LB medium in the original paper describing the model iJN1463 [5]. This condition simulates the environment where the bacteria are present in the soil but not in close proximity to plant roots. To ensure the survival of our in silico organism, meaning a non-zero biomass production flux, we had to add cysteine, nickel, and selenium. These three nutrients are not included in SDM2, but are present in the M9 glucose minimal and LB media.
Rhizosphere medium: Simulates conditions close to plant roots. We adapted the soil conditions to reflect the changes in nutrients measured in close proximity to roots, where carbon levels are higher, organic acids from plant exudates are present, nitrogen and potassium availability is increased, and phosphorus levels are reduced [12, 13].
Drought medium: Simulates conditions close to roots but under drought stress. S.M. Geng et al. estimated severe drought to be at about 35% relative soil water content [14], and L. Deng et al. measured changes of soil components under drought stress [15]. We used their data to model the rhizosphere under drought conditions.

We wrote a script that reads in these conditions and automatically changes the lower bounds of the exchange reactions (link to script), which makes adjusting the model to other conditions easier.
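The idea behind such a script can be sketched as below. This is an illustrative outline rather than the script from the repository: the CSV layout, the column names and the uptake values are hypothetical placeholders, and the exchange reaction IDs merely follow the BiGG naming style.

```python
import csv
import io

# Hypothetical medium definition; the uptake limits are placeholder values.
MEDIUM_CSV = """exchange_id,max_uptake
EX_glc__D_e,6.0
EX_o2_e,20.0
EX_nh4_e,10.0
"""


def read_lower_bounds(handle):
    """Map each exchange reaction to the lower bound that allows its uptake.

    In BiGG-style models uptake is a negative flux through the exchange
    reaction, so a maximum uptake of u becomes a lower bound of -u.
    """
    bounds = {}
    for row in csv.DictReader(handle):
        uptake = float(row["max_uptake"])
        if uptake < 0:
            raise ValueError(f"negative uptake for {row['exchange_id']}")
        bounds[row["exchange_id"]] = -uptake
    return bounds


lower_bounds = read_lower_bounds(io.StringIO(MEDIUM_CSV))
print(lower_bounds)
```

In COBRApy, each value could then be assigned to `model.reactions.get_by_id(exchange_id).lower_bound`. Exchange reactions that are absent from the medium would need their lower bound set to 0 to block uptake.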

For each of these new models, we ran adapted versions of the previous in silico experiments (see stage 1 and stage 2). The six experiments include:

Cellulose cost analysis
Alginate cost analysis
Combined cellulose and alginate cost analysis
Gene assay to identify non-essential genes
Gene assay to identify genes to improve cellulose production
Gene assay to identify genes to improve alginate production

The following section focuses on the combined cost analysis and gene assays for improved cellulose and alginate production. All data, graphs and scripts used in the analysis are available on our repository.

Figure 11: Cellulose and alginate production have a rather linear impact on biomass in M9 glucose minimal medium.

Figure 12: Cellulose and alginate production have a nonlinear effect on biomass production in LB medium. This might be due to the complexity of the medium.

As expected from the results of stage 1 and the similarities between the preset of the iJN1463 model and the M9 glucose minimal medium, the relationship between alginate, cellulose and biomass is highly linear. This outcome is not surprising, as glucose serves as the sole carbon source for all three reactions and there is no metabolic flexibility in this simple medium. When the organism is forced to produce cellulose and/or alginate, the substrate is no longer available to produce biomass.

In the more nutrient-rich and complex LB medium, we observe a significant difference. Biomass yield is higher, and more biofilm components can be produced. Biomass production is about three times higher than in glucose minimal medium when no biofilm is produced. Cellulose or alginate production can also be pushed to values about three times higher than in the glucose minimal medium condition.
The relationship between cellulose, alginate and biomass in LB medium is not linear. The production rates of pre-alginate and cellulose can each be increased up to 1.03 mmol/gDW/h. Any further increase raises the biomass cost per unit of biofilm component synthesized. At this threshold, biomass production is reduced to 1.10 mmol/gDW/h, 66% of its maximum.

Cost of Biofilm Components in Soil Medium

Figure 13: Cellulose and alginate production have a rather linear impact on biomass in soil.

Cost of Biofilm Components in Rhizosphere Medium

Figure 14: Cellulose and alginate production have a rather linear impact on biomass in the rhizosphere.

Cost of Biofilm Components in Drought Medium

Figure 15: Cellulose and alginate production have a rather linear impact on biomass under drought conditions.

The graphs for soil, rhizosphere and drought conditions look very similar. They show a largely linear relationship with biomass production peaking at approximately 26 µmol/gDW/h. All conditions also show comparable levels of biofilm production before growth drops below 0. From these results, we conclude that the changes between the different conditions do not have a great effect on our modified bacteria. Although this model does not account for changes in bacterial social behavior and other complex changes between these conditions, the results suggest that P. putida KT2440 is metabolically resilient to environmental stress, making it a strong candidate for combating drought and enhancing plant growth.

Interestingly, when optimizing the biomass function, we observed that P. putida KT2440 does not take up water in any of the conditions, which within the model is consistent with its resilience to drought. This observation is based on the analysis of metabolites taken up and secreted under varying conditions (Link to repository - see FBA_different_conditions.xlsx). This may seem surprising, but it aligns with the assumptions and constraints established within our model. Additionally, we found that amino acids serve as the sole nitrogen sources for the bacteria, highlighting a strong dependence on them, as they also play a major role in carbon flux.

To better visualize the scale in differences between maximal biomass production capacity in each environment we combined the cellulose production data in one graph.

Impact of Cellulose Production on Biomass in Different Conditions

Figure 16: The production of biofilm and biomass is highly influenced by environmental conditions. In nutrient-rich, complex environments, large quantities of both are produced, whereas in simpler media, their production is significantly reduced.

In the gene assays, we found no improvements for cellulose or alginate by knocking out single genes in either the M9 glucose minimal or LB medium. Interestingly, we found multiple genes improving cellulose production in each of the assays for soil, rhizosphere, and drought conditions without negatively impacting biomass production.

Soil
# non-essential genes	1266
# cellulose improving genes	8
# alginate improving genes	0

knockout gene id	cellulose flux (mmol/gDW/h)
PP_5289	0.309
PP_4481	0.309
PP_5128	0.919
PP_1986	0.982
PP_1985	0.982
PP_4678	0.919
PP_5155	1.103
PP_4909	0.435

Table 1: Promising genes for cellulose production in soil condition.

Rhizosphere
# non-essential genes	1266
# cellulose improving genes	7
# alginate improving genes	0

knockout gene id	cellulose flux (mmol/gDW/h)
PP_5289	0.558
PP_4481	0.776
PP_1986	1.080
PP_1985	1.080
PP_1025	1.080
PP_4678	0.341
PP_5155	0.532

Table 2: Promising genes for cellulose production in rhizosphere condition.

Drought
# non-essential genes	1266
# cellulose improving genes	5
# alginate improving genes	0

knockout gene id	cellulose flux (mmol/gDW/h)
PP_5128	0.965
PP_1988	1.035
PP_1986	1.035
PP_1985	1.035
PP_5155	0.404
PP_4909	0.435

Table 3: Promising genes for cellulose production in drought condition.

Genes that have a positive impact in all soil conditions:
PP_1986	3-isopropylmalate dehydratase small subunit
PP_1985	3-isopropylmalate dehydratase large subunit
PP_5155	D-3-phosphoglycerate dehydrogenase

Table 4: Genes to be considered in future research

There are three genes that have a positive effect on cellulose production in all soil conditions when knocked out and are therefore promising targets for the wet lab to investigate. The first two, PP_1985 and PP_1986, encode the two subunits of the same dehydratase, which catalyzes the isomerisation of 2-isopropylmalate to 3-isopropylmalate, so it makes sense to knock them out together [16]. The third, PP_5155, encodes D-3-phosphoglycerate dehydrogenase.
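The shared candidates can be recovered directly from the three per-condition tables with a set intersection. The gene IDs below are copied from the soil, rhizosphere and drought tables above, and the variable names are illustrative.

```python
# Cellulose-improving knockouts per condition, as listed in the tables above.
soil = {"PP_5289", "PP_4481", "PP_5128", "PP_1986",
        "PP_1985", "PP_4678", "PP_5155", "PP_4909"}
rhizosphere = {"PP_5289", "PP_4481", "PP_1986", "PP_1985",
               "PP_1025", "PP_4678", "PP_5155"}
drought = {"PP_5128", "PP_1988", "PP_1986", "PP_1985",
           "PP_5155", "PP_4909"}

# Genes whose knockout helps in every one of the three soil-like conditions.
shared = soil & rhizosphere & drought
print(sorted(shared))
```

The intersection contains PP_1985, PP_1986 and PP_5155, matching the genes listed in Table 4.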

Stage 4: Impact of nutrient variability on biomass production

We evaluated the model's sensitivity to nutrient variations to assess whether supplementation would benefit field applications. For the initial analysis, we selected phosphorus and potassium because they are common in fertilizers, as well as calcium and magnesium because of the significant flux patterns observed together with the water uptake in Stage 3.

Effect of Phosphorus Availability on Biomass

Figure 17: Phosphorus saturated very quickly.

Effect of Potassium Availability on Biomass

Figure 18: Potassium saturated very quickly.

Effect of Magnesium Availability on Biomass

Figure 19: Magnesium saturated very quickly.

Effect of Calcium Availability on Biomass

Figure 20: Calcium saturated very quickly.

Although nitrogen is also a key fertilizer component, Stage 3 revealed that the bacteria expel large amounts of ammonium (NH₄⁺) and prefer amino acids as their nitrogen source, so we did not scan it here. For each of the four nutrients, we tested uptake bounds from 0 to 1000 µmol/gDW/h. Each nutrient quickly reached saturation, providing minimal growth benefit beyond that point. Therefore, supplementing the soil with these nutrients is likely to be unproductive for the growth of P. putida KT2440.
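One simple way to read a saturation point off such a scan is to look for the smallest uptake bound at which biomass comes within a small tolerance of its maximum. The sketch below uses a hypothetical scan with normalized biomass values, not our results, and the 1% tolerance is an assumption.

```python
def saturation_point(scan, rel_tol=0.01):
    """Return the smallest uptake whose biomass is within rel_tol of the maximum."""
    max_biomass = max(biomass for _, biomass in scan)
    for uptake, biomass in sorted(scan):
        if biomass >= (1 - rel_tol) * max_biomass:
            return uptake
    return None


# Hypothetical scan: (uptake bound in µmol/gDW/h, normalized optimal biomass).
scan = [(0, 0.0), (5, 0.4), (10, 0.8), (15, 1.0), (50, 1.0), (1000, 1.0)]
print(saturation_point(scan))
```

For this placeholder data the function returns 15: beyond that bound, allowing more uptake no longer increases the optimal biomass.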

A clear correlation between the limitation of amino acid and polysaccharide uptake and biomass production is evident.

Effect of Phenylalanine Availability on Biomass

Figure 21: Phenylalanine, as an amino acid, improves biomass production over a large range.

Effect of Tryptophan Availability on Biomass

Figure 22: Tryptophan, as an amino acid, improves biomass production over a large range.

Effect of Spermidine Availability on Biomass

Figure 23: Spermidine, a polyamine, enhances biomass production across a wide range of conditions.

While supplementing these nutrients could be beneficial, their rapid metabolization by rhizosphere bacteria would require frequent applications.
Instead, we recommend applying a highly concentrated version of our product as close to the roots as possible to enhance the likelihood of colonization, rather than relying on extensive nutrient supplementation.

Assumptions and Limitations

As it is impossible to perfectly replicate real-world conditions in a model, we made assumptions to approximate the system.
First, we adopted the standard assumptions used in FBA (see What is Flux Balance Analysis). These include the steady-state assumption necessary to solve the linear programming problem in FBA, and the implicit assumption that the metabolic network underlying the model is comprehensive enough to accurately represent our organism of interest, Pseudomonas sp. IsoF.

As mentioned in Stage 0, there is no available model for Pseudomonas sp. IsoF, so the dry lab worked with a model of the P. putida KT2440 strain, namely iJN1463. Furthermore, we assumed that the organism tries to maximize its growth, and therefore we chose biomass production (ID = BIOMASS_KT2440_WT3) as the objective function.

Due to the heterogeneous nature of soil and rhizosphere environments and the resulting data complexity, we assumed that drought has the same effects in both the rhizosphere and bulk soil. Furthermore, given the complex structure of biofilms and their dependence on environmental factors, we focused only on the two main components: cellulose and alginate. Additionally, our simulation of a single-cell metabolic network does not include the effects that bacterial community interactions can have.

Soil composition varies widely with land use and location, so a single in silico soil cannot represent every soil. As a necessary simplification, we nevertheless used one soil definition in our experiments.

Discussion

Throughout the modeling of RhyzUp, we made several key observations, which we would like to highlight and summarize in this section. Utilizing FBA to assess the cost of hyper-robust biofilm production proved to be a valuable approach (See Aim).

In the first stage of our project, we demonstrated that biofilm production is metabolically costly for bacteria, directly reducing biomass production. Given the trade-off between biofilm formation and biomass growth, symbiotic relationships with plants must provide substantial benefits to be advantageous for bacteria. In the case of P. putida KT2440, a rhizobacterium, these benefits clearly outweigh the associated metabolic costs. Therefore, the wet lab's focus on producing a hyper-robust biofilm only when the bacteria are near the plant roots, where it has the strongest benefit for plants, is well-founded.

In the second stage, we observed that P. putida KT2440 exhibits high metabolic efficiency even in simple media with limited carbon sources. Consequently, knocking out individual genes does not enhance cellulose or alginate production. To achieve a more efficient metabolism, new genes would need to be introduced to enable additional pathways.

As we shifted our focus to more complex media, particularly in the third stage, we demonstrated that both the cost and maximum quantity of biofilm production are highly dependent on the nutritional complexity of the environment. We observed significant differences between simple and complex media, such as M9 glucose minimal medium and LB medium, as well as between laboratory and natural conditions. The nutritionally rich LB medium allowed the bacteria to produce the most biomass and biofilm components. The evaluated range of up to 1.03 mmol/gDW/h cellulose and alginate production can assist the wet lab in fine-tuning gene expression strength, allowing them to take advantage of the reduced metabolic cost of biofilm production within that range.

Conversely, there are only minor variations among different soil conditions. The biomass production and the costs associated with increased biofilm are comparable across bulk soil, rhizosphere, and drought conditions, suggesting that P. putida KT2440 is well-suited to withstand drought while enhancing plant resilience. Given that the results indicate improved performance in nutritionally richer environments, we anticipate that the bacteria will naturally migrate towards the roots, benefiting the wet lab's construct by remaining concentrated in close proximity to them. In the gene assay conducted under more complex conditions, we successfully identified a list of genes that the wet lab could further investigate to enhance cellulose production in soil conditions.

Finally, in the last stage, we observed that polysaccharides and amino acids significantly affected biomass production, while other essential nutrients did not enhance growth beyond a low saturation threshold. Consequently, we concluded that active supplementation would be inefficient, as key nutrients like sugars and amino acids are metabolized quickly and would require frequent reapplication, increasing the workload for farmers.

In conclusion, our model yielded several ideas that the wet lab could implement and that follow-up studies could investigate. Furthermore, we laid a path for future teams working with soil simulations by showing that soil conditions can be modeled and how defining and implementing them can be approached. Our results also support the wet lab’s design decisions, indicating that our construct rests on a solid scientific foundation.

Outlook

To improve on the limitations outlined earlier (See Limitations), future research should prioritize the development of a GEM for Pseudomonas sp. IsoF and focus on collecting more experimental data under defined conditions, including soil, rhizosphere, drought, and potentially biofilm environments. Given the inherent heterogeneity of soil ecosystems, we recommend repeating the experiments from stages 3 and 4 using different in silico soil models. It would be particularly valuable to compare results from soils associated with diverse land-use types, such as arable land, settlements, forests, meadows, and alpine grasslands. Our SDM2 model provides a solid foundation for this work, and its flexible design means that incorporating new soil types would primarily require access to detailed soil composition data for each land use.

Additionally, to better assess the competitiveness of Pseudomonas sp. IsoF, we propose conducting competition tests using FBA on common rhizosphere bacteria. By comparing the growth rates of different species under the same conditions applied to P. putida KT2440 (See Stage 3 for soil design details), we can use growth rate as a proxy for competitive ability. Faster-growing bacteria typically have an advantage when competing for limited resources like nutrients and space, reflecting greater efficiency in resource utilization.

Our preliminary search has identified several potential models for inclusion in this analysis, such as iCN718 (Acinetobacter baumannii AYE)17, iYO844 (Bacillus subtilis subsp. subtilis strain 168)18, and iJYQ746 (Bacillus amyloliquefaciens)19.

However, it is important to note that this estimation may be conservative, as Pseudomonas sp. IsoF possesses a distinct competitive advantage due to its Type IVB Secretion System (T4BSS). This system allows P. sp. IsoF to kill a wide range of Gram-negative bacterial competitors by delivering toxic effectors in a contact-dependent manner20.

Looking ahead, integrating these competition experiments into a Pseudomonas sp. IsoF model, along with varying soil types, will enhance our understanding of microbial dynamics and further clarify the role of Pseudomonas sp. IsoF in agricultural ecosystems, particularly in biofilm production.

Update

Between the wiki freeze and the Grand Jamboree, we continued working on our models and achieved additional results, which we present in this section.

We focused on the competition test described in the Outlook section. As a proof of concept, we selected GEMs of two soil bacteria, one Gram-positive and one Gram-negative, that our bacterium would likely compete with in field applications. For the Gram-positive bacterium, we used Bacillus subtilis str. 168 and its iYO844 GEM18, and for the Gram-negative bacterium, we chose Xanthomonas phaseoli pv. manihotis and its iXpm1556 GEM21. We adapted both models to the soil conditions established in Stage 3 and analyzed their biomass production in that environment. Assuming that growth rate is a reasonable proxy for competitiveness (a bacterium gains a competitive advantage by outgrowing others), we compared their biomass production levels to those of our pre-established P. putida KT2440 model.

Dependence of Biomass Production on Cellulose Excretion in the Rhizosphere

Figure 24: P. putida KT2440 shows a metabolic competitive advantage over two other soil bacteria, even when producing the biofilm component cellulose.

As shown in Figure 24, the P. putida KT2440 model outperforms both competitors and can be induced to produce cellulose up to a certain flux before the cost becomes too high and its growth rate falls below that of the others. This result is very promising, but it requires further testing and validation by the wet lab.
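The comparison behind Figure 24 amounts to finding the cellulose flux at which the growth rate of P. putida KT2440 drops to that of a competitor. The following minimal sketch shows one way to estimate that break-even point by linear interpolation between sampled points. The function name and all numbers are hypothetical placeholders, not our simulation results. The sketch also assumes that the sampled growth rate does not increase as the cellulose flux increases.

```python
def break_even_flux(curve, competitor_growth):
    """Estimate the cellulose flux at which growth falls to a competitor's level.

    curve             -- list of (cellulose_flux, growth_rate) pairs, sorted by
                         increasing flux, with non-increasing growth rate
    competitor_growth -- growth rate of the competitor under the same conditions

    Returns the interpolated flux, the first sampled flux if the competitor is
    already at least as fast there, or None if growth never drops that low
    within the sampled range.
    """
    if curve[0][1] <= competitor_growth:
        return curve[0][0]
    for (x0, y0), (x1, y1) in zip(curve, curve[1:]):
        if y0 >= competitor_growth >= y1 and y0 != y1:
            # Linear interpolation between the two neighbouring samples.
            return x0 + (y0 - competitor_growth) * (x1 - x0) / (y0 - y1)
    return None


if __name__ == "__main__":
    # Hypothetical growth curve: (cellulose flux in mmol/gDW/h, growth rate in 1/h).
    putida_curve = [(0.0, 0.8), (0.5, 0.6), (1.0, 0.4), (1.5, 0.2)]
    print(break_even_flux(putida_curve, competitor_growth=0.5))
```

Applied to each competitor in turn, the smallest break-even flux gives an upper limit on how strongly cellulose production could be induced before the growth advantage is lost, under the growth-rate-as-proxy assumption stated above.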