Part 1: Protein Engineering Module: Molecular dynamics simulation of TFD and rare earth coordination
1.1 Overview
TFD-EE is a C2-symmetric dimer protein capable of adsorbing rare earth ions, with a total of eight β-sheet in the barrel, and four C4-symmetric glutamates located on four separate β-sheet (Figure 1). Lanthanide trivalent cations are usually oxyphilic and can form coordination bonds with two oxygen atoms on the side chain of glutamate residues. In the presence of eight ligand oxygen atoms, lanthanide ions can be tightly bound to the protein cavity.
To test for protein binding, Shane J. Caldwell et al.[1] introduced a tryptophan (N6W mutation) as a sensitizer antenna effect to detect Tb(III) luminescence [2]. That is, when the protein binds Tb(III), the excitation wavelength of 280 nm is given, and the indole ring on tryptophan can absorb energy and transfer it to Tb(III), thus making Tb(III) emit characteristic fluorescence, and the adsorption of Tb(III) can be quantitatively detected according to the fluorescence intensity.
To facilitate subsequent experiments, we fused this dimer protein into a single-stranded TIM-FD-Single (TFD-S) and gave the sequence to AlphaFold3 for prediction, finding that its spatial structure was almost identical to that of the TFD-EE protein. The characterization experiments showed that its performance was similar to that of TFD-EE dimer. Subsequently, site-specific mutation will be carried out on the basis of TFD-S.
Figure 1. Combination of TFD-EE and Tb(III)
Figure 2. Combination of TFD-S and Tb(III)
We hope to further modify the protein to enhance its ability to trap rare earth ions. There are two main ideas: one is to enhance the binding ability of the protein to a single rare earth ion, search for mutation sites near the original four glutamic acids, and introduce new glutamic acid, aspartic acid or asparagine residues to increase oxygen coordination. The second is to increase the number of rare earth ions that a single protein can capture, look for four sites on the same β-sheet that are similar to the original four binding sites, and mutate them into glutamate or aspartic acid.
After mutating the amino acid sequence on PyMol, we submitted the results to AlphaFold3 for prediction, and finally obtained eight possible mutants, all of which have the mutated amino acid side chains towards the internal cavity of TFD. In order to exclude mutants with reduced rare earth coordination ability, we decided to first simulate the binding of proteins and rare earth ions with molecular dynamics simulation (MD Simulation).
1.2 Molecular dynamics simulation
The crystal structure of TFD-S is predicted by AlplaFold3, which is a TFD
single-chain protein with a terbium ion. MD simulation was performed using CHARMM36 protein
Force Field 45 in the GROMACS2022.3 package 46. The force constant of the harmonic bias
potential is 1000 kJ∙mol-1∙nm-2. Set the 10 Å cuboid box where the edge is widest from the
protein, use the TIP3P water model in it, and add sodium ions to the solution to maintain
the neutrality of the system.
Eleven TFD-S-X(III) ( X(III) means Tb(III) or La(III) or Lu(III) ) systems
were simulated at 300 K. To begin with, the system is minimized by 50,000 steps using
the steepest descent algorithm. In the pre-balancing phase, the system is gradually heated
to 300 K at 1 atm using a 200ps NVT set and 200ps NPT set. A 50ns production run followed
to collect the balanced configuration for each 2ps interval. A speed scaling thermostat
49 with a time constant equal to 0.1ps was used to keep the temperature constant throughout
the simulation. To maintain the pressure, the Berendsen pressure coupler was used in the
NPT pre-balance run and the Parrinello-Rahman pressure coupler was used in the production
run, with the pressure time constant and isothermal compression rate set at 2ps and 4.5×10-5
bar-1, respectively. In the whole simulation process, the integral time step of the motion
equation is 2fs. The cut-off value for the non-bonding interaction is 12Å. The particle grid
Ewald algorithm 53 is used to calculate long range electrostatic interactions. [2]
Finally, the data results of 12 systems are obtained. The results showed that the average distance between the 8 active oxygen atoms of TFD-S and terbium ions was 2.41 Å (Figure 3). In the coordination structure, the distance between the ligand and the central atom can show the size of the binding force, but if the distance is too far, it is difficult to effectively bind. Among all the protein mutants, only the original four mutant TIM-FD-Change (TFD-C) with isoleucine mutation to glutamate at positions 6, 52, 175, and 221 was similar to 2.41 Å (Figure 5), so we believed that this mutation had the greatest possibility to achieve our goal.
In addition, in order to detect the binding of rare earth ions
to proteins, glutamine at site 154 and site 323 of TFD-C was mutated into
tryptophan, and TIM-FD-Mutation (TFD-M) was obtained, whose sequence is
shown below (Figure 4) [3]. The final MD results of Tb(III), La(III) and
Lu(III) ions are shown in the figure below (Figure 5).
Figure 3. Schematic diagram of binding of eight active oxygen atoms to terbium ions in TFD-S. The Tb(III) ion is shown as green sphere, and coordination bonds are shown as dashed lines. The numbers near the coordination bonds represent the distances between the atoms, with the unit of Å.
Figure 4. Comparison of four protein sequences
Figure 5. Schematic diagram of TFD-M combined with Tb(III), La(III), Lu(III), molecular dynamics simulation process, and kinetic curve. The ordinate of the RMSD curve is the root-mean-square deviation, indicating the distance the system moves.
1.3 Results
The molecular dynamics simulation results based on TFD-M, La(III),
Tb(III) and Lu(III) are satisfactory, and we believe that the probability of
TFD-M adsorbing rare earth ions is high. In addition, in order to test whether
the new site can adsorb rare earth ions, the original binding site of TFD-M
was mutated back to the original TFD state, that is, E35V/E158K/E204V/E327K
mutation was introduced into TFD-M. The resulting new mutant is called
TIM-FD-Reduction (TFD-R) (Figure 6). After adding terbium ions to the pdb
file, similar PyMol and molecular dynamics simulations were carried out, and
the final results were shown in Figure 7. Subsequently, we will conduct wet
experimental characterization of two mutants, TFD-M and TFD-R, as detailed
in our Results page.
Figure 6. Combination of TFD-R and Tb(Ⅲ)
Figure 7. Molecular dynamics simulation process of TFD-R combined with Tb(Ⅲ), and kinetic curve. The ordinate of the RMSD curve is the root-mean-square deviation, indicating the distance the system moves.
Part 2: Yeast Surface Display Module: Lanthanide adsorption kinetics and adsorption equilibrium modeling
2.1 Overview
After the lanthanide metal binding protein TFD was displayed on the surface of yeast cells to construct a whole-cell biosorbent, we focused on the mathematical modeling of the adsorption process of lanthanide metal ions by engineered yeast, including adsorption kinetics and adsorption equilibrium. We hoped to help us understand the dynamic process of rare earth adsorption through mathematical fitting modeling, evaluate the adsorption capacity of engineered yeast to lanthanide metal ions, and predict the optimal cycle of biosorption for feasibility verification and subsequent design.
2.2 Kinetic modeling of lanthanide adsorption
In the kinetic part of the adsorption of lanthanide metal ions,
we tried to use the pseudo-first-order and pseudo-second-order formulas
to fit the kinetic experimental data:\(\frac{dq}{dt} = k_1(q_e - q_t) \) and
\(\frac{dq}{dt} = k_2(q_e - q_t)^2\),( \(q\) and \(q_t\) are the adsorption signal strength,
\(q_e\) is the equilibrium adsorption signal strength), [5] these two kinetic models are
widely used to describe the experimental data of biomass adsorption of heavy metals.
[6] In the “kinetic” range before the adsorption reaches saturation, we refer to
the adsorption data of Tb3+ and Yb3+ by engineering strains expressing Lanthanide
Binding Tag (LBT) [7] or lanmodulin (LanM) [5] in the previous papers. We evaluated
the adsorption of lanthanide metal ions by the engineered strain similarly.
The adsorption of lanthanide metal ions is more in line with the pseudo-first-order
kinetic model, and the corresponding correlation coefficients (R2) value is higher.
2.3 Modeling of lanthanide adsorption equilibrium
In the equilibrium stage, we refer to the current mainstream equilibrium adsorption models: Langmuir and Freundlich isothermal models [8]. The Langmuir isotherm is an empirical model. It is assumed that the adsorption process occurs at a specific position on the surface of the adsorbent, resulting in a single-molecule adsorption layer. Different from Langmuir isotherm model, Freundlich model is not limited to monolayers, extending its application to the multilayer, reversible, and non-ideal adsorption processes.
The Langmuir isothermal equation is
\(q_e = \frac{K_LC_e}{1+a_LC_e}\),
( \(q_e\) is the concentration of lanthanide metal
ions in the equilibrium state of stem cells, \(C_e\) is the concentration of lanthanide
metal ions in the solution system, \(K_L\) and \(a_L\) are Langmuir isothermal constants).
The equilibrium saturated adsorption capacity of the engineering strain is
determined by fixing the amount of lanthanide metal ions added, and the
experimental data points are substituted into the linear expression
\(\frac{C_e}{q_e} = \frac{1}{K_L} + \frac{a_L}{K_L}C_e \)
. After
fitting the experimental data, the theoretical monolayer saturated capacity \(q_{max}\)(\(\frac{K_F}{a_L}\))
can be obtained.
The Freundlich isothermal equation is \(q_e = K_LC_e^{ \frac{1}{n} }\), (\(K_F\) is the Freundlich isothermal constant).
The same way is substituted into the logarithmic expression
\(lnq_e = lnK_F + \frac{1}{n}lnC_e\)
for fitting [5].
According to the R2 results of the relevant literature data, the engineered yeast displayed on the surface has a high degree of fitting with the Langmuir model, that is, the lanthanide metal ions form a monolayer adsorption on the surface of the engineered strain. In theory, the engineered yeast TFD protein is more evenly distributed on the cell surface after the surface display modification. Each protein binds to a lanthanide metal ion, and the interaction between each other is weak due to the low expression level, which conforms to the applicable conditions of the Langmuir model.
2.4 Conclusions
Since the commonly used elemental analysis methods ICP-MS or ICP-OES are expensive,
we had not been able to accurately characterize the modeling results of lanthanide adsorption
kinetics and lanthanide adsorption equilibrium. In addition, although our lanthanide
binding protein TFD has the characteristics of antenna effect to sensitize lanthanide
luminescence after combining rare earth ions (see our Measurement page for details),
in actual experiments, we found that the engineered yeast strain displaying TFD protein
on the surface was not enough to produce a significant fluorescence signal of lanthanide
metal ion adsorption change, which might be due to the inapplicability of the detection
method when applied to cell objects. However, in any case, our lanthanide adsorption
kinetics and adsorption equilibrium modeling process can be used as a reference for
the future.
Part 3: Biofilm Module: biofilm formation modeling
3.1 Overview
When the modified yeast itself can adsorb lanthanide metal ions, we hope to adhere it to a stable carrier to form a biofilm, and the formation process of the biofilm is affected by many factors. Therefore, it is necessary to establish a corresponding model to estimate the optimal nutrient addition amount, biofilm formation time, and membrane stability parameters, etc. Data provide theoretical support for the engineering practice of the entire biomining line. Among them, some of the modeling refers to the data from classical literature. Due to the limitation of time, instruments, and other resources, we have not been fully verified. Here, we only refer to some theories and formulas, and verify them using our experimental conclusions.
3.2 One-dimensional continuum model
Biofilm modeling began with a low-dimensional continuous description based on kinematics and translational diffusion. Later, a more complex microscopic kinetic mechanism was introduced, resulting in abnormal diffusion and dissipation of various components in the biofilm. Recently, discrete-continuous, multi-fluid, and single-fluid multi-component models have been used to study the coupling of biofilm and bulk fluid (or solvent), and the whole biofilm-bulk fluid system is regarded as a system composed of various components [9].
The development of biofilms is a multistage process. First, free-floating planktonic cells land on a damp surface and later get attached to it. After a sufficiently high cell density is reached, cells begin excreting an extracellular polymeric substance (EPS) and forming a EPS matrix anchored on the surface. The strain population inside the matrix grows as well by consuming nutritious substrates supplied by the surrounding environment. As a result, the biofilm begins to grow. Biofilm growth is a dynamical process whose structure is largely influenced by environmental conditions including the substrate diffusion limitation and the shear stress of the bulk fluid [9]. Here, we just use the one-dimensional continuum model proposed by Wanner and Gujer to predict the biofilm formation process [10].
- Model development is based on the following basic assumptions:
- 1. The total bulk liquid volume remains constant, while the biofilm volume increases.
- 2. Sugars and nitrogen are uniformly distributed in the liquid phase of the reactor.
- 3. Sugars and nitrogen diffuse from the bulk liquid to the biofilm.
- 4. Sugars and nitrogen are consumed only inside the biofilm.
- 5. Biofilm growth of fungus occurs uniformly on top of the liquid phase.
- 6. The biofilm porosity remains constant in time and space.
- 7. Detachment/attachment processes of fat-free biomass and lipids from/onto the surface of biofilm are assumed to be negligible.
- 8. Sugars are used for fat-free biomass growth and lipid synthesis.
- 9. Biofilm consists of two components: fat-free biomass and lipids. Thus, biofilm thickness increases due to the fat-free biomass growth and lipid accumulation.
- 10. The densities of fat-free biomass and lipids remain constant.
- 11. Lipid accumulation starts after nitrogen depletion in the reactor while cellular lipids turnover occurs after sugar consumption.
- Mass balance of sugar and nitrogen in large quantities of liquids:
- \(
V_L \frac{dS_B}{dt} = -AD_S\phi \frac{\partial S}{\partial z}|_{z=H} - AU_f \phi S|_{z=H} - AU_f(1-\phi)S|_{z=H} = -AD_S\phi \frac{\partial S}{\partial z}|_{z=H} - AU_f S|_{z=H} \)
- \(
V_L \frac{dN_B}{dt} = -AD_N\phi \frac{\partial N}{\partial z}|_{z=H} - AU_f \phi N|_{z=H} - AU_f(1-\phi)N|_{z=H} = -AD_S\phi \frac{\partial N}{\partial z}|_{z=H} - AU_f N|_{z=H} \)
Among them, \(S_B\) and \(N_B\) are the concentrations of sugar and nitrogen in
the bulk liquid phase (g/cm3 reactor liquid volume), \( V_L\) is the bulk liquid phase
volume (cm3), \( \phi \) is the porosity of the biofilm, \(S\) and \(N\) are the concentrations of
sugar and nitrogen in the biofilm (g/cm3 biofilm), \(D_S\) and \(D_N\) are the diffusion
coefficients of sugar and nitrogen (cm2/h), \(A\) is the total surface area of the
biofilm (cm2), \(U_f\) is the moving speed of the biofilm surface due to the growth of
the biofilm (cm/h), and \(H\) is the thickness of the biofilm (cm). It represents
the diffusion of sugar and nitrogen from bulk liquid to biofilm, respectively.
The second term considers the displacement effect of the biofilm surface in
space, and the third term describes the advection transport of dissolved
components caused by the displacement of the biofilm surface.
The mass balance equation simulates the development of fat-free biomass and lipids in biofilms in time and space:
\(
\frac{dX}{dt} = -U_f \frac{dX}{dz} + [(\mu_{SN} + \mu_L) X - \frac{X}{1-\phi} (\frac{\mu_{SN} + \mu_L}{\rho_X} X + \frac{q_L - \frac{1}{Y_{X/L} }\mu_L } {\rho_L} X)]
\)
\(
\frac{dL}{dt} = - U_f \frac{dL}{dz} + \left[ \left( q_L - \frac{1}{Y_{X/L} } \mu_L \right) X - \frac{L}{1-\phi} \left( \frac{\mu_{SN} + \mu_L}{\rho_X} X + \frac{q_L - \frac{1}{Y_{X/L} } \mu_L}{\rho_L} X \right) \right]
\)
Among them, \(Y_{X/L}\) is the fat-free biomass growth yield coefficient of
lipid (g fat-free biomass/g lipids), \(\rho_X \) is the cell density of fat-free
biomass (g fat-free biomass/cm3 fat-free biomass), and \(\rho_L \) is the cell density of
lipid (g lipid/cm3 lipids). The porosity \(\phi \) of the biofilm is assumed to be constant.
The changes of sugar and nitrogen concentration in biofilm were described by reaction-diffusion equation:
\(
\phi \frac{\partial S}{\partial t} = (1 - \phi) U_f \frac{\partial S}{\partial z} + D_S \phi \frac{\partial^2 S}{\partial z^2} - \frac{1}{Y_{X/S} } \mu_{SN} X - \frac{1}{Y_{L/S} } q_L X
\)
\(
\phi \frac{\partial N}{\partial t} = (1 - \phi) U_f \frac{\partial N}{\partial z} + D_N \phi \frac{\partial^2 N}{\partial z^2} - \frac{1}{Y_{X/N} } \mu_{SN} X
\)
Among them, \(X \) is the concentration of fat-free biomass in the biofilm (g/cm3
biofilm), \(Y_{L/S} \) (g fat-free biomass/g sugar), \(Y_{X/N} \) (g lipid/g sugar) and (g fat-free
biomass/g nitrogen) are the yield coefficients of fat-free biomass and lipid
produced on sugar and fat-free biomass produced on nitrogen, respectively,
\(q_L \) is the specific accumulation rate of lipid (g lipid/g fat-free biomass h),
and \( \mu_{SN} \) is the specific growth rate of fat-free biomass (1/h). In fact, the
effect of this process on the changes in sugar and nitrogen concentrations
in the biofilm is negligible, but to make the mass balance equation accurate,
the flux must be included. The second item simulates the diffusion transport
of sugar and nitrogen in the biofilm. The third term of the two equations
describes the consumption of sugar and nitrogen by the growth of fat-free
biomass in the biofilm. Finally, the last item in the equation describes
the consumption of sugar to form lipid accumulation.
The change of biofilm thickness can be obtained by the following formula:
\(
\frac{dH}{dt} = \int_{0}^{H} (\frac{\mu_{SN} + \mu_L}{(1-\phi)\rho_X} X + \frac{q_L - \frac{1}{Y_{X/L} }\mu_L}{(1-\phi)\rho_L} X ) dz
\)
where \(H\) is the total biofilm thickness (cm).
The simplified equations are substituted into the boundary conditions:
\(
\frac{\partial{S(0,t)} }{\partial{z} } = 0
\)
\( \frac{\partial{N(0,t)} }{\partial{z} } = 0 \)
\( S(H,t) = S_B \)
\( N(H,t) = N_B \)
\( \frac{\partial{N(0,t)} }{\partial{z} } = 0 \)
\( S(H,t) = S_B \)
\( N(H,t) = N_B \)
and the initial conditions are:
\(
S_B(0) = S_{in}
\)
\( N_B(0) = N_{in} \)
\( S(z,0) = 0 \)
\( N(z,0) = 0 \)
\( H(0) = H_{in} \)
\( N_B(0) = N_{in} \)
\( S(z,0) = 0 \)
\( N(z,0) = 0 \)
\( H(0) = H_{in} \)
where \(S_{in}\) , \(N_{in}\) are the initial concentrations in the reactor (g/cm3) and \(H_{in}\) is the initial biofilm thickness [11].
The model includes a more flexible description of the transport of dissolved components in the biofilm and considers the diffusion and transport of particle components in the biofilm solid matrix, the change of the liquid phase volume fraction (porosity) of the biofilm, and the simultaneous separation and attachment of cells and particles on the biofilm surface. This extended model can reproduce most of the new experimental data.
3.3 Other biofilm models
In addition to the one-dimensional continuum model, there are other models related to biofilm characteristics, such as DLA model for biofilm growth and pattern formation, discrete-continuum / cellular automata (CA) model, the biofilm model coupled with biomass and flow, and individual-based modeling (IbM).
Although the methods used are different, the basic processes are usually introduced in some way, such as attachment, microbial growth, nutrient absorption, cell death, extracellular product formation, separation, and some chemical processes, which are mainly supported by classical theories such as mass balance equation, Fick's law and Navier-Stokes equation. For example, in the individual-based model, microbial growth is introduced through cell division, and cell division has a set of rules to control the structural changes of the matrix after the introduction of daughter cells. On the other hand, in the model where biomass is regarded as a continuum, growth can be described by continuous biomass expansion and movement. These models can focus on describing biofilms at the scale of the entire colony or the level of a single cell, taking into account the details of the cell structure and how it affects cell behavior [12].
3.4 Conclusions
In the experimental verification of biofilm formation, we adhered the engineered
yeast to the surface of agar plate, fluidized bed carrier filler, polystyrene
petri dish plate, and other carriers for observation. The engineered yeast
overexpressing gene FLO11 was incubated with the fluidized bed carrier filler
in the medium, and it took about 3 days to form a stable and most biofilm, and
the biofilm was stained with ammonium oxalate-crystal violet staining solution
to observe the morphology under the microscope. See our Results page for details.
However, due to the lack of precise measuring instruments and ideal test
carriers, we only transferred the one-dimensional continuum model appropriately
and substituted it into the analysis of the physical and chemical processes
of biofilm formation and shedding to provide theoretical references for the
experimental conclusions. We look forward to further enriching the content
of the modeling part and establishing a model that is more suitable for the
characteristics of our engineered yeast when the experimental conditions allow.
References
- [1] Caldwell, Shane J., et al. Tight and specific lanthanide binding in a de novo TIM barrel with a large internal cavity designed by symmetric domain fusion. Proceedings of the National Academy of Sciences. 2020, 117(48): 30362-30369.
- [2] Martin, Langdon J., and Barbara Imperiali. The best and the brightest: Exploiting tryptophan-sensitized Tb 3+ luminescence to engineer lanthanide-binding tags. Peptide libraries: methods and protocols. 2015: 201-220.
- [3] Tutorials and Webinars — GROMACS webpage https://www.gromacs.org documentation
- [4] Robert, Xavier, and Patrice Gouet. Deciphering key features in protein structures with the new ENDscript server. Nucleic acids research. 2014, 42(W1): W320-W324.
- [5] Xie, Xiaoman, et al. Broad-spectrum and effective rare earth enriching via Lanmodulin-displayed Yarrowia lipolytica. Journal of Hazardous Materials. 2022, 438: 129561.
- [6] Ucun, Handan, Yalcin Kemal Bayhan, and Yusuf Kaya. Kinetic and thermodynamic studies of the biosorption of Cr (VI) by Pinus sylvestris Linn. Journal of Hazardous Materials. 2008, 153(1-2): 52-59.
- [7] Xie, Xiaoman, et al. Effectively auto-regulated adsorption and recovery of rare earth elements via an engineered E. coli. Journal of Hazardous Materials. 2022, 424: 127642.
- [8] Al-Ghouti, Mohammad A., and Dana A. Da'ana. Guidelines for the use and interpretation of adsorption isotherm models: A review. Journal of hazardous materials. 2020, 393: 122383.
- [9] Wang, Qi, and Tianyu Zhang. Review of mathematical models for biofilms. Solid State Communications. 2010, 150(21-22): 1009-1022.
- [10] Wanner, Oskar, and Peter Reichert. Mathematical modeling of mixed-culture biofilms. Biotechnology and bioengineering. 1996, 49(2): 172-184.
- [11] Economou, Ch N., et al. Modeling of oleaginous fungal biofilm developed on semi-solid media. Bioresource technology. 2011, 102(20): 9697-9704
- [12] Dzianach, Paulina A., et al. Challenges of biofilm control and utilization: lessons from mathematical modelling. Journal of the Royal Society Interface. 2019, 16(155): 20190042.