Calcium Dynamic Model

Overview

Maintaining a delicate homeostasis of ion concentration is critical for cell survival, the concentration of different ions inside a cell can give insight into the current state of the cell, and we wish to explore the changes in dynamic ion concentration in a cell through mathematical modeling to understand calcium ion homeostasis.

Calcium ion is found to be important in maintaining cellular functions. We noticed that there are many literatures that mention cancer and calcium ions. They encouraged us to further investigate the relationship between calcium ion concentration and cancer, or particularly, tumor cells.

Reviewing current articles/studies

Our finding showed that there is a large interest in modeling calcium dynamics, while we grasp the idea that most articles use ordinary differential equations (ODEs), understanding how calcium ions are taken up, stored, and depleted is crucial for constructing our model. Modeling biological systems with ODEs is a common technique in describing the dynamic of a system; this includes populations, ion exchanges, etc. The articles we reviewed describe the importance of calcium oscillations. Calcium oscillations are controlled under different physiological conditions such that different cell types exhibit different types of calcium oscillations (Rahmani et al., 2024). Different articles that we studied used different cell types to build their model, which showed different modes of peak, time intervals for each peak, maximum (the peak) and minimum calcium concentration during the oscillation, etc. (Futagi & Kitano, 2015; Han et al., 2017; Han & Periwal, 2019; Mirzakhalili et al., 2018; Sobie et al., 2002; Szopa et al., 2013; Wacquier et al., 2016; Zhou et al., 2020) .

On the other hand, the cell takes up calcium ions normally by the control of certain protein channels or by facilitation of a process called store-operated Ca2+ entry (SOCE). However, the majority of calcium content is stored in the ER (Zheng et al., 2023)(Figure 1,2). The protein channels of the Orai family and SERCA are involved in the SOCE process, influenced by the signalling molecule, such as IP3, increase calcium intake under low cytosolic concentration and are stored in the ER (Hodeify et al., 2018). ER regulates cytosolic calcium concentration through a process called calcium-induced calcium release (CICR), where the calcium ion leaves the ER through IP3R and RyRs, where the calcium ion can activate Ca-activated Cl channels (CaCC). Furthermore, if a sudden accumulation of calcium ions in the ER space occurs, it triggers the release of store overload-induced calcium (SOICR), which helps regulate the calcium concentration in the ER. In addition to storing calcium in the ER, the cytosolic calcium ion would bind with calmodulin to act as a second messenger and activate downstream pathways, such as HNOS. It is clear that ER is important in calcium homeostasis, whereas mitochondria are important in cell survival.

Lastly, as we learn that cytosolic calcium concentration is tightly regulated under normal physiological conditions, our aim is to identify the cytosolic calcium ion concentration that can lead to apoptosis. We understand that the pathway by which calcium ion overload induces apoptosis involved the activation of various signaling cascades, e.g. the opening of mitochondrial permeability transition pore (MPTP). Therefore, we want to build a model so that we can observe the relationship between calcium ions and other ions inside the cell.

Mitochondrial calcium overload is a well-known trigger for the opening of MPTP, which is a large and nonspecific pore in the inner mitochondrial membrane. When MPTP opens, it causes mitochondrial depolarization and leads to loss of ATP production and releasing of pro-apoptotic factors such as cytochrome c into the cytosol, which consequently initiates the apoptotic cascade, and programmed cell death occurs.

Hence, we designed the following goals for modeling calcium dynamic:

1.Simulate the oscillation of calcium ions in cell

2.Understand the dynamic of other ions, e.g. NADH, IP3, and Adenine

3.Induce certain conditions and observe the ion dynamics so that we may describe a cell-death process

Figure 1.1: Overview of calcium channels Source：Tajada & Villalobos, 2020

Figure 1.2: Diagram of SOCE, CICR, and SOICR Source: Zheng et al., 2023

Construction

Background

With our knowledge of calcium dynamics, we have integrated four models to simulate the processes we needed, in hopes to explain the phase relationship of Ca2+ changes between the cytosol, endoplasmic reticulum, and mitochondria. We proposed few assumptions about the calcium model:

1.Calcium influxes and effluxes of plasma membrane channels govern the cytosolic calcium concentration

2.ER is considered the primary intracellular calcium store, to include the CICR mechanism and the feedback inhibition f IP3R and RYRs at high calcium concentrations

3.High mitochondrial calcium concentrations opens the MPTP, which then causes the cell to die (through apoptosis)

4.High mitochondrial calcium concentrations opens the MPTP, which then causes the cell to die (through apoptosis)

The model tries to consider external factors that influence calcium ion flux, yet due to the lack of detailed modeling processes for Mscs and TRPC1 channels, we assumed their influxes as Jm and attempted to simulate the Jm that leads to calcium ion overload. We developed a comprehensive mathematical model that integrates four distinct models (Han & Periwal, 2019; Szopa et al., 2013; Wacquier et al., 2016; Zhou et al., 2020), the model describes different aspects of calcium dynamics between the cytosol, ER, and mitochondria:

Key parameters and values

The values of parameter are presented below for reference:

Equations

1.For total Calcium concentration：

The total calcium concentration is govern by the calcium influx (Jin) and the calcium efflux(Jpm). The calcium influx (Jin) is the sum of several components, including leak influx through the plasma membrane, receptor-operated calcium channel (ROCC) influx, store-operated calcium channel (SOCC) influx, and the influx through mechanosensitive channels (Msc) and transient receptor potential canonical (TRPC1) channels （see eq.3-5); The efflux of calcium from the cytosol is mediated by the PMCA pumps, which follow a Michaelis-Menten-like kinetics.

2.For cytosolic IP3 concentration:

The cytosolic IP3 concentration is determined by the balance between its production and degradation. The production of IP3 depends on the activity of phospholipase C (PLC), which is regulated by cytosolic calcium concentration( see eq.6), and the rate of PLC activity is given by eq.7 and eq.8:

Note: The cytosol and mitochondria both contain calcium buffers, which bind free calcium ions. The dynamics of calcium buffering are described by differential equations that take into account the binding and unbinding of calcium to these buffers. This buffering effect modulates the free calcium concentration in the cytosol and mitochondria, influencing the overall calcium dynamics.

3.For Calcium concentration in cytosol,

The calcium concentration in the cytosol take into the account of extracellular source, ER, and mitochondria. Influx of calcium include the cell take calcium ion from extracellular space, ER releasing vesicles of calcium ions, and calcium ion from mitochondria through NCX channel. The efflux of cytosolic calcium concentration is primarily governed by plasma membrane permeability for calcium ion and through SERCA channels.

4.For Calcium concentration in ER,

We describe the fluxes of ER mainly with three channels: SERCA, IP3R and RYR channels( which is combined into a single flux named Jch). The ratio of volume between the cytoplasmic space and the ER space is considered. The SERCA channel on the ER membrane requires ATP to transport calcium from the cytosol, and Jch regulates the calcium concentration if there is high level of calcium concentration.

5.For Calcium concentration in mitochondria,

The calcium dynamic in mitochondria is modeled as the following(see eq.15-25), in which the calcium fluxes of mitochondria are governed by three channels: MCU, NCX, and other channels (note as Jx), The mitochondrial calcium uptake through the MCU is described by a complex function that depends on the cytosolic calcium concentration and mitochondrial membrane potential.The mitochondrial metabolic pathways, including the production of ATP and oxidation of NADH, are closely linked to calcium dynamics. The rate in which NADH production through the pyruvate dehydrogenase complex (PDH) is given by eq.20.

The oxidation of NADH by the electron transport chain (ETC) is modeled as:

The production of ATP by the F1FO ATPase is given by:

The transport of ATP and ADP across the mitochondrial membrane through the adenine nucleotide translocator (ANT) is described by:

6.For NADH and NAD concentration in mitochondria

The dynamics of NADH concentration in the mitochondria are governed by the balance between its production and consumption, we first had to assume that the total of NADH and NAD in the mitochondria is constant.

7.For ADPm, ADPc, ATPm, and ATPc concentration

The notation of “m” and “c” state where the adenine is at( “m”: mitochondria, “c”: cytosolic). Similarly, the total concentrations of ADP and ATP in the mitochondria and cytosol are assumed to be constant. Then, considering that NADH and ATP are critical for maintaining cellular energy homeostasis. The production of NADH through the TCA cycle is regulated by calcium and is described by eq.32.

One of the key insights from this model is the critical role of mitochondrial calcium overload in triggering apoptosis. When the MCU takes up excessive calcium, it can lead to the opening of the mitochondrial permeability transition pore (MPTP), causing mitochondrial depolarization, loss of ATP production, and release of pro-apoptotic factors like cytochrome c.

Result Discussion

During the process of constructing our model, we encounter many difficulties such as the complexity of the model and finding a suitable programming language to simulate the model. We essentially used Python to code our model for its simplicity in syntax, the next thing was to write the code for our model and simulate the result by using the visualization library matplotlib. We used the ODE solver function from SciPy library and obtained our initial simulation (Figure 1.3):

Note: We are aware that Vm is a different unit from the other parameters shown in the result, we only put them together to avoid plotting many graphs.

Initial simulation of model

Figure 1.3 Result of model using Scipy.integrate.odeint

Our first simulation was very abnormal, specifically the calcium concentration of mitochondria in the figure was returning negative values at certain time point. It meant that either mistakes occurred in our model or that our choices of parameters were not suitable. We then simulate again after adjusting on the initial concentration:

Second simulation of model

Figure 1.4: second simulation of model using Scipy.integrate.odeint

Figure 1.5: Zoomed in on Figure 1.4

This simulation was better than the initial simulation, however the calcium concentration of mitochondria increased dramatically (shown in Figure 1.5), and that all concentration would return zero at 0.2s. We expect that the model could simulate in the time range until 120s, and so we had to continue debugging and slightly adjusting our parameters for every simulation. As there are more than 50 parameters, the complexity of the model made it hard for us to decide on values for each parameter, instead, we refer to the values that were used in the models we referenced and integrated. We have also considered using SciPy library previously to solve the model, although for its precision and run time speed, we decided to run our model by using a self-defined function that utilizes the principle of the Euler method. By self-defining an Euler function (an ODE solver), we can customize our ODE solver function such that our result can catch negative values and return zero instead (except for Vm). Finally, we have a result that resembles certain features:

Latest simulation of model

Figure 1.6 Latest simulation of model using self defined euler function

By looking at the upper graph of Figure 1.6, we can see that the calcium concentration of mitochondria stays at a low and steady level, while calcium concentration of cytoplasm and of ER seem to reach a “homeostasis” stage, though this value is not normal for calcium cytosolic concentration. Another interesting discovery is the NADH concentration shown in bottom graph of Figure 1.6, surprisingly showing a pattern of oscillation, while the membrane potential in the graph doesn’t seem to oscillate that much. With further investigation of the lastest simulation result, we tried inputting different values for the initial conditions and we can conclude the following:

1.The calcium concentration of mitochondria is highly influenced by the membrane potential, NADH, and ADPm

2.IP3 concentration can influence the cytosolic calcium concentration

3.ER calcium concentration is influenced by ADPc

The above observations help us to understand how the model works with each values chosen for initial concentration/condition, and that it also is approximately consistent with the current understanding of calcium dynamics.

Future improvements

From our attempts to simulate our model and observe the behavior of each concentration, we learn a lot about calcium dynamic yet the model is still not able to simulate to normal physiological range. We hope to improve and validate our current model by combining data from wet-lab, such that we can identify discrepancies and adjust our model accordingly. Next, we did not introduce parameter constraints as we had mentioned the purpose for self-defining an Euler function, introducing an error-handling mechanism that can catch and correct unrealistic values that may return during our future simulation. Furthermore, we proposed that reviewing and refining the ODEs in our model in hopes to accurately represent the calcium dynamic process. Finally, conducting a sensitivity analysis on our parameter may help to identify which parameters have the most significant impact on our model’s output, once identified and considered that the parameter needs to adjust, we can focus on refining the most influential parameters that affect our model’s simulation.

Protein Molecular Model

According to the engineering cycle, we proposed to add the iRGD sequence to increase the specificity of exosomes in targeting cancer cells by interacting with integrin αvβ3 (a cancer marker for breast, lung, and pancreatic carcinomas). Here, by using computer protein modeling, we predicted the structure of LAMP2B-GGGGS-iRGD/LAMP2A-GGGGS-iRGD, performed molecular docking with the integrin αvβ3 (PDB ID: 4G1M), calculated the corresponding indices and performed a redesign of the iRGD, and improved the affinity of the iRGD-integrin complexes.

Introduction

We performed a comprehensive molecular modeling analysis to investigate the interactions between the iRGD peptide chain and the integrin αvβ3 (4G1M) present on the surface of tumor cells. Using molecular docking and associated computational methods, our aim was to predict the structural characteristics of the iRGD peptide and its affinity towards 4G1M.

To evaluate the practical applicability of the iRGD peptide, we focused on its aggregation tendencies and stability. Using the original peptide sequence as a reference, we employed advanced molecular modeling techniques to identify key amino acid residues involved in binding. Through virtual mutagenesis of these residues, we aimed to enhance the binding affinity of the iRGD peptide, thereby improving its targeting efficacy for tumor cells. Furthermore, we evaluated the structural interactions between the iRGD peptide and the integrin αvβ3 (4G1M) through molecular modeling. This analysis informed our virtual mutation strategy, which enhanced the potential of the peptide to effectively direct exosomes toward tumor cells for therapeutic applications.

In general, the protein molecular modeling process encompasses the structural prediction of LAMP2B-GGGGS-iRGD and LAMP2A-GGGGS-iRGD and molecular docking with 4G1M, followed by protein modification based on their complex structure.

Structure Prediction

Background

As the structures of LAMP2B-GGGGS-iRGD and LAMP2A-GGGGS-iRGD were unknown, our first step was to predict the structures of these two fusion proteins. We used the advanced deep learning model AlphaFold2 (AF2), which has shown exceptional performance in predicting single-chain protein structures. We use UCSF ChimeraX as a molecular visualization tool and utilize ColabFold for new protein predictions.

Methodology

We obtained the amino acid sequences of the LAMP2B-GGGGS-iRGD and LAMP2A-GGGGS-iRGD proteins from FASTA files. After opening the AlphaFold panel in ChimeraX, we copied the amino acid sequences of both proteins into the designated boxes separately.

In the options, we selected "Energy minimize predicted structures," "Trim fetched structures to aligned sequence," and "Use PDB templates when predicting structures." We clicked the predict button to start the execution for both proteins. Google's ColabFold computing environment was used to predict the 3D structures of the proteins.

Results

After analyzing the prediction results, we download the best predicted structures for LAMP2B-GGGGS-iRGD and LAMP2A-GGGGS-iRGD (Figures 2.1A, 2.1B). We evaluated the reliability of each segment of the predicted structures by analyzing the additional images generated (Figures 2.2-2.4).

AlphaFold provided expected position error values for each pair of residues (XY), showing the predicted position error when residue X is aligned with residue Y in the true structure. These residue "predicted alignment error" values can be visualized through error plots (Figure 2.2).

Furthermore, we generated sequence coverage graphs to examine the number of similar sequences found at different positions in LAMP2B-GGGGS-iRGD and LAMP2A-GGGGS-iRGD (Figure 2.3).

The predicted structures include atomic coordinates, and confidence estimates for each residue, with scores ranging from 0 to 100. Higher scores indicate higher confidence. This confidence measure, called pLDDT, corresponds to the per residue score predicted by the model in the IDDT-Ca metric (Figure 2.4).

Figure 2.1A Predicted structure for LAMP2B-GGGGS-iRGD

Figure 2.1B Predicted structure for LAMP2A-GGGGS-iRGD

Figure 2.2 Predicted IDDT per position plot for LAMP2B-GGGGS-iRGD(left) and LAMP2A-GGGGS-iRGD(right)

Figure 2.3 Sequence coverage plot for LAMP2B-GGGGS-iRGD(left) and LAMP2A-GGGGS-iRGD(right)

Figure 2.4 Predicted aligned error plots (5 ranks) for LAMP2B-GGGGS-iRGD(left) and LAMP2A-GGGGS-iRGD(right)

Comparative Analysis

1.Predicted IDDT per position:

Both proteins show similar general patterns with high IDDT scores (>80) in most regions, indicating good confidence in the predictions. Both exhibit dips in scores around positions 200 and 380-400, suggesting less reliable predictions in these areas. The N-terminal regions (first 50 residues) of both proteins show lower confidence scores.

2.Sequence coverage:

Both LAMP2B and LAMP2A show a clear transition around position 200, with higher sequence coverage and diversity in the N-terminal half. The C-terminal regions (after position 200) of both proteins show lower sequence coverage but higher sequence identity to the query. This pattern is consistent across both proteins, suggesting similar domain organizations.

3.Predicted aligned error:

Both proteins show two distinct domains (approximately residues 1-200 and 201-400).

The interdomain region (around residue 200) shows a higher predicted error in both cases, consistent with the dips observed in the IDDT plots.

LAMP2B-GGGGS-iRGD shows slightly higher predicted alignment errors in the C-terminal region compared to LAMP2A-GGGGS-iRGD.

4.Overall structure comparison:

a.The overall patterns in sequence coverage and predicted aligned error are very similar between LAMP2B-GGGGS-iRGD and LAMP2A-GGGGS-iRGD.

b.Both proteins likely have similar domain structures and overall folds.

c.LAMP2B exhibits slightly greater flexibility in certain loop regions, particularly in the C-terminal area.

d.LAMP2A shows marginally higher average pLDDT scores, especially in the C-terminal region, suggesting that its overall structural prediction might be slightly more reliable.

These analyzes provide information on the predicted structural features of LAMP2B-GGGGS-iRGD and LAMP2A-GGGGS-iRGD, highlighting their similarities and subtle differences. Although both proteins appear to have comparable overall folds, small variations in flexibility and prediction confidence could influence their specific interactions with target molecules. This information can guide further experimental studies or functional predictions for these proteins, particularly in their potential roles in drug-targeting and cell-targeting applications.

Molecular docking

Background

We utilized molecular docking to explore the interaction between the integrin and iRGD. ZDOCK, a protein interaction docking program that employs a fast Fourier transform, is specifically designed to investigate all possible binding modes by translating and rotating two proteins in a three-dimensional space. Each binding model is then evaluated using an energy-based scoring function. In ZDOCK version 3.0.2, the scoring incorporates IFACE statistical potential energy, structural complementarity, and electrostatic interactions. We applied ZDOCK 3.0.2 to dock the 4G1M and LAMP2B-GGGGS-iRGD or LAMP2A-GGGGS-iRGD proteins, selecting the most optimal docking result upon completion. For visualization and labeling of the binding sites in the docking complex, we used PyMol version 2.4.0.

Methodology

We began by searching for the 4G1M protein structure in the PDB database and subsequently downloaded it. The LAMP2B-GGGGS-iRGD or LAMP2A-GGGGS-iRGD model generated in the previous step was then optimally prepared. Both protein structures were uploaded to the Discovery Studio software, where we performed the ZDOCK 3.0.2 prediction. Following this, we used PyMol V3.0.4 to analyze the docking results. After completing the molecular docking process, we accessed the best-predicted complex structure file in PyMol. To remove all solvent molecules and display the stick representations of residues within 5 angstroms of either iRGD or 4G1M, we executed the following commands:

PyMOL>remove solvent

PyMOL>show sticks, byres 4G1M within 5 of iRGD

PyMOL>show sticks, byres iRGD within 5 of 4G1M

Subsequently, we identified residues that form contact bonds, which were crucial for labeling and visualizing the binding sites of the docking complex. We focus on amino acids capable of establishing hydrogen bonds on the interaction surface.

PyMOL>select contact_residues, byres (4G1M within 5 of iRGD) or (iRGD within 5 of 4G1M)

To visualize the contact residues, we selected polar contacts on the Pose4G1M layer and hid both the iRGD and the 4G1M structures to focus on the visible bonds. After confirming all residues involved in bond formation, we renamed the selection in the (sele) layer to contact_bond_residues.

After completion of this analysis, we uploaded the docking complexes to the PDBePISA website for further investigation of the interaction domains and surfaces of the involved proteins.

Results

Amino acids capable of forming hydrogen bonds within the bonding molecules are prominently illustrated in Figure 2.5 & 2.6. In this depiction, the colored sticks represent the amino acid residues that can establish contact bonds, highlighting the specific interactions that contribute to the stability of the docking complex. Each capital letter corresponds to the standard one-letter abbreviation of the respective amino acid, while the accompanying numbers indicate their precise positions within the protein sequence.

To further investigate the nature of the interactions between the two proteins, we uploaded the best-predicted protein structure to the PDBePISA website. This platform allows for a comprehensive analysis of protein-protein interactions, enabling us to assess the interaction domains and surfaces involved. Following our analysis, we download a detailed summary of the interaction surface, which is presented in Table 1.

Figure 2.5A The docking of LAMP2B-GGGGS-iRGD with 4G1M

Figure 2.5B The docking of LAMP2B-GGGGS-iRGD with 4G1M

Figure 2.6A The docking of LAMP2A-GGGGS-iRGD with 4G1M

Figure 2.6B The docking of LAMP2A-GGGGS-iRGD with 4G1M

Note:(A) Results of molecular docking, LAMP2B-GGGGS-iRGD/LAMP2A-GGGGS-iRGD is shown in light pink and 4G1M is shown in palecyan. (B) Schematic representation of the molecular docking surface. Among them, amino acids that can form hydrogen bonds are shown.

Table 1. Summary of docking complex interface of LAMP2B-GGGGS-iRGD(left) and LAMP2A-GGGGS-iRGD(right)

Developmental analysis of the proteins

Background

High concentrations of the iRGD peptide can lead to aggregation, which reduces not only their functional activity but also their ability to trigger unwanted immune responses. To address this problem, the calculation of protein aggregation trends provides an important metric to assess the tendency of protein surface amino acids to aggregate. By identifying specific regions on the protein surface that are prone to aggregation, we can implement targeted mutations to engineer proteins with enhanced stability. Furthermore, the aggregation behavior of iRGD proteins is a crucial factor that influences their development ability, making it essential to understand and mitigate this issue for successful therapeutic applications.

Methodology

We began by using the "apply forcefield" command within the Change Forcefield tools to assign a CHARMM forcefield to the LAMP2B-GGGGS-iRGD/LAMP2A-GGGGS-iRGD protein structure. This step is crucial, as the CHARMM force field provides a set of parameters that accurately reflect the physical and chemical properties of the protein, allowing for more reliable simulations and analyses.

Following this, we calculated the solvent-accessible surface area (SASA) for the side chains of each residue using Discovery Studio. SASA is an important metric that helps us understand how much of the protein's surface is accessible to solvent, which can be indicative of the protein's stability and interaction potential.

In the Tools Explorer, we navigated to the Macromolecules and Predict Protein Aggregation Tools panel, where we clicked on "Calculate Aggregation Scores." This feature allows us to assess the likelihood of protein aggregation, an important factor in protein stability and functionality. When the parameter settings appeared, we selected the proteins we wished to analyze and adjusted the Cutoff Radius to 5, 7, and 10 Å.

The per-atom aggregation propensity score is calculated as the ratio of the actual side-chain SASA to the SASA of fully exposed side-chain atoms of the same type. This ratio is then multiplied by the hydrophobicity of the residue, taking into account all residues within the specified Cutoff Radius of each atom. This calculation provides detailed information on the aggregation tendencies of the protein, helping us identify regions that may require further optimization or modification to enhance stability.

Results

The color coding in our analysis visually represents the protein aggregation trend score for a Cutoff Radius of 10 Å. In this scheme, atoms exhibiting a high aggregation trend score are highlighted in red, indicating a greater propensity for aggregation, whereas those with a low score are depicted in blue, suggesting a lower likelihood of aggregation (Figure 2.9). This visual differentiation allows for quick identification of critical regions within the protein that may influence its stability and function.

Furthermore, to enhance the data representation, both a line chart and a point chart are automatically generated (Figures 2.7 and 2.8).

The amino acid sequence number is plotted along the horizontal axis, while the corresponding protein aggregation trend scores are represented on the vertical axis. This format provides a clear view of how the aggregation propensity varies across the sequence, allowing for easy identification of peaks and troughs in the aggregation tendency.

In the point graph, the sequence numbers of the identified protein aggregation sites are again plotted on the horizontal axis, and the aggregation trend scores are displayed on the vertical axis. This chart format provides a more granular view of specific aggregation sites, helping to pinpoint areas that may be targeted for further investigation or modification.

LAMP2B-GGGGS-iRGD:

Figure 2.7 Protein aggregation propensity scores calculation of LAMP2B-GGGGS-iRGD(left) and LAMP2A-GGGGS-iRGD(right) using CHARMm

Figure 2.8 Protein aggregation propensity scores calculation of LAMP2B-GGGGS-iRGD(left) and LAMP2A-GGGGS-iRGD(right) using CHARMm

Figure 2.9A Protein aggregation propensity scores calculation of LAMP2B-GGGGS-iRGD using CHARMm

Figure 2.9B Protein aggregation propensity scores calculation of LAMP2A-GGGGS-iRGD using CHARMm

Redesign of protein molecules

Virtual amino acid mutations increase the binding affinity of protein

Background

In the design of the iRGD peptide, specific amino acid mutations are commonly used to enhance binding affinity. However, traditional selection of amino acids can be inefficient as a result of their often blind nature. To enhance the binding affinity of the iRGD peptide, virtual amino acid mutations can be conducted on iRGD-integrin complexes using Discovery Studio software. This approach allows for the systematic evaluation of potential mutations, guiding experimental efforts by identifying optimal amino acid combinations that improve interaction strength and specificity.

Methodology

First, we uploaded the structural file resulting from the molecular docking analysis. The protein complex was then subjected to the CHARMM force field to ensure accurate energy calculations. Next, we selected amino acid residues within a 3 Å radius around the ligand for further analysis. By opting for the single mutation mode, we mutated all previously selected amino acids to alanine (ALA) to evaluate how these mutations affected binding affinity.

Using the MODELER program, we optimized the mutated residues along with any surrounding residues within the specified radius. This process allowed us to compute the mutational energy, allowing us to identify key amino acid residues that significantly influence protein affinity. Subsequently, these critical residues were mutated to the remaining 19 standard amino acids. The resulting mutations were ranked in order of their mutational energy, from lowest to highest, to determine their potential impact on binding.

Results

When sorting the results of the single mutations based on mutational energy, we showed only the five mutations with the highest and lowest energies. The mutations with the highest energies included GLY2, GLY1, PHE152, ARG153 and SER111 (for LAMP2B-GGGGS-iRGD) and GLY360, ARG2, GLY3, LYS62 and PRO7 (for LAMP2A-GGGGS-iRGD), which were identified as key amino acids crucial for protein interaction (Figure 2.10, Table 2).).

Figure 2.10 Mutation energy (Binding) analysis in single mutation of LAMP2B-GGGGS-iRGD(left) and LAMP2A-GGGGS-iRGD(right)

Note: Amino acid residues within 3 Å around the 4G1M were selected. Stabilizing mutation energy is less than -0.5 kcal/mol; neutral mutation energy is between -0.5 to 0.5 kcal/mol; destabilizing mutation energy is greater than 0.5 kcal/mol.

Table 2. Calculation of mutation energy (Binding) of 4G1M and LAMP2B-GGGGS-iRGD(left) and LAMP2A-GGGGS-iRGD(right)

Note: The table in the Summary section of the report lists up to ten mutations with the five lowest energy mutations and five highest energy mutations. The ranking of the mutations was sorted from lowest to highest energy mutations.

To gain deeper insight into how mutations can enhance protein affinity, we conducted saturation mutagenesis on the five amino acids identified earlier. The results (Figure 2.11, Table 3) indicated that the amino acid sites GLY3 and CYS1 are particularly important for the affinity of LAMP2B-GGGGS-iRGD and LAMP2A-GGGGS-iRGD, respectively. In particular, when GLY3 was mutated to CYS, PHE, TRP, LYS, or PRO, a significant improvement in the affinity of LAMP2B-GGGGS-iRGD was observed. Similarly, when CYS1 was mutated to GLY, ALA, CYS or SER, a significant improvement in the affinity of LAMP2A-GGGGS-iRGD was observed.

Figure 2.11 Mutation energy (Binding) analysis in single mutation of key amino acids in LAMP2B-GGGGS-iRGD(left) and LAMP2A-GGGGS-iRGD(right)

Table 3. Calculation of mutation energy (Binding) of key amino acids in 4G1M and LAMP2B-GGGGS-iRGD(left) and LAMP2A-GGGGS-iRGD(right)

Lentivirus transfection model

Background

Traditional approaches to studying lentiviral transfection have often relied on empirical methods and qualitative assessments. However, as the field advances, there is an increasing need for quantitative models that can accurately predict and describe the various stages of the transfection process. Mathematical modeling offers a powerful means to achieve this goal, providing insights into the intricate dynamics of viral-cell interactions and gene transfer efficiency.

Lentivirus Production

The production of lentiviral vectors begins with seeding of HEK293T packaging cells in a DMEM medium. A DNA-PEI mixture was prepared in OptiPro SFM (Serum free medium) and used to replace the culture medium, along with chloroquine, to enhance the transfection efficiency. After 18 hours, the medium is replaced with fresh DMEM or OptiPro SFM. The viral supernatant is harvested 48, 72, and 96 hours after transfection, followed by centrifugation and filtration to remove cellular debris. The purified viral particles are then aliquoted and stored at -80°C for future use.

Target Cell Infection and Stable Cell Line Generation

Before infection, the optimal dosage of antibiotics for selection is determined. Target cells are seeded in culture plates, and thawed lentiviral aliquots are added along with polybrene to enhance infection efficiency. Cells are incubated with the virus for 48-72 hours, after which the medium is replaced with fresh media containing the selection antibiotic. Selection pressure is maintained with regular changes in the medium and cell growth is closely monitored. Surviving cells are expanded and cell stocks are prepared for protein expression testing. For long-term stability, the generation of monoclonal lines may be considered.

Infection Process

The lentiviral infection process involves several key steps:

1.Cell Infection: Lentiviruses can infect a broad spectrum of both dividing and quiescent cells, making them versatile tools for gene delivery.

2.Viral Entry: The process begins with the binding of viral envelope proteins to specific cell surface receptors, facilitating entry into the host cell.

3.Reverse Transcription: Once inside the cell, the viral RNA genome is reverse transcribed into a complementary DNA (cDNA) copy by the viral reverse transcriptase enzyme.

4.Genomic Integration: The lentiviral cDNA is then stably integrated into the hostcell genome, a crucial step that allows for long-term expression of the exogenous gene.

5.Gene Expression:Following integration, the host cell's transcription and6translation machinery recognizes and processes the integrated cDNA, leading tothe expression of the exogenous gene in the host cell.

The mathematical model of the process

Key Parameters

Actual Data

1.Growth Rate of HEK293T Cells: Estimated at 0.03 to 0.04 per hour, with an average doubling time of approximately 30 hours.

2.Binding Rate Constant: High lentiviral titers (up to 108 IFU/ml) were achieved using a specialized cell line and packaging system.

3.Extra-Cellular Decay Rate Constant: Calculated using a half-life of 8-9 hours, yielding a decay rate constant between 0.077 and 0.0866 per hour.

4.Intracellular Decay Rate Constant: Assumed to be 0.0012 h⁻¹, reflecting stable intermediate formation in the cytoplasm.

5.Nuclear Import Rate Constant: Estimated for HEK293T cells at 0.029 - 0.15 per hour, influenced by various cellular factors.

6.Cytoplasmic Trafficking Time: Estimated at 11.2 hours, considering the onset of viral genome expression.

Equations

1.Extracellular transport breakdown and binding

It is assumed that there is no hindered diffusion or hindered convection, where the rate of change in the concentration of the active retroviral vector is Vm = Vm(t,z) where r is time, z is the height above the target cell, and Kde is the decay rate constant of the retroviral vector in the extracellular medium Dv, is their diffusion coefficient, and u is their velocity due to an external force. At the upper surface of the medium (z=h), the net flux of retroviral vectors must be zero, that is,

Spreading to the surface of the virus:

where Ct is the total target cell density on the culture surface, which is obtained as the sum of all the cell populations. A retrovirus that transfers to the vicinity of a cell can bind nonspecifically to the cell surface through a reversible physical bond until, by surface diffusion, it reaches a specific receptor where it binds irreversibly. Hence, Kb represents the overall rate constant for irreversible binding to the cell-surface receptor. Equation (3) assumes that there is a sufficiently high level of receptor expression in the cells such that, at least for the moderate values of viral titer considered here, the number of receptors is not limiting.

The time-dependent population balance of target cells that remain free of viruses is given by

where Cf is the concentration of virus-free cells on the culture surface, Cc the concentration of virus-carrying cells that have at least one virus in their cytoplasm, kdi the rate constant for intracellular degradation of retroviral vectors, δ1 the fraction of virus-carrying cells that become virus free cells due to degradation of the viral genome, δ2 the fraction of virus-carrying cells that become virus-free during cell division (since all the internalized viruses in cytoplasm may be retained by one daughter cell during division), and is growth rate of the target cells

2.Internalization and Intracellular Trafficking

Since the release of the viral core into the cytoplasm is not a rate-limiting step, the rate of change of the viral vector in the cytoplasm of the entire cell population is

The population balance for the corresponding target cells which carry at least one viral genome inside their cytoplasm:

Poisson distribution (multiple infections)

The fraction of virus-carrying cells that contain k vectors inside their cytoplasm is therefore calculated as:

The average number of viruses inside the viruscarrying cells, nv

Vf represents the concentration of non-integrated viruses inside the virus-carrying cells only

(Considering both the nuclear import of viral vectors ready for integration and the depletion of viral vectors retained within the cell.)

The parameter 1 represents the fraction of virus-carrying cells with only one virus inside their cytoplasm, from equation (7)

During the division of a cell that contains k viruses inside its cytoplasm, the probability that one of the daughter cells becomes a virus-free cell is given by

2(the probability of becoming virus-free):

3.Nuclear Import and Integration

Assuming that integration is not affected by the number of viral vectors imported into the nucleus, the rate of change in the concentration of viral genomes integrated into the cell chromosome, Vi, is given by

(The integrated retrovirus is replicated during subsequent cell cycles)

The corresponding population balance of target cells with at least one integrated virus in their chromosome, Ci, is

The total target cell concentration, Ct, is

4.Physical Properties of Retroviral Vectors

5.MOI

Calculate the Transduction Units at Infection. Lentiviral titer is measured as Transduction Units per ml (TU/ml). One TU produces one integration event in target cells. To calculate the viral titer, it is first necessary to determine the number of Transduction Units (TU) used to infect the cells. When the percentage of infected cells is below or below 20%, the number of integrations is approximately equal to the number of transduced cells. However, at higher transduction levels, the fraction of transduced cells with multiple integrations increases, so the percentage of transduced cells relative to integration events per cell is no longer linear. Using the chart below, the number of integrations per cell, or MOI, can be accurately estimated for cultures with up to 75% transduced cells (i.e., MOIs in the range of ~0.2-1.5).

Figure 3.1 Estimate the percentage of infected cells based on the number of infections (MOI)

Result discussion

Figure 3.2 Result of MATLAB

In the results generated by MATLAB, we observe the following:

1.Cell Concentrations Over Time:

Virus-free cells (Cf): The concentration remains relatively constant throughout the observed period, indicating minimal change in the population of cells not carrying the virus.

Virus-carrying cells (Cc): There is a significant increase in the concentration of virus-carrying cells over time, indicating successful infection and replication of the virus within the cells.

Cells with integrated virus (Ci): The concentration of cells with integrated virus shows a gradual increase, suggesting that integration into the host genome occurs over time, leading to stable expression.

2.Transfection Efficiency Over Time:

The transfection efficiency initially increases, reaching a peak around 3-4 hours, after which it starts to decline. This indicates an optimal window for achieving maximum transfection before efficiency decreases.

Under static conditions, HEK293T cells formed a monolayer with a maximum density between 5.65×10^5 and 1.05×10^6 cells/cm².

To modify the MATLAB code based on the provided protocol information, we need to make a few updates to ensure it aligns with the experimental setup:

1.Adjust the Parameters:

Growth rate of target cells (mu): Average around 0.033 per hour (based on the document).

Maximum target cell concentration (Ct_max): Use a value between 5.65×10^5 and 1.05×10^6 cells/cm², average 7.52×10^5 cells/cm². Nuclear import rate constant (Kn): Adjust to fit the provided range of 0.029 - 0.15 per hour, use the average for calculations.

2.Initial Conditions: Adjust initial viral vector concentration and cell concentrations based on your experiment setup if needed.

3.Time Period: Ensure the simulation time matches the experimental timeline (e.g., Day 0 to Day 3 = 72 hours).

Figure 3.3 Result of MATLAB(using actual experimental parameter from protocol）

In the results generated by MATLAB, it is observed that the concentration of virus-carrying cells (Cc) increases significantly over time, surpassing 10×10^5 cells/cm² by the 5-hour mark, while the concentration of virus-free cells (Cf) remains relatively constant, showing minimal change. Cells with integrated virus (Ci) also display a gradual increase, although at a much slower rate compared to those of virus-carrying cells. This suggests successful viral infection and eventual integration into the host genome over time. Furthermore, transfection efficiency peaks around 3-4 hours, reaching approximately 0.6%, before beginning to decline. Based on the provided protocol, this indicates that the optimal window for transfection lies between 3 and 4 hours post-infection. To enhance the experimental design, consideration should be given to adjusting the timing of the media replacement (Day 2-3 in the protocol) to coincide more precisely with this peak efficiency. Moreover, expanding polyclonal populations at this point may yield better results, as it could prevent the loss of cells with high transgene expression, which may occur because of the reduced growth rates observed in these cells over time. Furthermore, early generation of monoclonal lines could help maintain stable expression levels in the long term.

By providing a mathematical description of these processes, this model offers a valuable tool for researchers and clinicians working with lentiviral vectors. It can help optimize transfection protocols, predict gene transfer efficiencies, and guide the development of more effective gene delivery strategies for both research and therapeutic applications.

This comprehensive approach to modeling lentiviral transfection provides a foundation for understanding the complex interplay of factors that influence the success of gene transfer, ultimately contributing to the advancement of gene therapy and genetic engineering techniques.

Reference

Futagi, D., & Kitano, K. (2015). Ryanodine-receptor-driven intracellular calcium dynamics underlying spatial association of synaptic plasticity. Journal of Computational Neuroscience, 39(3), 329–347. https://doi.org/10.1007/s10827-015-0579-z

Han, J. M., & Periwal, V. (2019). A mathematical model of calcium dynamics: Obesity and mitochondria-associated ER membranes. PLOS Computational Biology, 15(8), e1006661. https://doi.org/10.1371/journal.pcbi.1006661

Han, J. M., Tanimura, A., Kirk, V., & Sneyd, J. (2017). A mathematical model of calcium dynamics in HSY cells. PLOS Computational Biology, 13(2), e1005275. https://doi.org/10.1371/journal.pcbi.1005275

Hodeify, R., Yu, F., Courjaret, R., Nader, N., Dib, M., Sun, L., Adap, E., Hubrack, S., & Machaca, K. (2018). Regulation and role of store-operated ca2+ entry in cellular proliferation. In J. A. Kozak & J. W. Putney (Eds.), Calcium Entry Channels in Non-Excitable Cells. CRC Press/Taylor & Francis. http://www.ncbi.nlm.nih.gov/books/NBK531436/

Mirzakhalili, E., Epureanu, B. I., & Gourgou, E. (2018). A mathematical and computational model of the calcium dynamics in Caenorhabditis elegans ASH sensory neuron. PLoS ONE, 13(7), e0201302. https://doi.org/10.1371/journal.pone.0201302

Rahmani, B., Jelbart, S., & Kirk, V. (2024). Understanding broad-spike oscillations in a model of intracellular calcium dynamics (arXiv:2401.16839). arXiv.https://doi.org/10.48550/arXiv.2401.16839

Sobie, E. A., Dilly, K. W., Dos Santos Cruz, J., Lederer, W. J., & Saleet Jafri, M. (2002). Termination of cardiac ca2+ sparks: An investigative mathematical model of calcium-induced calcium release. Biophysical Journal, 83(1), 59–78. https://doi.org/10.1016/S0006-3495(02)75149-7

Szopa, P., Dyzma, M., & Kaźmierczak, B. (2013). Membrane associated complexes in calcium dynamics modelling. Physical Biology, 10(3), 035004. https://doi.org/10.1088/1478-3975/10/3/035004

Tajada, S., & Villalobos, C. (2020). Calcium permeable channels in cancer hallmarks. Frontiers in Pharmacology, 11. https://doi.org/10.3389/fphar.2020.00968

Wacquier, B., Combettes, L., Van Nhieu, G. T., & Dupont, G. (2016). Interplay between intracellular ca2+ oscillations and ca2+-stimulated mitochondrial metabolism. Scientific Reports, 6, 19316. https://doi.org/10.1038/srep19316

Zheng, S., Wang, X., Zhao, D., Liu, H., & Hu, Y. (2023). Calcium homeostasis and cancer: Insights from endoplasmic reticulum-centered organelle communications. Trends in Cell Biology, 33(4), 312–323. https://doi.org/10.1016/j.tcb.2022.07.004

Zhou, A., Liu, X., Zhang, S., & Huo, B. (2020). Effects of store-operated and receptor-operated calcium channels on synchronization of calcium oscillations in astrocytes. Biosystems, 198, 104233. https://doi.org/10.1016/j.biosystems.2020.104233

rya, D. K., Deshpande, H., Kumar, A., Chidambaram, K., Pandey, P., Anjum, S., Deepak, P., Kumar, V., Kumar, S., Pandey, G., Srivastava, S., & Rajinikanth, P. S. (2024). Her-2 receptor and αvβ3 integrin dual-ligand surface-functionalized liposome for metastatic breast cancer therapy. Pharmaceutics, 16(9), 1128. https://doi.org/10.3390/pharmaceutics16091128

Li, N., Qiu, S., Fang, Y., Wu, J., & Li, Q. (2021). Comparison of linear vs. Cyclic rgd pentapeptide interactions with integrin αvβ3 by molecular dynamics simulations. Biology, 10(7), 688. Biology, 10(7), 688. https://doi.org/10.3390/biology10070688

Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., Tunyasuvunakool, K., Bates, R., Žídek, A., Potapenko, A., Bridgland, A., Meyer, C., Kohl, S. A. A., Ballard, A. J., Cowie, A., Romera-Paredes, B., Nikolov, S., Jain, R., Adler, J., … Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583–589.https://doi.org/10.1038/s41586-021-03819-2

Mirdita, M., Schütze, K., Moriwaki, Y., Heo, L., Ovchinnikov, S., & Steinegger, M. (2022). ColabFold: Making protein folding accessible to all. Nature Methods, 19(6), 679–682. https://doi.org/10.1038/s41592-022-01488-1

Tayi, V. S., Bowen, B. D., & Piret, J. M. (2009). Mathematical model of the rate-limiting steps for retrovirus-mediated gene transfer into mammalian cells. Biotechnology and Bioengineering, 104(3), 550-563. https://doi.org/10.1002/bit.22515

Lentiviral titer calculation—Packaging, titering, and transduction of lentiviral constructs—V10a. (n.d.). Cellecta, Inc. https://manuals.cellecta.com/lentiviral-construct-packaging-and-transduction/v10a/en/topic/lentiviral-titer-calculation

UM-MACAU

Calcium Dynamic Model

Protein Molecular Model

Lentivirus transfection model

Reference

Calcium Dynamic Model

Overview

Reviewing current articles/studies

Construction

Background

Key parameters and values

Equations

Result Discussion

Protein Molecular Model

Introduction

Structure Prediction

Background

Methodology

Results

Comparative Analysis

Molecular docking

Background

Methodology

Results

Developmental analysis of the proteins

Background

Methodology

Results

Redesign of protein molecules

Virtual amino acid mutations increase the binding affinity of protein

Background

Methodology

Results

Lentivirus transfection model

Background

Lentivirus Production

Target Cell Infection and Stable Cell Line Generation

Infection Process

The mathematical model of the process

Key Parameters

Actual Data

Equations

Result discussion

Reference