Model


Modeling


ClusterControl was developed by combining computational and structural methods to optimize gene cluster regulation. To improve predictive accuracy, we used evolutionary analysis, contact interface evaluation, structural analysis in PyMOL, and molecular dynamics simulations. Our aim was a tool that predicts functional rewiring points and supports practical applications.

Evolutionary Analysis

We started with an evolutionary analysis to identify critical regions within regulatory proteins that are conserved across species. This step was sequence-based: homologous sequences with 20-80% similarity were retrieved with a BLAST search against UniRef50 and aligned. The direct information matrix was then computed from this alignment to evaluate the correlation between residue positions, and noisy interactions were filtered out using a Z-score threshold of three standard deviations. The result was a scoring matrix highlighting positions that are highly conserved and essential for protein function. These regions served as a baseline for identifying potential sites for modular swapping with minimal disruption to the protein's overall behavior.
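The Z-score filtering step can be sketched roughly as follows. This is a minimal illustration, not the ClusterControl implementation: the function names `zscore_pairs` and `split_by_threshold` are hypothetical, the input is assumed to be a symmetric matrix of direct information scores given as nested lists, and Z-scores are assumed to be computed over all position pairs. Which side of the three-standard-deviation threshold is kept is left to the caller, since the text does not spell it out.

```python
from statistics import mean, pstdev


def zscore_pairs(di):
    """Return {(i, j): z} for the upper triangle of a symmetric score matrix.

    `di` is assumed to be a square list of lists of direct information
    scores; only pairs with i < j are used, so each pair is counted once.
    """
    n = len(di)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    values = [di[i][j] for i, j in pairs]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # All scores identical: no pair stands out.
        return {p: 0.0 for p in pairs}
    return {p: (v - mu) / sigma for p, v in zip(pairs, values)}


def split_by_threshold(z, threshold=3.0):
    """Split pairs into those within and those beyond |z| = threshold."""
    within = {p: s for p, s in z.items() if abs(s) <= threshold}
    beyond = {p: s for p, s in z.items() if abs(s) > threshold}
    return within, beyond
```

Returning both groups keeps the sketch neutral about whether the pairs beyond the threshold are treated as signal or as outliers to discard.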

Figure 1: ClusterControl workflow

Contact Interface Analysis

After narrowing down the ideal swapping points from the evolutionary analysis, we conducted contact interface analysis to determine the feasibility of rewiring these positions without affecting inter-domain interactions. This step involved calculating the Disruption Distance in Z-score (DDIZ) for each candidate swap site, a metric that quantifies how much the modified protein deviates from its original interaction profile. Lower DDIZ scores indicate minimal disruption and higher chances of retaining original functionality.

We then cross-referenced these scores with experimental fluorescence data, used as a proxy for function, paying special attention to sites where a low DDIZ score did not coincide with high fluorescence. These discrepancies showed that DDIZ alone does not fully predict whether a rewired regulator remains functional.
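A cross-referencing step of this kind could look roughly like the sketch below. It assumes each candidate site is a `(name, ddiz, fluorescence)` tuple and uses a plain Pearson correlation; the function names and both cutoffs are hypothetical and would need to be chosen from the actual data.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / sqrt(sxx * syy)


def flag_discrepancies(sites, ddiz_cutoff, fluor_cutoff):
    """Return names of sites with low DDIZ but low fluorescence.

    `sites` is a list of (name, ddiz, fluorescence) tuples. Both cutoffs
    are illustrative parameters, not values taken from the project.
    """
    return [name for name, ddiz, fluor in sites
            if ddiz <= ddiz_cutoff and fluor < fluor_cutoff]
```

The flagged sites are the ones a DDIZ-only ranking would have predicted to work, which makes them the natural candidates for closer structural inspection.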

Figure 2: Graph showing correlation between DDIZ scores and activity

Figure 3: Structural modelling of OmpR in complex with EnvZ, with functional and non-functional points marked

Structural Analysis with PyMOL

Once potential rewiring points were identified, we used PyMOL for detailed structural analysis. The goal of this analysis was to pinpoint structural differences between regulators that showed a strong correlation between DDIZ and fluorescence change and those that did not. This involved comparing the 3D configurations of both functional and non-functional constructs, focusing on the positioning of crossover points and changes in structural stability.

During the analysis, we examined the polarity of amino acids adjacent to the rewiring points and observed that functional structures were generally more hydrophilic around these regions. We also calculated the solvent-accessible surface area (SASA) at these sites to determine whether exposure to the surrounding environment played a role in functionality. However, a subsequent t-test revealed no statistically significant difference in SASA between functional and non-functional regulators, indicating that solvent exposure alone is not a determining factor for successful rewiring. Instead, the polarity of neighboring residues and their precise spatial arrangement appear to matter more for maintaining function.
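The SASA comparison can be illustrated with a two-sample t statistic. The text does not say which variant of the t-test was used, so the sketch below assumes Welch's unequal-variance form; `welch_t` is a hypothetical helper that returns the statistic and its approximate degrees of freedom, leaving the p-value to a statistics package.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    `a` and `b` are SASA values for the two groups (for example functional
    and non-functional constructs); each needs at least two observations.
    """
    na, nb = len(a), len(b)
    if na < 2 or nb < 2:
        raise ValueError("each group needs at least two values")
    # Squared standard errors of the two group means (sample variance / n).
    sa, sb = variance(a) / na, variance(b) / nb
    if sa + sb == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(a) - mean(b)) / sqrt(sa + sb)
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df
```

A small absolute value of `t` relative to the critical value for `df` degrees of freedom corresponds to the "no significant difference" outcome described above.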

Figure 4: Example of hydrophobic residues near the rewiring point (within 5 Angstrom) of OmpR-CcaR
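One way to put a number on the polarity of such a neighborhood is to average a hydropathy scale over the residues within 5 Angstrom of the rewiring point. The sketch below is only an assumption about how this could be done: it uses the Kyte-Doolittle scale (positive values are hydrophobic), represents each residue by a single coordinate such as its alpha carbon, and the function names are hypothetical.

```python
from math import dist

# Kyte-Doolittle hydropathy values; positive = hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def neighbors_within(residues, center, cutoff=5.0):
    """Return one-letter codes of residues within `cutoff` of `center`.

    `residues` is a list of (one_letter_code, (x, y, z)) tuples, with one
    representative coordinate per residue (an assumption of this sketch).
    """
    return [aa for aa, xyz in residues if dist(xyz, center) <= cutoff]


def mean_hydropathy(codes):
    """Average Kyte-Doolittle hydropathy of a list of one-letter codes."""
    if not codes:
        raise ValueError("no residues in the neighborhood")
    return sum(KYTE_DOOLITTLE[c] for c in codes) / len(codes)
```

A more negative mean indicates a more hydrophilic environment around the rewiring point.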

Molecular Dynamics Simulations

To further validate the stability of our designed regulators, we are currently running molecular dynamics simulations on the refined constructs. These simulations, performed using tools such as GROMACS, evaluate how the modified proteins behave under physiological conditions, accounting for thermal fluctuations and interactions with other molecules. The results will provide insights into the stability and dynamic behavior of the redesigned proteins, confirming whether the suggested rewiring points can maintain structural integrity and functionality over time.
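The text does not name a stability metric, so the following is only an illustrative assumption: one common readout from a trajectory is the root-mean-square deviation (RMSD) of each frame from a reference structure. GROMACS provides its own analysis tools for this (`gmx rms`); the sketch below just shows the quantity itself, and it assumes the frames have already been superimposed on the reference, since no fitting is performed here.

```python
from math import sqrt


def rmsd(frame, reference):
    """RMSD between two coordinate sets with matching atom order.

    Each argument is a list of (x, y, z) tuples. No superposition is done,
    so the frame is assumed to be pre-aligned to the reference.
    """
    if not frame or len(frame) != len(reference):
        raise ValueError("coordinate sets must be non-empty and equal in size")
    total = sum((a - b) ** 2
                for p, q in zip(frame, reference)
                for a, b in zip(p, q))
    return sqrt(total / len(frame))


def rmsd_series(trajectory, reference):
    """RMSD of every frame in a trajectory against the same reference."""
    return [rmsd(frame, reference) for frame in trajectory]
```

A series that levels off at a low value would be consistent with a construct that keeps its fold, whereas a steadily rising series would suggest structural drift.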

Software Development

All of these analyses were integrated into the ClusterControl software, which was built as a tool for regulatory design. Its interface allows researchers to input sequences, select from suggested rewiring points, and visualize the impact of these modifications on protein structure. The software is modular, so it can be updated as new data and methods become available. By combining these computational and structural techniques, ClusterControl offers a single platform for the precise control of gene clusters. This reduces the time researchers spend on optimization and supports more efficient development of versatile biosensors and synthetic regulation pathways.