Project Description

Integrated Modelling of Protein Complexes Via Single-shot Registration using DREAM

Graphical Abstract

Background

The Integrative Modelling Platform (IMP) provides a computational approach designed to model the structure of macromolecular assemblies. Using Bayesian inference, IMP models biomolecular systems ranging from small peptides to large macromolecular complexes by integrating data from experiments, statistical analyses, physical principles, and prior models.

IMP frames the construction of structural models as a computational optimization problem, where information about the assembly is encoded into a scoring function that evaluates candidate models. These scoring functions comprise terms known as restraints, which measure how well a model aligns with the information from which the restraint was derived. The restraints incorporate both general structural knowledge and specific details about the target structure. Consequently, a candidate model that scores well is consistent with all available information. The precision and accuracy of the resulting model improve with the amount and quality of information encoded in the restraints. Integrative modelling facilitates the incorporation of new and varied information, lowering the barrier for using incremental data that is typically not applied to structural characterization. Even when individual data types are relatively uninformative, the integration of multiple types can provide a comprehensive and accurate picture of an assembly. This approach often results in more precise and complete models than those based on single data sources (1).
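The idea of a scoring function built from restraints can be illustrated with a minimal sketch. This is not IMP's API: the bead names, the harmonic and excluded-volume forms, and all numeric values below are assumptions chosen only to show how data-derived terms and general structural knowledge combine into one score.

```python
import math

# Hypothetical toy model: bead name -> (x, y, z) coordinates in angstroms.

def distance_restraint(a, b, target, sigma):
    """Harmonic term scoring how far the a-b distance is from a target distance."""
    def score(model):
        d = math.dist(model[a], model[b])
        return ((d - target) / sigma) ** 2
    return score

def excluded_volume_restraint(radius):
    """Penalise any pair of beads closer than 2 * radius (general structural knowledge)."""
    def score(model):
        names = sorted(model)
        penalty = 0.0
        for i, p in enumerate(names):
            for q in names[i + 1:]:
                overlap = 2 * radius - math.dist(model[p], model[q])
                if overlap > 0:
                    penalty += overlap ** 2
        return penalty
    return score

def total_score(model, restraints):
    """Sum all restraint terms; a lower score means better agreement with the information."""
    return sum(r(model) for r in restraints)

# Assumed restraints: two distance measurements plus a steric term.
restraints = [
    distance_restraint("A", "B", target=10.0, sigma=1.0),
    distance_restraint("B", "C", target=10.0, sigma=1.0),
    excluded_volume_restraint(radius=3.0),
]

good = {"A": (0.0, 0.0, 0.0), "B": (10.0, 0.0, 0.0), "C": (20.0, 0.0, 0.0)}
bad = {"A": (0.0, 0.0, 0.0), "B": (4.0, 0.0, 0.0), "C": (20.0, 0.0, 0.0)}

print(total_score(good, restraints))  # satisfies every restraint
print(total_score(bad, restraints))   # violates both distances and overlaps A with B
```

Adding a new data type in this picture amounts to appending one more term to the list, which is why incremental information is cheap to incorporate.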

Additionally, IMP offers a framework for both static and dynamic modelling, enhancing its utility across a range of biomolecular systems.

Approach

IMP is a powerful tool for modelling macromolecular assemblies, but several challenges must be addressed to enhance its accuracy and extend the scope of its applications. These challenges include optimizing model representation, expanding the variety of computable models, and incorporating diverse types of data (2). There is also a need for improved methods to score and sample models and to analyze and interpret the results. Currently, IMP employs Markov Chain Monte Carlo (MCMC) sampling to explore the space of possible models, which can be computationally expensive and slow. To address this, we propose developing IMPROViSeD, an IMP-based software tool that models the structure of macromolecular assemblies using a bottom-up approach.
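To make the cost of iterative sampling concrete, the following is a minimal Metropolis MCMC sketch on a toy two-bead system. It does not use IMP's samplers: the scoring function, step size, temperature, and step count are all assumed values for illustration.

```python
import math
import random

def score(model):
    """Toy scoring function: one harmonic distance restraint between two beads."""
    target = 10.0  # assumed target distance in angstroms
    return (math.dist(model[0], model[1]) - target) ** 2

def metropolis(initial, score_fn, steps=5000, step_size=0.5, temperature=1.0, seed=0):
    """Explore model space one small random move at a time and keep the best model seen."""
    rng = random.Random(seed)
    current = [list(p) for p in initial]
    current_score = score_fn(current)
    best, best_score = [p[:] for p in current], current_score
    for _ in range(steps):
        # Propose a move: perturb the coordinates of one randomly chosen bead.
        proposal = [p[:] for p in current]
        bead = rng.randrange(len(proposal))
        proposal[bead] = [c + rng.gauss(0.0, step_size) for c in proposal[bead]]
        proposal_score = score_fn(proposal)
        # Metropolis criterion: always accept improvements, sometimes accept worse moves.
        delta = proposal_score - current_score
        if delta <= 0 or rng.random() < math.exp(-delta / temperature):
            current, current_score = proposal, proposal_score
            if current_score < best_score:
                best, best_score = [p[:] for p in current], current_score
    return best, best_score

start = [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
best_model, best_score = metropolis(start, score)
print(score(start), best_score)
```

Every step requires a fresh evaluation of the full scoring function, and thousands of steps are spent here on a system with only six coordinates, which hints at why this style of sampling becomes expensive for large assemblies.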

Our approach will utilize the Distance Restraint and Energy Assisted Modelling (DREAM) algorithm, a novel method for modelling the structure of macromolecular assemblies (4). The algorithm follows a bottom-up strategy: it first builds smaller substructures for regions with a high concentration of experimental data, then consolidates them before modelling the rest of the protein complex. Because the data-rich regions are modelled first, the final models are expected to comply more closely with the experimental data. The approach is also faster and more scalable, since the complex is assembled in a single, parallel-processed step rather than through conventional iterative assembly.
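The consolidation of separately built substructures can be pictured as a rigid registration problem. The sketch below is a 2D toy under stated assumptions, not the published DREAM procedure: it assumes two fragments share some beads, solves for the rotation and translation in closed form (one step, no iteration), and ignores reflections. All names and coordinates are hypothetical.

```python
import math

def centroid(points):
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)

def fit_rigid_2d(source, target):
    """Closed-form least-squares rotation and translation mapping source onto target."""
    (sx, sy), (tx, ty) = centroid(source), centroid(target)
    dot = cross = 0.0
    for (x, y), (u, v) in zip(source, target):
        x, y, u, v = x - sx, y - sy, u - tx, v - ty
        dot += x * u + y * v
        cross += x * v - y * u
    theta = math.atan2(cross, dot)  # optimal rotation angle for centred point sets
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated source centroid onto the target centroid.
    return theta, (tx - (c * sx - s * sy), ty - (s * sx + c * sy))

def apply_rigid_2d(points, theta, shift):
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + shift[0], s * x + c * y + shift[1]) for x, y in points]

# Fragment A is already in the global frame; fragment B was built in its own local frame.
fragment_a = {"r1": (0.0, 0.0), "r2": (1.0, 0.0), "r3": (2.0, 0.0), "r4": (3.0, 0.0)}
fragment_b = {"r3": (0.0, 0.0), "r4": (0.0, 1.0), "r5": (-1.0, 1.0)}

# Register B onto A using the beads present in both fragments.
shared = sorted(set(fragment_a) & set(fragment_b))
theta, shift = fit_rigid_2d([fragment_b[k] for k in shared],
                            [fragment_a[k] for k in shared])
placed_b = dict(zip(fragment_b, apply_rigid_2d(list(fragment_b.values()), theta, shift)))

# Merge: fragment A keeps its own coordinates for the shared beads.
assembly = {**placed_b, **fragment_a}
print(assembly["r5"])
```

Because each fragment's transform depends only on its own shared beads, fits of this kind are independent of one another and could in principle be computed in parallel.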

Motivation

Proteins are the driving forces behind cellular processes, participating in various biological functions such as signal transduction, gene regulation, and cell division. They do not act in isolation but collaborate with other proteins to form macromolecular assemblies. These assemblies are crucial for the cell’s proper functioning, performing complex tasks that no single protein can accomplish alone. Understanding the structure and dynamics of these assemblies is key to comprehending biological systems.

However, the difficulty of determining the structure of macromolecular assemblies has been a significant obstacle in structural biology. Initially, NMR structure determination problems were addressed using distance geometry, but its lack of scalability limited its utility (5). This led to a focus on molecular dynamics, MCMC, and other methods, which are computationally intensive. Modelling the structure and dynamics of macromolecular assemblies can provide insights into the workings, evolution, control, and design of biological systems. Our iGEM project aims to use the Integrative Modelling Platform (IMP) with the DREAM algorithm to build models of macromolecular assemblies using a bottom-up approach, overcoming existing bottlenecks.

References

  1. Russel D, Lasker K, Webb B, Velázquez-Muriel J, Tjioe E, et al. (2012) Putting the Pieces Together: Integrative Modelling Platform Software for Structure Determination of Macromolecular Assemblies. PLoS Biology 10(1): e1001244.
  2. Andrej Sali, From integrative structural biology to cell biology, Journal of Biological Chemistry, Volume 296, 2021.
  3. Arvindekar, Shreyas, Kartik Majila, and Shruthi Viswanath. "Recent methods from statistical inference and machine learning to improve integrative modelling of macromolecular assemblies." arXiv preprint arXiv:2401.17894 (2024).
  4. Das, Niladri Ranjan, et al. "DREAMweb: An online tool for graph-based modelling of NMR protein structure." Proteomics, e2300379. 17 Apr. 2024.
  5. Das NR, Chaudhury KN, Pal D. Improved NMR-data-compliant protein structure modelling captures context-dependent variations and expands the scope of functional inference. Proteins. 2023; 91(3): 412-435.