/$$$$$$   /$$$$$$  /$$$$$$$$ /$$$$$$$$ /$$      /$$  /$$$$$$  /$$$$$$$  /$$$$$$$$
              /$$__  $$ /$$__  $$| $$_____/|__  $$__/| $$  /$ | $$ /$$__  $$| $$__  $$| $$_____/
            | $$  \__/| $$  \ $$| $$         | $$   | $$ /$$$| $$| $$  \ $$| $$  \ $$| $$      
            |  $$$$$$ | $$  | $$| $$$$$      | $$   | $$/$$ $$ $$| $$$$$$$$| $$$$$$$/| $$$$$   
              \____  $$| $$  | $$| $$__/      | $$   | $$$$_  $$$$| $$__  $$| $$__  $$| $$__/   
              /$$  \ $$| $$  | $$| $$         | $$   | $$$/ \  $$$| $$  | $$| $$  \ $$| $$      
            |  $$$$$$/|  $$$$$$/| $$         | $$   | $$/   \  $$| $$  | $$| $$  | $$| $$$$$$$$
              \______/  \______/ |__/         |__/   |__/     \__/|__/  |__/|__/  |__/|________/

GPMSS

GPMSS stands for Glutathione (GSH) Production and Membrane System Simulation.

This code simulates the dynamic transport and production of Glutathione (GSH) between two containers connected by membranes with selective permeability. The model captures the kinetic changes in GSH concentration over time as a function of production rates and the flux determined by membrane pore sizes. The system is designed for biochemical research and can help optimize the separation and production processes for small biomolecules such as GSH.

Features

Two-Container System: Simulates GSH production in Container A, with flux to Container B through a membrane.
Selective Permeability: Models a membrane with two different pore sizes that selectively transports GSH and smaller molecules.
Mathematical Model: Includes differential equations that govern the concentration changes in both containers over time, accounting for membrane fouling and selective flux.
Real-Time Monitoring: Code can be expanded to integrate real-time sensors to monitor GSH concentration, adjust flow rates, and optimize system performance.
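
As a rough illustration of the two-container idea, the sketch below integrates a pair of differential equations with SciPy. It is not the notebook's actual model: the rate law, the exponential fouling term, the equal container volumes, and the names `two_container_rhs`, `simulate`, `k_prod`, `perm`, and `k_foul` are all assumptions made for this example.

```python
import numpy as np
from scipy.integrate import solve_ivp

def two_container_rhs(t, y, k_prod, perm, k_foul):
    """Right-hand side for a hypothetical two-container GSH model."""
    c_a, c_b = y
    # Assumption: permeability decays exponentially as the membrane fouls,
    # and flux is proportional to the concentration difference.
    flux = perm * np.exp(-k_foul * t) * (c_a - c_b)
    # Container A gains GSH by production and loses it by flux;
    # container B (same volume, by assumption) gains what A loses.
    return [k_prod - flux, flux]

def simulate(k_prod=1.0, perm=0.5, k_foul=0.05, t_end=20.0):
    """Integrate from empty containers and return (t, C_A, C_B)."""
    t_eval = np.linspace(0.0, t_end, 201)
    sol = solve_ivp(two_container_rhs, (0.0, t_end), [0.0, 0.0],
                    args=(k_prod, perm, k_foul), t_eval=t_eval)
    return sol.t, sol.y[0], sol.y[1]
```

Calling `simulate()` returns arrays that can be passed directly to Matplotlib. In this sketch the total amount of GSH grows linearly with time, because production is the only source and the membrane only moves material between containers.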

Dependencies

To run this code, you will need:

Python 3.x
Jupyter Notebook (for running .ipynb files)
NumPy: For numerical operations
SciPy: For differential equation solving
Matplotlib: For visualization of results
SymPy: For symbolic mathematics (if symbolic computation is used)

Install the required dependencies using pip:

pip install numpy scipy matplotlib sympy

How to Run

Clone the repository or download the code.
Open the Jupyter notebook file (.ipynb) in your preferred environment (e.g., JupyterLab, Google Colab, or locally).
Run all cells in the notebook to execute the simulation.

To launch the notebook from the command line instead, run:

jupyter notebook <filename>.ipynb

GSH Biosynthesis Simulation

The ssa_algorithm_biosynthesis_gsh.py Python script for GSH_Biosynthesis_Simulation simulates the biosynthesis of Glutathione (GSH) using Gillespie's Stochastic Simulation Algorithm (SSA) and the Reaction Rate Equations (RRE). The pathway consists of two chemical reactions:

1. L-Glu + Cys → γ-GC

2. γ-GC + Gly → GSH

The algorithm simulates the stochastic nature of the chemical reactions, modeling the time evolution of the chemical species. Additionally, the system is validated using an Ordinary Differential Equation (ODE) solver for a deterministic comparison.
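
A minimal sketch of Gillespie's direct method for the two reactions above is shown here. It is not taken from ssa_algorithm_biosynthesis_gsh.py: the mass-action propensities, the rate constants `c1` and `c2`, the species ordering, and the function names are assumptions chosen for illustration.

```python
import numpy as np

# Species order (assumed): [L-Glu, Cys, gamma-GC, Gly, GSH]
STOICH = np.array([
    [-1, -1, +1,  0,  0],   # R1: L-Glu + Cys -> gamma-GC
    [ 0,  0, -1, -1, +1],   # R2: gamma-GC + Gly -> GSH
])

def propensities(x, c1, c2):
    """Mass-action propensities for R1 and R2 (an assumption of this sketch)."""
    return np.array([c1 * x[0] * x[1], c2 * x[2] * x[3]])

def gillespie(x0, c1, c2, t_end, rng):
    """Simulate one trajectory; returns event times and molecule counts."""
    t, x = 0.0, np.array(x0, dtype=int)
    times, states = [t], [x.copy()]
    while t < t_end:
        a = propensities(x, c1, c2)
        a_total = a.sum()
        if a_total == 0:                        # no reaction can fire any more
            break
        t += rng.exponential(1.0 / a_total)     # waiting time to the next event
        j = rng.choice(len(a), p=a / a_total)   # which reaction fires
        x = x + STOICH[j]
        times.append(t)
        states.append(x.copy())
    return np.array(times), np.array(states)
```

For example, `gillespie([50, 50, 0, 50, 0], 0.01, 0.01, 1e6, np.random.default_rng(0))` runs until every L-Glu molecule has been converted to GSH. Note that the last recorded event may fall slightly after `t_end`, since the loop only checks the time before drawing the next waiting time.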

Features

Stochastic Simulation: Utilizes Gillespie's algorithm to simulate random reaction times and events.
ODE Solver Comparison: Provides a deterministic solution for comparison using "scipy.integrate.odeint".
Dynamic Visualization: Plots the abundance of each chemical species over time, comparing the stochastic simulation and the ODE solution.
Multiple Reaction Support: Handles complex multi-step reactions between chemical species.
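
The deterministic comparison can be sketched with `scipy.integrate.odeint` as follows. The mass-action rate laws, the rate constants `k1` and `k2`, and the function names are assumptions for this example and may differ from what the script uses.

```python
import numpy as np
from scipy.integrate import odeint

def rre(y, t, k1, k2):
    """Reaction rate equations for the two-step pathway (mass action assumed)."""
    glu, cys, ggc, gly, gsh = y
    r1 = k1 * glu * cys    # L-Glu + Cys -> gamma-GC
    r2 = k2 * ggc * gly    # gamma-GC + Gly -> GSH
    return [-r1, -r1, r1 - r2, -r2, r2]

def solve_rre(y0, k1, k2, t_end, n_points=200):
    """Return the time grid and an (n_points, 5) array of concentrations."""
    t = np.linspace(0.0, t_end, n_points)
    return t, odeint(rre, y0, t, args=(k1, k2))
```

Plotting the columns of the returned array on top of one or more stochastic trajectories gives the kind of comparison described in the feature list.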

Dependencies

To run this code, you will need:

Python 3.x
NumPy: For numerical operations
SciPy: For differential equation solving
Matplotlib: For visualization of results

Install the required dependencies using pip:

pip install numpy scipy matplotlib

How to Run

Clone the repository or download the code.
Run the python script using an IDE.

Investigating thermostability of gshF enzyme using computational methods

1. Calculate_B_factors

The calculate_B_factors.py script is a python script designed to compute and analyze the average B-factors for protein loops based on structural data from Protein Data Bank (PDB) files. B-factors provide insights into the atomic displacement or flexibility in protein structures. By focusing on specific loops in the protein, researchers can gain a deeper understanding of the flexibility or rigidity in those regions, which can be critical for functional and stability studies of proteins. This tool automates the process of parsing PDB files, extracting residue-level B-factors, and calculating the average B-factor for protein loops defined in a CSV file. It then outputs the average B-factors for each loop in another CSV file, which can be used for further analysis, visualization, or integration with other structural bioinformatics tools.

Features

PDB File Parsing: Parses PDB files to read the structural data.
Residue-Level Analysis: Extracts B-factors at the residue level.
Loop Definition from CSV: Reads the loops to analyze from a CSV file.
Loop B-Factor Calculation: Calculates the average B-factor for each defined loop.
CSV Output: Writes the average B-factor of each loop to a CSV file for further analysis, visualization, or integration with other structural bioinformatics tools.
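
The core of the calculation described above can be sketched as follows. This is an illustrative version, not the code in calculate_B_factors.py: it reads the fixed-width ATOM columns of the PDB format, averages atoms within each residue first and then residues within each loop (an assumption), and uses hypothetical function names.

```python
from collections import defaultdict
from statistics import mean

def residue_b_factors(pdb_lines):
    """Map (chain, residue number) -> mean B-factor over that residue's atoms."""
    per_residue = defaultdict(list)
    for line in pdb_lines:
        if line.startswith("ATOM"):
            chain = line[21]                 # column 22: chain identifier
            resseq = int(line[22:26])        # columns 23-26: residue number
            b_factor = float(line[60:66])    # columns 61-66: temperature factor
            per_residue[(chain, resseq)].append(b_factor)
    return {key: mean(values) for key, values in per_residue.items()}

def loop_average(res_b, chain, start, end):
    """Average the residue-level B-factors of one loop; None if the loop is empty."""
    values = [b for (ch, num), b in res_b.items()
              if ch == chain and start <= num <= end]
    return mean(values) if values else None
```

With a file on disk, `residue_b_factors(open("structure.pdb"))` would produce the residue-level table (the file name here is a placeholder), and `loop_average` can then be called once per row of the loop-definition CSV.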

Dependencies

To run this code, you will need:

Python 3.x
The PDB file, placed in the same folder as the script.

How to Run

Clone the repository or download the code.
Run the python script using an IDE.

2. Calculate Loop Depth

The calculate_loopdepth.py script is a python script designed to compute the estimated depth of protein loops relative to the overall center of mass of the protein, based on structural data from Protein Data Bank (PDB) files. The depth of loops in a protein can offer insight into their potential roles in protein dynamics, function, and interactions with other molecules. By identifying the relative position of these loops with respect to the protein's center of mass, this script provides valuable information for structural biologists and bioinformaticians. This tool automates the process of parsing PDB files to extract atomic coordinates, computing the protein's center of mass, and calculating the depth of specified loops. The results are output into a CSV file for easy interpretation and further analysis.

Features

PDB File Parsing: Extracts atomic coordinates from standard PDB files, handling both ATOM and HETATM records.
Center of Mass Calculation: Computes the center of mass of the protein using the atomic coordinates.
Loop Definition from CSV: Reads loop information (start and end residues) from a CSV file, allowing users to define which loops to analyze.
Loop Depth Calculation: Estimates the depth of each loop by calculating the Euclidean distance between the loop's center and the overall center of mass.
CSV Output: Saves the computed loop depths into a CSV file for further analysis, plotting, or integration with other tools.
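
A compact sketch of the depth calculation is given below. It is not the code in calculate_loopdepth.py: it treats all atoms as having equal mass (so the "center of mass" is the geometric centroid), ignores chain identifiers, and uses hypothetical function names.

```python
import numpy as np

def parse_coords(pdb_lines):
    """Return (residue numbers, N x 3 coordinates) from ATOM/HETATM records."""
    resnums, coords = [], []
    for line in pdb_lines:
        if line.startswith(("ATOM", "HETATM")):
            resnums.append(int(line[22:26]))       # columns 23-26: residue number
            coords.append([float(line[30:38]),     # columns 31-38: x
                           float(line[38:46]),     # columns 39-46: y
                           float(line[46:54])])    # columns 47-54: z
    return np.array(resnums), np.array(coords)

def loop_depth(resnums, coords, start, end):
    """Euclidean distance between the loop's center and the overall center."""
    center = coords.mean(axis=0)    # assumption: equal atomic masses
    mask = (resnums >= start) & (resnums <= end)
    if not mask.any():
        raise ValueError("no atoms found in the requested residue range")
    loop_center = coords[mask].mean(axis=0)
    return float(np.linalg.norm(loop_center - center))
```

Under this definition a larger value means the loop's center lies farther from the center of the protein. Weighting each atom by its element mass would require reading the element column as well, which this sketch leaves out.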

Dependencies

To run this code, you will need:

Python 3.x
NumPy: For numerical operations

How to Run

Clone the repository or download the code.
Run the python script using an IDE.

3. Identify Loops

The identify_loops.py script uses the output of the STRIDE program, which provides the secondary structure assignments for the gshF enzyme.

Features

Reading the STRIDE File: The script reads a specified STRIDE output file that contains lines of text representing different secondary structure assignments. Each line includes details such as the type of secondary structure, the residue range, and the associated chain.
Storing Secondary Structure Ranges: It extracts and stores the start and end residues of each secondary structure element for each protein chain in a dictionary. This enables easy access and manipulation of the structure data.
Identifying Loop Regions: The script identifies loop regions by analyzing gaps between the end of one secondary structure element and the start of the next. For each chain, it checks the sorted ranges of secondary structures and determines if there are any gaps, which are defined as potential loop regions.
Handling Open-ended Loops: In addition to identifying closed loops (those with defined start and end residues), the script also accounts for open-ended loops, which may extend to the end of the protein chain.
Output: Finally, the script prints the identified loop regions to the console, providing a clear overview of where these loops occur within the protein structure.
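
The gap-finding step can be sketched as below. This example starts from secondary structure ranges that have already been extracted, so it does not parse the STRIDE file itself. The function name `find_loops` and the optional `chain_ends` argument (the last residue number of each chain) are hypothetical and not taken from identify_loops.py.

```python
def find_loops(ranges_by_chain, chain_ends=None):
    """Find gaps between consecutive secondary structure elements.

    ranges_by_chain: {chain: [(start, end), ...]} for each structured element.
    chain_ends: optional {chain: last residue number}, used for open-ended loops.
    """
    loops = {}
    for chain, ranges in ranges_by_chain.items():
        ordered = sorted(ranges)
        found = []
        # Compare the end of each element with the start of the next one.
        for (_, prev_end), (next_start, _) in zip(ordered, ordered[1:]):
            if next_start - prev_end > 1:
                found.append((prev_end + 1, next_start - 1))
        # Open-ended loop: residues after the last element, if the chain end is known.
        if chain_ends and ordered and chain_ends.get(chain, 0) > ordered[-1][1]:
            found.append((ordered[-1][1] + 1, chain_ends[chain]))
        loops[chain] = found
    return loops

if __name__ == "__main__":
    example = {"A": [(20, 30), (5, 10), (11, 15)]}
    print(find_loops(example, chain_ends={"A": 40}))
```

Sorting the ranges first means the elements can be supplied in any order, which matters if the input lists helices and strands in separate groups.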

Dependencies

To run this code, you will need:

Python 3.x
The STRIDE output file, placed in the same folder as the script.

How to Run

Clone the repository or download the code.
Run the python script using an IDE.

4. Predict DDG

The predict_DDG.py script computes the ΔΔG (change in Gibbs free energy) for all possible single-point mutations in a given protein structure. The script uses PyRosetta, a Python-based interface for the Rosetta molecular modeling suite, to perform energy calculations before and after introducing mutations. The results are saved in an Excel file, providing a convenient way to analyze the effects of mutations on protein stability. Unfortunately, because of limited access to PyRosetta, we could not run predict_DDG.py to generate real data.

Features

Automated ΔΔG Calculation: The script first calculates the energy of the wild-type protein using PyRosetta's full-atom scoring function.
Mutation Introduction: At every residue position, the script substitutes each of the 20 standard amino acids in turn, skipping the wild-type amino acid (19 mutants per position).
Mutant Energy Calculation: After mutating the residue, the script calculates the energy of the mutant protein structure.
ΔΔG Calculation: The difference in energy between the mutant and the wild-type structure is computed (ΔΔG = Mutant Energy - Wild-Type Energy).
Results Storage: The ΔΔG values, along with the mutation details, are stored in a Pandas DataFrame and then saved into an Excel file for further analysis.
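
The enumeration and ΔΔG bookkeeping can be illustrated without PyRosetta, as in the sketch below. It works on a plain sequence string as a stand-in for a structure, and `energy_fn` is a hypothetical placeholder for whatever scoring function is used. None of these names come from predict_DDG.py, and no PyRosetta calls are shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids

def single_point_mutations(sequence):
    """Yield (position, wild_type, mutant) for every single substitution."""
    for pos, wt in enumerate(sequence, start=1):
        for aa in AMINO_ACIDS:
            if aa != wt:                 # skip the wild-type residue
                yield pos, wt, aa

def ddg_table(sequence, energy_fn):
    """Compute ddG = E(mutant) - E(wild type) for each single-point mutation."""
    wt_energy = energy_fn(sequence)
    rows = []
    for pos, wt, aa in single_point_mutations(sequence):
        mutant = sequence[:pos - 1] + aa + sequence[pos:]
        rows.append({"mutation": f"{wt}{pos}{aa}",
                     "ddG": energy_fn(mutant) - wt_energy})
    return rows
```

The returned list of dictionaries can be passed to `pandas.DataFrame(rows)` and written out with `DataFrame.to_excel`, which matches the storage step described above. A sequence of length N yields 19 × N rows.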

Dependencies

To run this code, you will need:

Python 3.x
pandas library
openpyxl library
PyRosetta

Install the required dependencies using pip:

pip install pandas openpyxl pyrosetta

How to Run

Clone the repository or download the code.
Run the python script using an IDE.

Contributing to e-PHAESTUS

We welcome contributions from the community that can help improve e-PHAESTUS and enhance its utility for researchers and scientists working on biomanufacturing and bioremediation, particularly in addressing e-waste through bioleaching. If you're interested in contributing, please follow these steps to get started:

Fork the Repository: Start by forking the e-PHAESTUS repository to your GitLab account.
Clone the Forked Repository: Clone the forked repository to your local machine using the following command (substitute the URL of your own fork for the project URL shown):
```
git clone https://gitlab.igem.org/2024/software-tools/athens.git
```
Create a New Branch: Create a new branch for your contribution. Choose a descriptive name that reflects the nature of your contribution.
```
git checkout -b feature/new-feature
```
Make Changes: Make the necessary changes and improvements to the codebase, focusing on enhancing Glutathione production in E. coli or optimizing bioleaching models.
Test Your Changes: Before submitting a merge request, ensure that your changes are thoroughly tested and do not introduce any issues to the system.

Commit and Push: Commit your changes and push them to your forked repository.

```
git commit -m "Add your commit message here"
git push origin feature/new-feature
```

Submit a Merge Request: Go to the e-PHAESTUS repository on GitLab and open a new merge request (GitLab's equivalent of a pull request). Choose your branch and provide a clear description of your changes and how they align with our project goals.
Code Review: Your merge request will be reviewed by the maintainers. Be prepared to address any feedback or suggestions.

Authors and Acknowledgments

The software tools developed for e-PHAESTUS are the result of the hard work and dedication of the Dry Lab members of the iGEM Athens 2024 team. Their efforts have been vital in building the computational framework that supports the biomanufacturing and bioleaching processes central to the project. We express our sincere gratitude to these team members for their invaluable contributions.

Core Development Team

The iGEM Athens 2024 Dry Lab team has worked collaboratively to develop the models and simulations used in e-PHAESTUS. These tools enable the optimization of Glutathione (GSH) production through synthetic biology, supporting the bioleaching of valuable metals from e-waste. Their work has not only contributed to the success of e-PHAESTUS but has also laid the foundation for future advancements in bioproduction and bioremediation technologies.

We also recognize the contributions of previous research and software tools that have provided critical insights and functionalities to our work. The following resources have greatly influenced our development process:

Lu, S.C. (2013). "Glutathione Synthesis." Biochimica et Biophysica Acta, 1830(5), 3143-3153. DOI:10.1016/j.bbagen.2012.09.008. This paper provided a comprehensive understanding of Glutathione synthesis, which was pivotal to our biological modeling.
AutoDock Vina: For molecular docking studies that supported the prediction of GSH binding to metal ions during bioleaching.
RDKit: Used for cheminformatics tasks, including molecular manipulation and analysis.

We also extend our thanks to the maintainers of the following software tools, which have been instrumental in our modeling and simulation efforts:

NumPy: Essential for the numerical computations underlying our models.
SciPy: Used for solving differential equations within our system.
Matplotlib: For visualizing the dynamic changes in GSH concentration and the bioleaching process.

Future Contributions

We welcome contributions from the community to further improve the tools and simulations used in e-PHAESTUS. Whether you are a biologist, chemist, or software developer, your contributions can help drive the project forward. Please refer to the “Contributing” section for detailed instructions on how to get involved.