Model

Get docking site


First, molecular docking and molecular dynamics simulations of enzyme and substrate were performed on the wild type of both enzymes to identify the docking site, i.e. the amino acid residues of the enzyme that interact with the substrate.

The following specific operations were performed using GROMACS:

1. Ligand preparation

When preparing the ligand topology file, GAFF (General AMBER Force Field) is used to assign the atom types. By classifying atoms in detail according to their chemical environment, GAFF atom types let the force field describe intramolecular and intermolecular interactions more accurately, which also simplifies the subsequent processing of the receptor file.

2. Receptor preparation

The AMBER99SB force field is selected when preparing the receptor topology file. AMBER99SB is widely used in protein structure simulation, dynamics analysis, and free energy calculation, and it performs particularly well in simulating protein folding and conformational changes. In addition, AMBER99SB is compatible with the GAFF parameters used for the small-molecule ligand, so receptor and ligand can be handled consistently in the subsequent simulation.

3. Simulation environment parameter settings

a) Building a simulation box

A cubic simulation box is built, with the minimum distance between the molecule and the edge of the box set to 1.0 nm and the protein receptor molecule placed in the center of the box.

b) Adding solvent and ions

The solvent is the SPC (Simple Point Charge) water model, a classic water model for molecular dynamics simulations. SPC treats the water molecule as a rigid triangular structure with an O-H bond length of 1.0 Å and an H-O-H bond angle of 109.47 degrees. The model does not consider molecular polarization, which makes it computationally efficient and suitable for simulating biomacromolecules in aqueous environments.

Na+ and Cl- ions are added to simulate a salt solution environment, with enough ions added automatically to neutralize the total charge of the system.
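The box, solvent, and ion steps above can be sketched as a short sequence of GROMACS commands. The snippet below only assembles the command lines; it does not run them. All file names (`protein.gro`, `topol.top`, `ions.mdp`, and the intermediate outputs) are hypothetical placeholders, and the exact commands used in this project are not stated in the text, so this is an illustrative sketch under those assumptions.

```python
def build_setup_commands(protein_gro="protein.gro", topology="topol.top"):
    """Return GROMACS command lines for box building, solvation and ion addition.

    File names are hypothetical placeholders, not taken from the project.
    """
    # Cubic box, solute centered (-c), 1.0 nm minimum distance to the box edge.
    editconf = ["gmx", "editconf", "-f", protein_gro, "-o", "boxed.gro",
                "-c", "-d", "1.0", "-bt", "cubic"]
    # Fill the box with water; spc216.gro is the equilibrated 3-site water box
    # shipped with GROMACS and used for SPC-type models.
    solvate = ["gmx", "solvate", "-cp", "boxed.gro", "-cs", "spc216.gro",
               "-o", "solvated.gro", "-p", topology]
    # genion needs a run input file, so a .tpr is assembled first.
    grompp = ["gmx", "grompp", "-f", "ions.mdp", "-c", "solvated.gro",
              "-p", topology, "-o", "ions.tpr"]
    # Replace solvent molecules with Na+ / Cl- until the net charge is zero.
    genion = ["gmx", "genion", "-s", "ions.tpr", "-o", "solvated_ions.gro",
              "-p", topology, "-pname", "NA", "-nname", "CL", "-neutral"]
    return [editconf, solvate, grompp, genion]


if __name__ == "__main__":
    for command in build_setup_commands():
        print(" ".join(command))
```

Each list can be passed to `subprocess.run` once the input files exist; `gmx genion` will additionally prompt for the group of molecules to replace, which is normally the solvent group.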

4. Energy minimization

Energy minimization optimizes the geometric structure of the molecule, relieves the high-energy regions in the system, and brings the system to a more stable state.

The maximum number of iterations was set to 10,000, the maximum step length of each step was 0.01 nm, and the convergence criterion for energy minimization was 1000.0 kJ/mol/nm, that is, when the maximum force in the system was less than 1000.0 kJ/mol/nm, energy minimization was stopped.

After the parameters are set, the steepest descent method and the conjugate gradient method are applied in succession. Steepest descent quickly eliminates unreasonable structures with large energy gradients; conjugate gradient then performs a more refined minimization, so that the system gradually reaches a more stable state.
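As an illustration of the settings above, the following sketch generates the text of a GROMACS `.mdp` parameter file for each of the two minimization stages. The option names (`integrator`, `nsteps`, `emstep`, `emtol`) are standard GROMACS energy minimization options; the helper function itself and the idea of reusing the same limits for both stages are assumptions, since the text does not list the full parameter files.

```python
def minimization_mdp(integrator):
    """Return .mdp text for one energy minimization stage.

    integrator: "steep" (steepest descent) or "cg" (conjugate gradient).
    """
    if integrator not in ("steep", "cg"):
        raise ValueError("integrator must be 'steep' or 'cg'")
    params = {
        "integrator": integrator,  # minimization algorithm
        "nsteps": 10000,           # maximum number of iterations
        "emstep": 0.01,            # initial/maximum step size in nm
        "emtol": 1000.0,           # stop when max force < 1000 kJ/mol/nm
    }
    return "\n".join(f"{key:<12}= {value}" for key, value in params.items()) + "\n"


if __name__ == "__main__":
    # Stage 1: steepest descent, stage 2: conjugate gradient.
    for stage in ("steep", "cg"):
        print(f"; {stage} stage")
        print(minimization_mdp(stage))
```

Running the conjugate gradient stage on the output of the steepest descent stage reproduces the two-step scheme described above.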

5. Pre-equilibration

NVT (constant volume and temperature) and NPT (constant pressure and temperature) equilibration are performed in succession so that the system is simulated under equilibrated temperature and pressure conditions. This improves the stability and accuracy of the simulation and brings the model closer to the real physical environment.

6. Molecular dynamics simulation

After completing all the steps above, the molecular dynamics simulation is run to determine the optimal binding site between ligand and receptor and the various parameters of the docking process.

Fig.1 Schematic diagram of the simulation for the wild-type cyp79f1

Renovation Design


Since the original docking sites were identified in the previous step, we can selectively modify the residues at the docking sites according to one principle: select amino acids with high affinity for the substrate and low affinity for the product, so that the enzyme binds the substrate and releases the product more easily.

1. Design for 79short

Select amino acids with high affinity for dihomomethionine and low affinity for aldoxime. Based on the biochemical properties of the amino acids, we chose isoleucine (I) and leucine (L) as the candidate replacement residues.
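The "change all docking positions to one residue" designs listed below amount to a point substitution at a set of positions. A minimal sketch of that operation follows; the function name, the 1-based position convention, and the example positions are all hypothetical and are not the actual docking sites of either enzyme.

```python
def substitute_residues(sequence, positions, new_residue):
    """Return a copy of `sequence` with `new_residue` placed at each position.

    positions are 1-based residue numbers (an assumed convention).
    """
    if len(new_residue) != 1:
        raise ValueError("new_residue must be a single one-letter code")
    residues = list(sequence)
    for position in positions:
        if not 1 <= position <= len(residues):
            raise ValueError(f"position {position} is outside the sequence")
        residues[position - 1] = new_residue
    return "".join(residues)


if __name__ == "__main__":
    fragment = "MGRILSRPTK"          # first ten residues of 79short
    example_positions = [3, 7]       # hypothetical positions, for illustration only
    print(substitute_residues(fragment, example_positions, "L"))
    print(substitute_residues(fragment, example_positions, "I"))
```

Because the substitution keeps the sequence length unchanged, a variant produced this way can be compared with the wild type position by position.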

  • Blue for AlphaFold3
  • Red for SwissModel
  • Purple bold for sites shared by both
  • Yellow for the original simulated docking sites

Alpha3 Model

79short_alpha3

MGRILSRPTK TKDRSCQLPP GPPGWPILGN LPELFMTRPR SKYFRLAMKE LKTDIACFNF AGIRAITINS DEIAREAFRE RDADLADRPQ LFIMETIGDN YKSMGISPYG EQFMKMKRVI TTEIMSVKTL KMLEAARTIE ADNLIAYVHS MYQRSETVDV RELSRVYGYA VTMRMLFGRR HVTKENVFSD DGRLGNAEKH HLEVIFNTLN CLPFSPADYV ERWLRGWNVD GQEKRVTENC NIVRSYNNPI IDERVQLWRE EGGKAAVEDW LDTFITLKDQ NGKYLVTPDE IKAQCVEFCI AAIDNPANNM EWTLGEMLKN PEILRKALKE LDEVVGRDRL VQESDIPNLN YLKACCRETF RIHPSAHYVP SHLARQDTTL GGYFIPKGSH IHVCRPGLGR NPKIWKDPLV YKPERHLQGD GITKEVTLVE TEMRFVSFST GRRGCIGVKV GTIMMVMLLA RFLQGFNWKL HQDFGPLSLE EDDASLLMAK PLHLSVEPRL APNLYPKFRP
Fig.2 Rotating view of cyp79f1_short from AlphaFold3 indicating the docking sites.

79short_alpha3_op1: Change all original docking positions to I

MGRILSRPTK TKDRSCQLPP GPPGWPILGN LPELFMTRPR SKYFRLAMKE LKTDIACFNF AGIRAITINS DEIAREAFRE RDADLADRPQ LFIMETIGDN YKSMGISPYG EQFMKMKRVI TTEIISVKTL KMLEAARTIE ADNLIAYVHS MYQRSETVDV RELSRVYGYA VIMRILFGRR HVTKENVFSD DGRLGNAEKH HLEVIFNTLN CLPSFSPADY VERWLRGWNV DGQEKRVTEN CNIVRSYNNP IIDERVQLWR EEGGKAAVED ILDTFITLKD QNGKYLVTPD EIKAICVIIC IIIIDIIANN MEWTLGEMLK NPEILRKALK ELDEVVGRDR LVQESDIPNL NYLKACCRET IRIHPSAHYV PSHLARQDTT LGGYFIPKGS HIHVCRPGLG RNPKIWKDPL VYKPERHLQG DGITKEVTLV ETEMRFVSIS TGRIIIIKII IIMIVMLLAR FLQGFNWKLH QDFGPLSLEE DDASLLMAKP LHLSVEPRLA PNLYPKFRP

79short_alpha3_op2: Change all original docking positions to L

MGRILSRPTK TKDRSCQLPP GPPGWPILGN LPELFMTRPR SKYFRLAMKE LKTDIACFNF AGIRAITINS DEIAREAFRE RDADLADRPQ LFIMETIGDN YKSMGISPYG EQFMKMKRVL TTEILSVKTL KMLEAARTIE ADNLIAYVHS MYQRSETVDV RELSRVYGYA VLMRLLFGRR HVTKENVFSD DGRLGNAEKH HLEVIFNTLN CLPSFSPADY VERWLRGWNV DGQEKRVTEN CNIVRSYNNP IIDERVQLWR EEGGKAAVED LLDTFITLKD QNGKYLVTPD EIKALCVLLC ILLIDLLANN MEWTLGEMLK NPEILRKALK ELDEVVGRDR LVQESDIPNL NYLKACCRET LRIHPSAHYV PSHLARQDTT LGGYFIPKGS HIHVCRPGLG RNPKIWKDPL VYKPERHLQG DGITKEVTLV ETEMRFVSLS TGRRGLILLK LLLIMLVMLL ARFLQGFNWK LHQDFGPLSL EEDDASLLMA KPLHLSVEPR LAPNLYPKFR P

SwissModel

79short_swiss

MGRILSRPTK TKDRSCQLPP GPPGWPILGN LPELFMTRPR SKYFRLAMKE LKTDIACFNF AGIRAITINS DEIAREAFRE RDADLADRPQ LFIMETIGDN YKSMGISPYG EQFMKMKRVI TTEIMSVKTL KMLEAARTIE EADNLIAYVH SMYQRSETVD VRELSRVYGY AVTMRMLFGR RHVTKENVFS DDGRLGNAEK HHLEVIFNTL NCLPSFSPAD YVERWLRGWN VDGQEKRVTE NCNIVRSYNN PIIDERVQLW REEGGKAAVE DWLDTFITLK DQNGKYLVTP DEIKAQCVEF CIAAIDNPAN NMEWTLGEML KNPEILRKAL KELDEVVGRD RLVQESDIPN LNYLKACCRE TFRIHPSAHY VPSHLARQDT TLGGYFIPKG SHIHVCRPGL GRNPKIWKDP LVYKPERHLQ GDGITKEVTL VETEMRFVSF STGRRGCIGV KVGTIMMVML LARFLQGFNW KLHQDFGPLS LEEDDASLLM AKPLHLSVEP RLAPNLYPKF RP

79short_swiss_op1: Change all original docking positions to I

MGRILSRPTK TKDRSCQLPP GPPGWPILGN LPELFMTRPR SKYFRLAMKE LKTDIACFNF AGIRAITINS DEIAREAFRE RDADLADIIQ LFIIETIGDN YKIMIISPYG EQFMKMKRVI TTEIMSVKTL KMLEAARTIE EADNLIAYVH SMYQRSETVD VRELSRVYGY AVTMRMLFGR RHVTKENVFS DDGRLGNAEK HHLEVIFNTI NCIISFSPAD YVERWLRGWN VDGQEKRVTE NCNIVRSYNN PIIDERVQLW REEGGKAAVE DWLDTFITLK DQNGKYLVTP DEIKAQCVIF CIIAIDNPAN NMEWTLGEML KNPEILRKAL KELDEVVGRD RLVQESDIPN LNYLKACCRE TFRIHPHYVI IILARQDTTL GGYFIPKGSH IHVCRPGLGR NPKIWKDPLV YKPERHLQGD GITKEVTLVE TEMRFVSFST GRRGCIGVKV GTIMMVMLLA RFLQGFNWKL HQDFGPLSLE EDDASLLMAK PLHLSVEPRL APNLYPKFRP

#The utilization rate of the manually modified sites is low

79short_swiss_op2: Change all original docking positions to L

MGRILSRPTK TKDRSCQLPP GPPGWPILGN LPELFMTRPR SKYFRLAMKE LKTDIACFNF AGIRAITINS DEIAREAFRE RDADLADLLQ LFILETIGDN YKLMLLSPYG EQFMKMKRVI TTEIMSVKTL KMLEAARTIE EADNLIAYVH SMYQRSETVD VRELSRVYGY AVTMRMLFGR RHVTKENVFS DDGRLGNAEK HHLEVIFNTL NCLLSFSPAD YVERWLRGWN VDGQEKRVTE NCNIVRSYNN PIIDERVQLW REEGGKAAVE DWLDTFITLK DQNGKYLVTP DEIKAQCVLF CLLAIDNPAN NMEWTLGEML KNPEILRKAL KELDEVVGRD RLVQESDIPN LNYLKACCRE TFRIHPSAHY VLLLLARQDT TLGGYFIPKG SHIHVCRPGL GRNPKIWKDP LVYKPERHLQ GDGITKEVTL VETEMRFVSF STGRRGCLGV KVGTIMMVML LARFLQGFNW KLHQDFGPLS LEEDDASLLM AKPLHLSVEP RLAPNLYPKF RP

#The new docking sites are mostly the manually modified sites or lie near them

2. Design for 83short

83short (only the Alpha3 model is available for 83short; the structure predicted by SwissModel is over-folded)

MKPKTKRYKL PPGPSPLPVI GNLLQLQKLN PQRFFAGWAK KYGPILSYRI GSRTMVVISS AELAKELLKT QDVNFADRPP HRGHEFISYG RRDMALNHYT PYYREIRKMG MNHLFSPTRV ATFKHVREEE ARRMMDKINK AADKSEVVDI SELMLTFTNS VVCRQAFGKK YNEDGEEMKR FIKILYGTQS VLGKIFFSDF FPYCGFLDDL SGLTAYMKEC FERQDTYIQE VVNETLDPKR VKPETESMID LLMGIYKEQP FASEFTVDNV KAVILDIVVA GTDTAAAAVV WGMTYLMKYP QVLKKAQAEV REYMKEKGST FVTEDDVKNL PYFRALVKET LRIEPVIPLL IPRACIQDTK IAGYDIPAGT TVNVNAWAVS RDEKEWGPNP DEFRPERFLE KEVDFKGTDY EFIPFGSGRR MCPGMRLGAA MLEVPYANLL LSFNFKLPNG MKPDDINMDV MTGLAMHKSQ HLKLVPEKVN KY

Conversely, select amino acids with low affinity for dihomomethionine and high affinity for aldoxime. Likewise, we chose glutamine (Q) and asparagine (N) as the candidate replacement residues.

Op1: Change all the original docking positions to Q

MKPKTKRYKL PPGPSPLPVI GNLLQLQKLN PQRFFAGWAK KYGPILSYRI GSRTMVVISS AELAKELLKT QDVNFADRPP HRGHEFISYG RRDMALNHYT PYYREIRKMG MNHLFSPTRV ATFKHVREEE ARRMMDKINK AADKSEVVDI SELMLTFTNS VVCRQAFGKK YNEDGEEMKR FIKILYGTQS VLGKIFFSDF FPYCGFLDDL SGLTAYMKEC FERQDTYIQE VVNETLDPKR VKPETESMID LLMGIYKEQP FASEFTVDNV KAVILDIVVQ GTQQAAAAVV WGMTYLMKYP QVLKKAQAEV REYMKEKGST FVTEDDVKNL PYFRALVKET QRIEPQQPLL QPRACIQDTK IAGYDIPAGT TVNVNAWAVS RDEKEWGPNP DEFRPERFLE KEVDQQGTDY EFIQQGSGRR QQPQQQLQQA MQEVPYANLL LSFNFKLPNG MKPDDINMDV MTGQQMHKSQ HLKLVPEKVN KY

#After re-simulation, most of the new docking sites in the alpha3 and swiss models are manually modified sites, and the overlap between the two models is high

Op2: Change all the original docking positions to N

MKPKTKRYKL PPGPSPLPVI GNLLQLQKLN PQRFFAGWAK KYGPILSYRI GSRTMVVISS AELAKELLKT QDVNFADRPP HRGHEFISYG RRDMALNHYT PYYREIRKMG MNHLFSPTRV ATFKHVREEE ARRMMDKINK AADKSEVVDI SELMLTFTNS VVCRQAFGKK YNEDGEEMKR FIKILYGTQS VLGKIFFSDF FPYCGFLDDL SGLTAYMKEC FERQDTYIQE VVNETLDPKR VKPETESMID LLMGIYKEQP FASEFTVDNV KAVILDIVVN GTNNAAAAVV WGMTYLMKYP QVLKKAQAEV REYMKEKGST FVTEDDVKNL PYFRALVKET NRIEPNNPLL NPRACIQDTK IAGYDIPAGT TVNVNAWAVS RDEKEWGPNP DEFRPERFLE KEVDNNGTDY EFINNGSGRR NNPNNNLNNA MNEVPYANLL LSFNFKLPNG MKPDDINMDV MTGNNMHKSQ HLKLVPEKVN KY

#Modification failed: both the alpha3 and swiss models were over-folded in the pre-equilibration phase

Op3: Change the original docking positions to Q or N: keep the purple Q from Op1 and change the remaining Q to N

MKPKTKRYKL PPGPSPLPVI GNLLQLQKLN PQRFFAGWAK KYGPILSYRI GSRTMVVISS AELAKELLKT QDVNFADRPP HRGHEFISYG RRDMALNHYT PYYREIRKMG MNHLFSPTRV ATFKHVREEE ARRMMDKINK AADKSEVVDI SELMLTFTNS VVCRQAFGKK YNEDGEEMKR FIKILYGTQS VLGKIFFSDF FPYCGFLDDL SGLTAYMKEC FERQDTYIQE VVNETLDPKR VKPETESMID LLMGIYKEQP FASEFTVDNV KAVILDIVVQ GTQQAAAAVV WGMTYLMKYP QVLKKAQAEV REYMKEKGST FVTEDDVKNL PYFRALVKET NRIEPNQPLL QPRACIQDTK IAGYDIPAGT TVNVNAWAVS RDEKEWGPNP DEFRPERFLE KEVDNNGTDY EFINNGSGRR NNPNNNLNNA MNEVPYANLL LSFNFKLPNG MKPDDINMDV MTGNQMHKSQ HLKLVPEKVN KY

#Swiss model over-folded #The new docking sites overlap strongly with the manually modified sites

3. Design for linker

  • Blue is ligand
  • Red is aldoxime
  • Purple is shared

Linker

MLSLRQSIRF FKPATRTLCS SRYLLQMGRI LSRPTKTKDR SCQLPPGPPG WPILGNLPEL FMTRPRSKYF RLAMKELKTD IACFNFAGIR AITINSDEIA REAFRERDAD LADRPQLFIM ETIGDNYKSM GISPYGEQFM KMKRVITTEI MSVKTLKMLE AARTIEADNL IAYVHSMYQR SETVDVRELS RVYGYAVTMR MLFGRRHVTK ENVFSDDGRL GNAEKHHLEV IFNTLNCLPS FSPADDYVER
WLRGWNVDGQ EKRVTENCNI VRSYNNPIID ER
VQLWREEGGK AAVEDWLDTF ITLKDQNGKY LVTPDEIKAQ CVEFCIAAID NPANNMEWTL GEMLKNPEIL RKALKELDEV VGRDRLVQES DIPNLNYLKA CCRETFRIHP SAHYVPSHLA RQDTTLGGYF IPKGSHIHVC RPGLGRNPKI WKDPLVYKPE RHLQGDGITK EVTLVETEMR FVSFSTGRRG CIGVKVGTIM MVMLLARFLQ GFNWKL
HQDFGPLSLE EDDASLLMAK PLHLSVEPRL APNLYPKFRP GGGGSMKPKT KRYKLPPGPS PLPVIGNLLQ LQKLNPQRFF AGWAKKYGPI LSYRIGSRTM VVISSAELAK ELLKTQDVNF ADRPPHRGHE FISYGRRDMA LNHYTPYYRE IRMGMNHLFS PTRVATFKHV REEEARRMMD KINKAADKSE VVDISELMLT FTNSVVCRQA FGKKYNEDGE EMKRFIKILY GTQSVLGKIF FSDFFPYCGF LDDLSGLTAY MKECFERQDT YIQEVVNETL DPKRVKPETE SMIDLLMGIY KEQPFASEFT VDNVKAVILD IVVAGTDTAA AAVVWGMTYL MKYPQVLKKA QAEVREYMKE KGSTFVTEDD VKNLPYFRAL VKETLRIEPV IPLLIPRACI QDTKIAGYDI PAGTTVNVNA WAVSRDEKEW GPNPDEFRPE RFLEKEVDFK GTDYEFIPFG SGRRMCPGMR LGAAMLEVPY ANLLLSFNFK LPNGMKPDDI NMDVMTGLAM HKSQHLKLVP EKVNKY

Linker_op1

MLSLRQSIRF FKPATRTLCS SRYLLQMGRI LSRPTKTKDR SCQLPPGPPG WPILGNLPEL FMTRPRSKYF RLAMKELKTD IACFNFAGIR AITINSDEIA REAFRERDAD LADLLQLFIL ETIGDNYKLM LLSPYGEQFM KMKRVITTEI MSVKTLKMLE AARTIEADNL IAYVHSMYQR SETVDVRELS RVYGYAVTMR MLFGRRHVTK ENVFSDDGRL GNAEKHHLEV IFNTLNCLLS
FSPADYVERW LRGWNVDGQE KRVTENCNIV RSYNNPIIDE RVQLWREEGG KAAVEDWLDT FITLKDQNGK YLVTPDEIKA QCVLFCLLAI DNPANNMEWT LGEMLKNPEI LRKALKELDE VVGRDRLVQE SDIPNLNYLK ACCRETFRIH PSAHYVLLLL ARQDTTLGGY FIPKGSHIHV CRPGLGRNPK IWKDPLVYKP ERHLQGDGIT KEVTLVETEM RFVSFSTGRR GCLGVKVGTI MVMLLARFLQ GFNWKLHQDF GPLSLEEDDA SLLMAKPLHL SVEPRLAPNL YPKFRPGGGG SMKPKTKRYK LPPGPSPLPV IGNLLQLQKL NPQRFFAGWA KKYGPILSYR IGSRTMVVIS SAELAKELLK TQDVNFADRP PHRGHEFISY GRRDMALNHY TPYYREIRKM GMNHLFSPTR VATFKHVREE EARRMMDKIN KAADKSEVVD ISELMLTFTN SVVCRQAFGK KYNEDGEEMK RFIKILYGTQ SVLGKIFFSD FFPYCGFLDD LSGLTAYMKE CFERQDTYIQ EVVNETLDPK RVKPETESMI DLLMGIYKEQ PFASEFTVDN VKAVILDIVV QGTQQAAAAV VWGMTYLMKY PQVLKKAQAE VREYMKE
KGSTFVTEDD VKNLPYFRAL VKETQRIEPQ QPLLQPRACI QDTKIAGYDI PAGTTVNVNA WAVSRDEKEW GPNPDEFRPE RFLEKEVDQQ GTDYEFIQQG SGRRQQPQQQ LQQAMQEVPY ANLLLLSFNF KLPNGMKPDD INMDVMTGQQ MHKSQHLKLV PEKVNKY

#The substrates of the two enzymes dock at the two parts of the linked construct respectively, and their spatial positions are close

Analyze the transformation results


  1. Use the Affinity function of AutoDock molecular docking to score affinity.
  2. Calculate the affinity improvement rate of the modified enzyme:
  3. $${Affinity\,Improvement\,Rate}=\cfrac{Aff._{imp}-Aff._{ori}}{Aff._{ori}}\times100\%$$
  4. Obtain RMSD data in molecular docking and evaluate stability
  5. Fig.3 RMSD diagrams of cyp79f1_short and cyp83a1_short

    The RMSD (Root Mean Square Deviation) graph is used to evaluate the stability of molecular structures in molecular dynamics simulations. It shows how far the molecule deviates from its initial structure over the course of the simulation.

    If the RMSD value remains low and stable, the structure has changed little relative to the initial structure. This usually means that the molecule has not undergone significant changes and that the system is stable and has reached equilibrium.

    If the RMSD value is high or fluctuates significantly, the molecule may have undergone large conformational changes, and the system may still be adjusting its structure or may be unstable.

    The volatility of each design is compared by calculating the mean and variance of the RMSD values. In the early stage of the simulation, the values of each group have not yet stabilized. After 200 ps, the simulation becomes stable.

    Stability improvement rate calculation:

    $${Stability\,improvement\,rate}=\cfrac{\sigma_{imp}-\sigma_{ori}}{\sigma_{ori}}\times100\%$$
  6. The following table summarizes the improvement rates of affinity and stability:
  7. Design                        Affinity improvement rate   Stability improvement rate
     79short-alpha3-op1            -2.17%                      36.41%
     79short-alpha3-op2            0%                          0.79%
     79short-swiss-op1             -10.20%                     59.74%
     79short-swiss-op2             0%                          31.22%
     83short-alpha3-op1            5.13%                       -44.50%
     83short-swiss-op1             7.69%                       29.46%
     83short-alpha3-op3            -48.72%                     -41.52%
     linker-op1-aldoxime           -21.62%                     35.78%
     linker-op1-dihomomethionine   -2.38%                      65.92%
  8. 79short-swiss-op2 and 83short-swiss-op1 were joined with a GGGGS linker and re-simulated. The substrates of the two enzymes dock at the two parts of the linked construct respectively, and their spatial positions are close. Visual comparison indicates that the optimized enzyme complex performs better than the original one.
  9. Fig.4 Comparison of the original and optimized constructs
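The two improvement rates defined in the steps above share the same form, so they can be computed with one helper. The sketch below also shows one way to obtain the RMSD mean and spread after discarding the first 200 ps. It assumes that σ is the population standard deviation of the RMSD values (the text mentions mean and variance but does not define σ explicitly), and all numbers in the example are hypothetical, not the project's measurements.

```python
from statistics import mean, pstdev


def improvement_rate(improved, original):
    """(improved - original) / original * 100, as in the two formulas above."""
    if original == 0:
        raise ValueError("original value must be non-zero")
    return (improved - original) / original * 100.0


def rmsd_stats(times_ps, rmsd_values, equilibration_ps=200.0):
    """Mean and population standard deviation of RMSD after equilibration.

    Frames earlier than `equilibration_ps` are discarded (200 ps in the text).
    """
    kept = [r for t, r in zip(times_ps, rmsd_values) if t >= equilibration_ps]
    if not kept:
        raise ValueError("no frames left after discarding the equilibration period")
    return mean(kept), pstdev(kept)


if __name__ == "__main__":
    # Hypothetical docking scores in kcal/mol (more negative = stronger binding).
    print(f"affinity rate: {improvement_rate(-4.1, -3.9):.2f}%")

    # Hypothetical RMSD trace: time in ps, RMSD in nm.
    times = [0, 100, 200, 300, 400]
    original_rmsd = [0.00, 0.10, 0.20, 0.22, 0.18]
    modified_rmsd = [0.00, 0.12, 0.21, 0.26, 0.16]
    _, sigma_original = rmsd_stats(times, original_rmsd)
    _, sigma_modified = rmsd_stats(times, modified_rmsd)
    print(f"stability rate: {improvement_rate(sigma_modified, sigma_original):.2f}%")
```

Note that with negative docking scores, a more negative score for the modified enzyme yields a positive affinity improvement rate under this formula.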

Conclusion


The two yellow-marked optimizations in the table above are the two with the best simulation results. The two enzymes were coupled with a linker. Before the modification, the two substrates overlapped with each other and were both docked on the same protein. After the modification, each substrate docked on its respective protein and the two were spatially close, which is conducive to the consecutive progress of the related reactions.
