A Beginner’s Guide to Perform Molecular Dynamics Simulation of a Membrane Protein using GROMACS — GROMACS tutorials https://tutorials.gromacs.org documentation

Building the protein-membrane system in CHARMM-GUI

We are now ready to embed the protein structure in the membrane in the
proper location and orientation and construct the membrane composition
we desire. To do this, we utilized the CHARMM-GUI input Generator, a
handy web-based tool to generate GROMACS inputs for the protein-membrane
system. In order to access CHARMM-GUI (www.charmm-gui.org), a web
browser such as Chrome, Firefox, etc. is required.

In this Tutorial, rather than providing an in-depth discussion of how to
create input via CHARMM-GUI, we will focus on how to perform a correct
simulation using GROMACS. However, you can find useful information on
Membrane Builder
Intro

and in this Membrane Proteins
Tutorial

.

CHARMM-GUI generates all required input files as a zip file. You can
unzip the file using the command below, or directly use the previously
unzipped folder named “charmm-gui-1MAL”. This folder contains a diverse
set of files.

## command to unzip the zip file. The command is executable when you remove the hashtags
#import shutil
#shutil.unpack_archive('input/charmm-gui-1MAL.zip')
#print("Archive file unpacked successfully.")
#ls input

What we require, though, is in a “gromacs” folder. Let’s take a closer
look:

%ls input/charmm-gui-1MAL/gromacs/

In a prior tutorial titled “Introduction to Molecular
Dynamics”
,
you may get an explanation for each file including structure and
topology files. Some of these mdp files may need to be changed to fit
the system as further discussed in the next section.

Now lets create a new folder, called “run”. Here we will perform the
minimization and equilibration steps. Ensure that you are always in the
correct working directory, you can use the pwd command, which stands
for “print working directory”. The command can be used to determine what
directory you are currently working in.

# Creat working directory named run
%mkdir run
# Change to the directory "run"
%cd run
# Make sure you are in the right folder. This will show you the path :)
!pwd

Modified MDP files

Before starting the simulations we need the topology, parameter, and
structure files. We will copy them from the CHARMM-GUI-1MAL folder
to the run directory. Please note that mdp files are customized to
fit the requirements of the system and they are located in the
input/mdp folder.

##copy required files to a folder
!scp    ../input/charmm-gui-1MAL/gromacs/step5_input.gro          .
!scp    ../input/charmm-gui-1MAL/gromacs/step5_input.pdb          .
!scp    ../input/charmm-gui-1MAL/gromacs/topol.top                .
!scp    ../input/charmm-gui-1MAL/gromacs/index.ndx                .
!scp -r ../input/charmm-gui-1MAL/gromacs/toppar                   .
#updated mdp files
!scp    ../input/mdp/*.mdp                                        .

Because GROMACS always assumes a default value for each parameter,
simulations may be conducted with an entirely empty mdp file. Hence, you
must double-check each option to ensure that the default settings are
acceptable. If not, you can modify the mdp file. GROMACS is
user-friendly and prepared an intriguing list of possibilities that is
clearly explained in mdp options
webpage
.
To make your life easier, in all mdp files of this tutorial, a small
explanation is included next to each parameter.

Now we have all required files to start minimization and equilibration
parts:

Lets start! Although it will take time, you can run the commands below
one by one to generate files in every step. Fortunately, we have
prepared output of runs ready to use if you prefer the quick route (see
input/reference folder).

Energy minimisation

Energy minimisation (EM) is used to relax the protein-membrane system by
removing steric conflicts and improper geometry, e.g. to reduce
excessive forces caused by too-close particle interaction in the system.
We will first generate a binary (.tpr) file which assemble the
structure(.gro), topology(.top), and simulation parameters(.mdp). Then
we run energy minimization using gmx mdrun.

But first we go through the parameters in the minimization.mdp file:

#revised minimization.mdp
!cat ../input/mdp/step6.0_minimization.mdp
## EM-step1  > to generate binary file
!gmx grompp -f step6.0_minimization.mdp -o minimization.tpr -c step5_input.gro  -r step5_input.gro -p topol.top

Note: From now on, if you want to run the commands on your system
consider cells labelled with an odd number ##{number]## and remove
one hashtags from each line. But, if you want to use pre-prepared
outputs remove hashtags of cells with even numbers.

##1## EM-step2 > to invoke mdrun
#!gmx mdrun -v -deffnm minimization

##The command is executable when you remove one of the hashtags

Five files (tpr,gro,trr,edr,log) are generated after you run the
minimization command. You can always find the previously prepared files
for each step in the reference folder:

##2## EM-step2 > To transfer minimization output from the reference directory to working directory

#!scp ../reference/minimization* .

##remove one hashtag to transfer pre-prepared minimization files to working directory
#folder of pre-prepared outputs
%ls minimization*

Lets check if the minimization criteria are reached by looking at the
last lines of the log file:

!tail -19 ../reference/minimization.log

There are two very important factors to evaluate to determine if energy
minimization was successful. The first is the potential energy (printed
at the end of the EM process, even without -v). The second important
feature is the maximum force, Fmax, the target for which was set in
minim.mdp – “emtol = 1000.0” – indicating a target Fmax of no greater
than 1000 kJ/(mol nm). It is possible to arrive at a reasonable Epot
with Fmax > emtol. If this happens, your system may not be stable enough
for simulation. Evaluate why it may be happening, and perhaps change
your minimization parameters (integrator, emstep, etc).

## Minimized structure
import nglview as ng
view = ng.show_structure_file("../reference/minimization.gro")
## Lipids
view.add_representation("ball+stick",selection="POPC")
## water
view.add_representation("licorice",selection="TIP3P", opacity=0.9)
view.center()
view
#Click and drag to rotate, zoom with your mouseweel
#For more information on this viewer have a look at https://github.com/nglviewer/nglview

In alternative, remove the comment character (#) to use VMD:

#!vmd ../reference/minimization.gro

Let’s do a bit of analysis. The minimization.edr file contains all of
the energy terms that GROMACS collects during EM. To extract energy
information you can use the tool gmx energy

!printf "Potential\n0\n" | gmx energy -f minimization.edr -o potential.xvg -xvg none
import pandas as pd
df = pd.read_csv('potential.xvg', sep='\s+', header=None, names=['step','energy(kJ Mol-1)'])
df.plot('step')

Note: to analyse or visualize simulation data in python or jupyter
notebooks, we can output a simplified xvg format from gmx-analysis tools
with the option -xvg none

Equilibration run – NVT

position restraints

For free proteins in solution, e.g. cytosolic proteins, position
restraints are not as crucial as membrane proteins, where different
parts of a system must be restrained in different levels and released
gradually in several steps of minimization and equilibration. Otherwise
you will end up in serious problems such as undesired change in system
size and conformation.

To set position restraints we need to use the define option in the
simulation parameter file, .mdp, which switch on by -DPOSRES flag. Have
a look at the .mdp files for the following runs:

!head -4 ../input/mdp/step6.1_equilibration_NVT_step1.mdp
!head -4 ../input/mdp/step6.2_equilibration_NVT_step2.mdp
!head -4 ../input/mdp/step6.3_equilibration_NPT_step1.mdp
!head -4 ../input/mdp/step6.4_equilibration_NPT_step2.mdp
!head -4 ../input/mdp/step6.5_equilibration_NPT_step3.mdp
!head -4 ../input/mdp/step6.6_equilibration_NPT_step4.mdp
!head -4 ../input/mdp/step7_production_revised.mdp

Let’s use chain A (segment PROA) as an example to demonstrate how it
works. The position restrain is defined in the last part of the itp
file. DPOSRES_FC_BB and POSRES_FC_SC specify the type of
restrains that apply to the different part of the protein. BB stands for
the potein BackBone and SC for the protein Side Chains! Please
execute the command below to see the position_restraints set for chain
A:

!sed -n '59914,59932p'  ../input/charmm-gui-1MAL/gromacs/toppar/PROA.itp

Hint: The number of atoms may be misleading. Please follow the PDB file
to ensure which atoms are selected. Here you can find the atom numbers
for the first two residues:

!head -n 30 ../input/charmm-gui-1MAL/gromacs/step5_input.pdb

As you can see, the 1st and 5th atoms are N and CA protein backbone
atoms of the first residue, respectively, and the value is defined by
POSRES_FC_BB in PROA.itp file. The 7th and 9th are atom indexes of
CB and CG which belong to the protein side chain and specify with
POSRES_FC_SC. Take some moment to find the rest of the atoms and
understand the concept by comparing itp and pdb files.

The P atoms of the lipid headgroups are constrained by
DPOSRES_FC_LIPID. As the value is only determined for the Z
direction, POPC lipid molecules can only move in the X-Y plane of the
bilayer, maintaining the bilayer’s thickness. i is the atom index of
the P atom of the first POPC.

!sed -n '1271,1276p'  ../input/charmm-gui-1MAL/gromacs/toppar/POPC.itp

In POPC lipid molecules, two dihedral angles are typically restrained to
preserve the cis-isomer. Here, the angles formed by C1-C3-C2-O21 and
C28-C29-C210-C211 atoms are described by the atom indexes of the first
POPC 25-36-28-30 and 60-63-65-67, respectively, and the value is defined
by DIHRES_FC.

!sed -n '1278,1285p'  ../input/charmm-gui-1MAL/gromacs/toppar/POPC.itp

Equilibration run – NVT

To perform NVT equilibration we need two steps. In each step we will
remove the restrains gradually to avoid undesired change in system size
and conformation.

As discussed before, by removing the first hashtag (#) from the
beginning of each line in odd-numbered cells, you may “uncomment” and
execute them on your system, but be warned that it will take a while. As
an alternative, you can transfer the pre-prepared outputs by executing
even-numbered cells to be able to countinue.

##1## equilibration-NVT6.1-step1 > To run
#!gmx grompp -f step6.1_equilibration_NVT_step1.mdp -o step6.1_equilibration_NVT_step1.tpr -c minimization.gro -r step5_input.gro -p topol.top -n index.ndx
#!gmx mdrun -v -deffnm step6.1_equilibration_NVT_step1

As an alternative, you can transfer the pre-prepared outputs by
executing even-numbered cells to be able to countinue.

##2## equilibration-NVT6.1-step1 > To transfer
#!scp    ../reference/step6.1_equilibration_NVT_step1*  .

Now we run step 2 using results from step 1 as input

##3## equilibration-NVT6.2-step2 > To run
#!gmx grompp -f step6.2_equilibration_NVT_step2.mdp -o step6.2_equilibration_NVT_step2.tpr -c step6.1_equilibration_NVT_step1.gro -r step5_input.gro -p topol.top -n index.ndx
#!gmx mdrun -v -deffnm step6.2_equilibration_NVT_step2

As an alternative, you can transfer the pre-prepared outputs by
executing even-numbered cells to be able to countinue.

##4## equilibration-NVT6.2-step2 > To transfer
#!scp    ../reference/step6.2_equilibration_NVT_step2*  .
#Equilibration - NVT outputs
%ls ../reference/*NVT_step*

It is crucial to evaluate every step to ensure that the simulation is
going right. For this purpose, we should check the tempurature using
gmx energy tool. To plot the data, we used python data analysis
library (pandas), but there are several alternatives out there such as
Xmgrace tool.

Evaluate temperature

## check the temperature > To run
!echo 17|gmx energy -f step6.2_equilibration_NVT_step2.edr -o  NVT_S2-temp.xvg -xvg none
import pandas as pd
df = pd.read_csv('NVT_S2-temp.xvg', sep='\s+', header=None, names=['time(ps)','temperature(K)'])
df.plot('time(ps)')
##Alternative##Plotting data by Xmgrace tool
#!xmgrace ../run/NVT_S2-temp.xvg

Evaluate the final structure

It is also strongly recommended to visualise the final structure of each
step:

import nglview as ng
view = ng.show_structure_file("../reference/step6.1_equilibration_NVT_step1.gro")
## Lipids
view.add_representation("ball+stick",selection="POPC")
## water
view.add_representation("licorice",selection="TIP3P", opacity=0.9)
view.center()
view
#click and drag to rotate, zoom with your mouseweel
#for more information on this viewer have a look at https://github.com/nglviewer/nglview
import nglview as ng
view = ng.show_structure_file("../reference/step6.2_equilibration_NVT_step2.gro")
## Lipids
view.add_representation("ball+stick",selection="POPC")
## water
view.add_representation("licorice",selection="TIP3P", opacity=0.9)
view.center()
view

Inside the mdp files

As you may realize, the mdp files for EM and NVT are not the same. The
integrator has been updated to md:

!sed -n '8,8p'  ../input/mdp/step6.1_equilibration_NVT_step1.mdp
!sed -n '7,7p'  ../input/mdp/step6.0_minimization.mdp

And a temperature coupling section has been added:

!sed -n '42,55p' ../input/mdp/step6.1_equilibration_NVT_step1.mdp
## For more details and compare at the entire files remove one hashtag
#!cat ../input/mdp/step6.0_minimization.mdp
#!cat ../input/mdp/step6.1_equilibration_NVT_step1.mdp

We used the v-rescale thermostat, which is the modified version of
Berendsen having a stochastic term that assures the generation of a
correct canonical ensemble. Sometimes you need to define groups, for
instance to couple them separately. Here, SOLU_MEMB stands for protein
and membrane, whereas SOLV stands for water and ions. If you’re
wondering about how we suggest these names, go through the index files.
You can utilize pre-defined groups in the index file or even alter the
index file to establish more groups if necessary. Typically, CHARMM-GUI
provides five categories in the index file, as shown below:

Group

Index

Molecules

[ SOLU ]

1-19260

protein

[ MEMB ]

19261-39092

POPC

[ SOLV ]

39093-99055

ions and water

[ SOLU_MEMB ]

1-39092

protein and POPC

[ SYSTEM ]

1-99055

whole

The interpretation of this information is as follows:

!printf "q\n" | gmx make_ndx -f step5_input.gro -n index.ndx -o

You may also notice a variation in values if you look at the first lines
of NVT mdp files. The sole difference between the prior stage
(step6.1_equilibration_NVT_step1) and the final stage of NVT
(step6.2_equilibration_NVT_step2) is that position restrains are
gradually decreased and relaxed:

!head -6 ../input/mdp/step6.1_equilibration_NVT_step1.mdp
!head -4 ../input/mdp/step6.2_equilibration_NVT_step2.mdp

Equilibration run – NPT

Pressure equilibration takes place in an NPT ensemble. To keep the
pressure stable, we turn the barostat on. We chose the semisotropic
pressure coupling, which is isotropic in the x and y directions but not
in the z, and is suitable for membrane simulations. Because Berendsen
Barostat is outdated and Parrinello-Rahman is better to utilize ‘only’
in production runs when the structure is ‘properly equilibrated’, the
c-rescale algorithm is appropriate for this part:

Pressure coupling is on

pcoupl = c-rescale

pcoupltype = semiisotropic

tau_p = 5.0

compressibility = 4.5e-5 4.5e-5

ref_p = 1.0 1.0

refcoord_scaling = com

In order to minimize unintended changes in system size and conformation,
NPT equilibration is carried out in four phases, with each step
including a progressive removal of restraint.

##9## equilibration-NPT6.3-step1 > To run
#!gmx grompp -f step6.3_equilibration_NPT_step1.mdp -o step6.3_equilibration_NPT_step1.tpr -c step6.2_equilibration_NVT_step2.gro -r step5_input.gro -p topol.top -n index.ndx
#!gmx mdrun -v -deffnm step6.3_equilibration_NPT_step1

In alternative:

##10## equilibration-NPT6.3-step1 > To transfer
#!scp    ../reference/step6.3_equilibration_NPT_step1*  .

In the next phases, we gradually reduce the constraints:

##11## equilibration-NPT6.4-step2 > To run
#!gmx grompp -f step6.4_equilibration_NPT_step2.mdp -o step6.4_equilibration_NPT_step2.tpr -c step6.3_equilibration_NPT_step1.gro -r step5_input.gro -p topol.top -n index.ndx
#!gmx mdrun -v -deffnm step6.4_equilibration_NPT_step2

In alternative:

##12## equilibration-NPT6.4-step2 > To transfer
#!scp    ../reference/step6.4_equilibration_NPT_step2*  .
##13## equilibration-NPT6.5-step3 > To run
#!gmx grompp -f step6.5_equilibration_NPT_step3.mdp -o step6.5_equilibration_NPT_step3.tpr -c step6.4_equilibration_NPT_step2.gro -r step5_input.gro -p topol.top -n index.ndx
#!gmx mdrun -v -deffnm step6.5_equilibration_NPT_step3

In alternative:

##14## equilibration-NPT6.5-step3 > To transfer
#!scp    ../reference/step6.5_equilibration_NPT_step3*  .
##15## equilibration-NPT6.6-step4 > To run
#!gmx grompp -f step6.6_equilibration_NPT_step4.mdp -o step6.6_equilibration_NPT_step4.tpr -c step6.5_equilibration_NPT_step3.gro -r step5_input.gro -p topol.top -n index.ndx
#!gmx mdrun -v -deffnm step6.6_equilibration_NPT_step4

In alternative:

##16## equilibration-NPT6.6-step4 > To transfer
#!scp    ../reference/step6.6_equilibration_NPT_step4*  .

Here you can find the output of NPT steps:

##Equilibration - NPT outputs
%ls ../reference/*NPT_step*

Evaluate the Pressure

Attention! I hope you have not forget to evaluate the temperature in
each step so far. In NPT equilibration steps, we need to check at extra
parameter, pressure:

## check the pressure > To run
!echo 18|gmx energy -f step6.6_equilibration_NPT_step4.edr -o  NPT_S4_Press.xvg -xvg none
import pandas as pd
import statistics
df = pd.read_csv('NPT_S4_Press.xvg', sep='\s+', header=None, names=['time(ps)','pressure(bar)'])
df.plot('time(ps)')
##Alternative##Plotting data by Xmgrace tool
#!xmgrace NPT_S4_Press.xvg

As you can see, pressure fluctuate more than temprature which is totally
ok 🙂

The system’s temperature should rises to the desired level (~303 K) and
stay steady for the rest of the equilibration. While, pressure is the
parameter which fluctuates alot during the MD simulation, and its
average should be statistically comparable with the pre-defined pressure
(1 bar).

Production

Invoke mdrun for production (one single simulation)

The system is now properly equilibrated and reached the appropriate
temperature and pressure. The position restrictions may now be lifted,
to start the production MD and begin data gathering.For production run,
we will first generate the binary tpr filr using the following mdp
parameters:

!cat step7_production_revised.mdp
##19.1## step7-production > To run
##generate tpr file
#!gmx grompp -f step7_production_revised.mdp -o step7_production.tpr -c step6.6_equilibration_NPT_step4.gro -t step6.6_equilibration_NPT_step4.cpt -p topol.top -n index.ndx

The completion of a production run (cell 19.2) might take hours or days.
Dont execute this command here and try it on the terminal of the system
you choose to perform the simulation.

!Warning: This run will generate roughly 9028 Mb of data

##19.2## step7-production(one single simulation) > To run
#!gmx mdrun -s step7_production -cpi

A 100ns simulation was performed. Every 30000 of the 50000000 steps were
recorded as the trajectory frames. The trr file is compressed into an
xtc file. For convenience, we saved every 100 frames of the final
trajectory and put the outputs in the reference folder.

##20## step7-production > To transfer
#!scp    ../reference/step7_production*  .
##step7-production outputs
%ls *step7_production*

Visualize the final structure

import nglview as ng
view = ng.show_structure_file("../reference/step7_production.gro")
## Lipids
view.add_representation("spacefill",selection="POPC", opacity=0.5)
## to see water box uncommand below:
#view.add_representation("spacefill",selection="TIP3P")
view.center()
view

HINT How to generate replicas? You may need more than one
replica. You can generate several tpr file using random seed option for
the velocity and temperature in the ‘mdp’ file.

How to run several replicas? You dont need to perform simulations
separately, because GROMACS has a simple solution for that:

step 1: make folders (here we set four replicas) #mkdir Rep1 Rep2
Rep3 Rep4

step 2: copy each tpr file in a different folder and execute mdrun
by command below: #mpirun -np 4 gmx_mpi mdrun -v -s
step7_production.tpr -multidir Rep1 Rep2 Rep3 Rep4 -cpi


Quiz Why we need to run several replicas? Do you know how they
differ from one another? Exactly the same?

Analysis

RMSD analysis

!echo 4 4 |gmx rms -s step7_production.tpr -f step7_production_traj_comp_skip100.xtc -o production_rmsd.xvg -bin production.rmsd.dat -tu ns -fit rot+trans -xvg none
import pandas as pd
df = pd.read_csv('production_rmsd.xvg', sep='\s+', header=None, names=['time(ns)','RMSD(nm)'])
df.plot('time(ns)')

The root mean square deviation (RMSD) is a practical parameter to
compare the backbones of a protein from its initial to final state,
which illustrates the dynamics of structure during the simulation. Here,
the protein structure is pretty stable and fluctuates within expected
margins(1-2 Å).

##Alternative##Plotting data by Xmgrace tool
#!xmgrace production_rmsd.xvg

Energy

##check the energy
!echo 13|gmx energy -f step7_production.edr -o production_Etot.xvg -xvg none
import pandas as pd
df = pd.read_csv('production_Etot.xvg', sep='\s+', header=None, names=['time(ps)','Energy(kJ mol-1)'])
df.plot('time(ps)')
##Alternative##Plotting data by Xmgrace tool
#!xmgrace production_Etot.xvg

Tempurature

#check the tempurature
!echo 15|gmx energy -f step7_production.edr -o  production_temp.xvg -xvg none
import pandas as pd
df = pd.read_csv('production_temp.xvg', sep='\s+', header=None, names=['time(ps)','Temperature*(K)'])
df.plot('time(ps)')
##Alternative##Plotting data by Xmgrace tool
#!xmgrace production_temp.xvg

Pressure

#check the pressure
!echo 16|gmx energy -f step7_production.edr -o production_press.xvg -xvg none
import pandas as pd
df = pd.read_csv('production_press.xvg', sep='\s+', header=None, names=['time(ps)','Pressure(bar)'])
df.plot('time(ps)')
##Alternative##Plotting data by Xmgrace tool
#!xmgrace production_press.xvg

Trajectory Visualization

import nglview as ng
import pytraj as pt
Trajectory = pt.load('step7_production_traj_comp_skip100.xtc', top='step5_input.pdb')
view = ng.show_pytraj(Trajectory)
view.clear()
view.add_cartoon('protein',color_scheme='residueindex')
view

Now, place the mouse cursor over the protein press the play button. You
will see the protein wiggling and jiggling!

view.clear()
view.add_cartoon('protein',color_scheme='residueindex')
view.add_spacefill('not TIP and not protein', opacity=0.6)
view.center()
view.camera = 'orthographic'
view

In this Part, lipids (gray and red spheres), and ions (green and purpule
spheres) are illustrated. You can find the name of the atoms by simply
placing the cursor on it. Try to find chloride ions.

Easy ha? Now try the tutorial on your protein and let us know if it is
useful. if you have any question write in the forum and we will answer
as soon as possible: forums.gromacs.org

References 1.

Read more here: Source link