molecular dynamics – How to manage disk space for Gromacs XTC (trajectory) file output

  1. Is there an option in gmx mdrun to specify where to write the .xtc file? For example, I want to specify a directory path with large disk space.

When you use the -deffnm md argument to gmx mdrun, it is assumed that all of the files will have the base name of md, so gromacs looks for md.tpr as input, and it writes md.xtc, md.edr etc. as output files in the current directory. This is not made clear in the documentation unfortunately.

To write the output files to other directories, you need to name each file explicitly, including its path. For example:

gmx mdrun -s md_01.tpr [-x /path/md_01.xtc] [-o md_01.trr] -c mdout.gro -g md_01.log -e md_01.edr

The -s argument supplies the binary run input file (.tpr). The -x argument names the compressed trajectory (.xtc) and -o the full-precision trajectory (.trr); the square brackets above mark these as optional, so use whichever you selected in the .mdp file. The -c, -g and -e arguments name the output coordinates, the log file and the energy file respectively. You can supply a file path for each of these arguments.

If you do not provide any of these arguments, the default names are used and the files are written to the current working directory. The default names can be found here.
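If you launch runs from a script, the same command can be assembled programmatically. Below is a minimal Python sketch, not an official GROMACS interface: the helper name `build_mdrun_command` and the directory `/path/to/big_disk` are hypothetical, and only the flags already mentioned above (-s, -x, -c, -g, -e) are used.

```python
from pathlib import PurePosixPath


def build_mdrun_command(tpr, out_dir, base="md_01"):
    """Return the gmx mdrun argument list with every output under out_dir.

    The helper name and the `base` default are illustrative only.
    """
    out = PurePosixPath(out_dir)
    return [
        "gmx", "mdrun",
        "-s", str(tpr),                    # run input file (.tpr)
        "-x", str(out / f"{base}.xtc"),    # compressed trajectory
        "-c", str(out / f"{base}.gro"),    # final coordinates
        "-g", str(out / f"{base}.log"),    # log file
        "-e", str(out / f"{base}.edr"),    # energy file
    ]


if __name__ == "__main__":
    # Hypothetical target directory on a disk with plenty of space.
    print(" ".join(build_mdrun_command("md_01.tpr", "/path/to/big_disk")))
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. It is safest to make sure the target directory exists before starting the run.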

  2. Is there a way to shrink the size of the .xtc file?

This is more difficult. There are three ways to reduce the size of the file, and all of them require changing the MD settings in the .mdp file.

One way is to change the frequency of writing the coordinates to the file i.e. increase the number of steps between writing to .xtc file by changing the value of the parameter nstxout-compressed in the .mdp file.

Another way is to decrease the precision of the coordinates written, using the parameter compressed-x-precision in the .mdp file. The default value is 1000, which means coordinates are stored to three decimal places (in nm), so you can decrease it to 100, i.e. two decimal places.

There is also the setting compressed-x-grps, which allows you to write only the coordinates of a specific group to the trajectory file. This is useful if you have, for instance, a protein in solvent and you only need information about the protein. Here you can find an example where only the protein coordinates are being written to the .xtc file.

A detailed documentation of all of the .mdp options can be found here.
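To see the three settings together, here is a small Python sketch that prints the lines you would add to the .mdp file. The function name `mdp_output_lines` is my own, the values 2000 and 100 are only examples, and the group name `Protein` is an assumption: it has to match a group that actually exists for your system.

```python
def mdp_output_lines(nstxout_compressed, precision=1000, groups=None):
    """Render the .mdp lines that control .xtc output.

    Illustrative helper: the parameter names in the output are the
    real .mdp options, the values are whatever the caller passes in.
    """
    lines = [
        # Number of steps between frames written to the .xtc file.
        f"nstxout-compressed = {nstxout_compressed}",
        # 1000 keeps three decimal places, 100 keeps two.
        f"compressed-x-precision = {precision}",
    ]
    if groups:
        # Write only this group instead of the whole system.
        lines.append(f"compressed-x-grps = {groups}")
    return "\n".join(lines)


if __name__ == "__main__":
    # Example values only; "Protein" is an assumed group name.
    print(mdp_output_lines(2000, precision=100, groups="Protein"))
```

Leaving `groups` unset omits the compressed-x-grps line, in which case the whole system is written.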

As an aside, a 143 GB output file for a 1 ns simulation seems unusually large. Your system likely has a large number of atoms, and you are probably storing coordinates every step. Usually, this is not necessary. Consider the kind of analysis you need, and then adjust the saving frequency accordingly. For example, if you save coordinates every 2 ps ($\equiv 2000$ steps if each timestep is 1 fs), you already have 500 frames in a 1 ns trajectory, which is sufficient for most common properties that can be analysed from a trajectory.
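The arithmetic behind those numbers can be checked with a few lines of Python. This is only a sketch: the function name `output_plan` is hypothetical, and the frame count ignores the initial frame at step 0.

```python
def output_plan(total_ns, dt_fs, interval_ps):
    """Return (steps between frames, number of frames) for a run.

    total_ns    : simulation length in nanoseconds
    dt_fs       : timestep in femtoseconds
    interval_ps : desired time between saved frames in picoseconds
    """
    steps_per_frame = round(interval_ps * 1000 / dt_fs)   # 1 ps = 1000 fs
    if steps_per_frame < 1:
        raise ValueError("interval_ps is shorter than one timestep")
    total_steps = round(total_ns * 1_000_000 / dt_fs)     # 1 ns = 1e6 fs
    return steps_per_frame, total_steps // steps_per_frame


if __name__ == "__main__":
    print(output_plan(1, 1, 2))      # (2000, 500): one frame every 2 ps
    print(output_plan(1, 1, 0.001))  # (1, 1000000): one frame every step
```

The first value is what you would set nstxout-compressed to. Writing every step produces 2000 times as many frames as writing every 2 ps, so the file size shrinks by roughly the same factor.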
