Unexpected behaviour during hybrid Monte-Carlo simulations (ReaxFF) – LAMMPS Beginners

Dear LAMMPS users,

I am trying to run Monte Carlo simulations with ReaxFF to study the oxygen coverage on a Pt nanoparticle. I defined my oxygen reservoir using the chemical-potential definition for a perfect (ideal) gas. Initially the system is very small (350 atoms) and placed in a 40x40x40 Å box.
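
(For reference, the expression I started from is the textbook ideal-gas form below; the reference state \mu^{0}, the pressure, and the resulting value of -73.15 kcal/mol that appears in my fix gcmc line are my own choices and are not derived here. fix gcmc with units real expects \mu in kcal/mol.)

\mu(T, P) \;=\; \mu^{0}(T) \;+\; k_{\mathrm B} T \,\ln\!\left(\frac{P}{P^{0}}\right)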

I want to explore the phase space of the Pt nanoparticle at a given temperature so I also use the “fix nvt” command.

At first, everything seems to go well: the temperature of the system equilibrates nicely. But after a while the dump file (.xyz) becomes unreadable because of corrupted output: lines with strange characters, ghost atoms (sometimes duplicates), and once even an extra platinum atom. The average temperature also seems to drift down slightly (by 3 to 5 K).

Sometimes the simulation crashes with a “segmentation fault”; sometimes it just keeps going.

Neither I nor my group or my supervisor are experts in MD simulations. I did my best to set up something coherent, but given the complexity of the ReaxFF force field the problem most likely comes from me.

Any comment will be appreciated.

Input (I run this using 4 MPI processes):

boundary        f f f  # Periodicity doesn't really matter

units           real

atom_style      charge
read_data       mc.lmp_data

pair_style      reaxff NULL
pair_coeff      * * ./ffield.reax.pt_o_h Pt O

fix             q_equi all qeq/reax 1 0.0 10.0 1e-6 reax/c #Charge equilibration for reaxff

thermo_style    custom step temp press atoms density etotal pe ke
thermo          5

timestep        0.25

neighbor        2.0 bin #Default for units real
neigh_modify    every 5 delay 0 check yes #We build every time MC is called

minimize        1.0e-8 1.0e-9 100000 1000000 #Relax the nanoparticle

region          mc_in sphere 20 20 20 16 side in 
region          mc_out sphere 20 20 20 6 side out

region          mc intersect 2 mc_in mc_out #We limit the MC volume to the nanoparticle surface

restart         25000 gcmc.*.restart

group           gas type 2 #Oxygen
group           nano type 1 #Pt

velocity        nano create 350.0 335679138 dist gaussian #The pt nanoparticle will be sampled at 350K

compute_modify  thermo_temp dynamic yes #Necessary if the number of atom changes

compute         nano_temp nano temp #set NVT and uVT to the same temp
compute_modify  nano_temp dynamic/dof yes

fix             nanofix nano nvt temp 350.0 350.0 $(100.0*dt) # NVT for the nanoparticle
fix_modify      nanofix temp nano_temp

dump            gcmc_equi all xyz 50 lmp_gcmc.xyz 
dump_modify     gcmc_equi element Pt O

run             40000 #Small NVT equilibration for the nanoparticle

fix             gcmc_equi gas gcmc 5 25 25 2 147028938 350.0 -73.15115097 0.05 overlap_cutoff 0.5 full_energy region mc #MC is frequent otherwise the convergence might be slow

run             500000


Please always report which LAMMPS version you are using and what platform you are running on.

These are signs of memory corruption. There are two main causes for that:

  1. one or more bugs in the code
  2. faulty hardware (RAM or CPU) or insufficient cooling

To minimize the risk of a bug in the code, you should use the latest available LAMMPS version, which is currently the 23 June 2022 release.

To check for failing hardware, you can try an independent testing tool like memtest86.

If the issue persists and cannot be attributed to weak hardware or insufficient cooling, then you should try to set up as small a system as possible that can reproduce it with as few timesteps as possible.

xyz format dump files are usually a bad choice because they contain only very limited information (just element symbols and coordinates; no atom IDs, charges, or box dimensions).
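
A custom dump along these lines would keep atom IDs, types, charges, and coordinates, and makes it much easier to spot lost or duplicated atoms (a sketch; the dump ID, interval, and file name are placeholders to adapt):

dump            traj all custom 50 lmp_gcmc.lammpstrj id type q x y z
dump_modify     traj sort id   # write atoms in ID order so frames are easy to compare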

It may still be advisable to use either “m m m” boundaries or apply reflective or soft repulsive wall fixes to avoid lost atoms.
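
For example (a sketch; the fix ID is a placeholder, and the boundary command has to come before read_data):

# either switch to shrink-wrapped boundaries with a minimum size ...
boundary        m m m
# ... or keep "f f f" and reflect atoms at the box faces
fix             box_walls all wall/reflect xlo EDGE xhi EDGE ylo EDGE yhi EDGE zlo EDGE zhi EDGE
# (a soft repulsive alternative would be fix wall/harmonic)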

Then you are advised to seek out some experienced collaborator(s). There are many subtle issues with MD in general and ReaxFF in particular and you cannot learn about those effectively from textbooks, published literature, or online forums.

Thank you for your answer,

The version I am using is a precompiled version (29 Sep 2021) from the supercomputer I use (Red Hat Enterprise Linux, Intel Classic C++ 2021.2 / Intel(R) C++ g++ 4.8.5 mode with OpenMP 5.0 preview 1). I’ve never had any issue of this sort before.

I compiled my own version (23 Jun 2022) on another supercomputer (SUSE Linux Enterprise Server, GNU C++ 10.2.0 20200723 (Cray Inc.) with OpenMP 4.5) with the help of the HPC support and was able to reproduce the issue.

So far, however, I have not been able to reproduce the issue on my laptop (Apple M1, ARM64, Homebrew version).

That’s what I thought, but to be fair, given my limited experience with MD I also figured that a mistake on my part is more likely than a bug in the code. That’s why I posted my input here to have it checked by experts.

Couldn’t it be because the system is small and unbalanced (the nanoparticle sits at the center of the box), so that some processors have many more atoms than others?

For now I am trying to navigate with my (average) knowledge of statistical mechanics and common sense. I have been trying to find knowledgeable collaborators, so far without success.

What would be important is to have a stack trace – either from valgrind or from gdb – to locate in which part of the code the memory corruption happens.

Unlikely. How many processors do you use? Given the small size of the system, there is not much to gain from using many MPI ranks. It may be beneficial to use multi-threading instead. If there is memory corruption due to over-parallelization, it would be in the bond-order code of ReaxFF; that can be adjusted with settings on the pair_style command line (safezone, mincap, minhbonds), and the risk can be minimized by compiling/using the (serial) KOKKOS version of the pair style.
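
For example, the memory reservation keywords go on the pair_style line (a sketch; the numbers below are illustrative, somewhat larger-than-default values to be tuned for your system):

pair_style      reaxff NULL safezone 1.6 mincap 100 minhbonds 50
pair_coeff      * * ./ffield.reax.pt_o_h Pt O
# (as noted above, the KOKKOS version of the pair style is less prone to such buffer overflows)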

The fact that you are using ReaxFF is a significant complication. Perhaps you should first try a simpler system, e.g. using EAM for a metal nanoparticle and some other element to deposit on it that can be modeled by lj/cut.
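
A minimal sketch of such a practice setup (the EAM file name, type numbering, and LJ parameters are placeholders, not recommendations):

pair_style      hybrid eam lj/cut 10.0
pair_coeff      1 1 eam Pt_u3.eam      # type 1 = Pt; example funcfl file shipped with LAMMPS
pair_coeff      2 2 lj/cut 0.10 3.0    # type 2 = adsorbate; epsilon/sigma are placeholders
pair_coeff      1 2 lj/cut 0.15 2.8    # metal-adsorbate cross term, also a placeholder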

With the nanoparticle in the center of the box, you should be able to use 8 MPI ranks without having to worry about load imbalance or MPI ranks without atoms (further parallelization could then be obtained with MPI+OpenMP). It would also be preferable to start from a system already (over)populated with oxygen atoms: because of the memory-management issues I mentioned, it is always better to remove atoms than to add them. With the ReaxFF implementation in LAMMPS, the number of atoms (and of pairwise interactions) may only grow by a limited amount, since it assumes that these numbers do not change much during a run.
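
A rough sketch of what I mean (the atom count, seeds, and overlap cutoffs are placeholders; the processors command has to come before read_data, the rest after your regions and groups are defined):

processors      2 2 2                  # one MPI rank per octant of the box when running on 8 ranks
create_atoms    2 random 200 94823 mc  # overfill the spherical shell region with oxygen
group           gas type 2             # re-issue so the newly created atoms join the gas group
delete_atoms    overlap 1.5 gas nano   # drop insertions sitting too close to the nanoparticle
delete_atoms    overlap 1.0 gas gas    # drop one atom of each overlapping O-O pair
# then re-minimize before starting fix gcmc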

I use 4 MPI processes. I tried multi-threading (2 OpenMP threads), which gives a slight speed-up.

I will have a look at that.

Thank you for the advice.

My predecessor and I did try some Gupta potentials before, with relative success (also without oxygen). This ReaxFF force field reproduces adsorption energies (and some energy barriers) with an accuracy that we cannot hope to get from simpler potentials. Without going too much into detail, the nanoparticles we use have “realistic” (non-perfect) shapes taken from experiments, with quite a lot of steps, kinks and ad-atoms. We found that simpler potentials are less accurate for these shapes, sometimes leading to weird geometries. We hope to feed some of these geometries into higher-level theories (ab initio).

I didn’t mean for you to use those for research, but as a means to build your skills. Both fix gcmc and ReaxFF are advanced simulation techniques and thus require more experience, care, and practice. If you experiment with a system that leaves out one of the complications until you are more comfortable and confident in your choice of settings, it becomes much easier to acquire those skills, to tell when something is wrong, and to correct for it.
