Does OpenMP respect LD_PRELOAD?

GROMACS version: 2023.1
GROMACS modification: No

I’m currently profiling GROMACS 2023.1 with CUDA acceleration. I’m trying to trace UVM page faults using NVIDIA Nsight Systems together with a custom cudaMalloc shim library. However, it seems that GMX doesn’t call the CUDA API itself; instead, OpenMP spawns threads that make the CUDA library calls. If that is truly the case, does OpenMP respect the LD_PRELOAD setting for my shim library? Currently the spawned threads don’t appear to be using my shim library. I’ve verified that my cudaMalloc shim library works outside of GROMACS.
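For reference, the kind of shim I mean looks roughly like the sketch below. It is a minimal illustration, not my exact library: the `cudaError_t` typedef is a stand-in so the file builds without the CUDA headers, and `SHIM_ERR_UNRESOLVED` and `shim_call_count` are hypothetical names used only here. It assumes glibc and a target that resolves `cudaMalloc` through the dynamic linker.

```c
#define _GNU_SOURCE            /* exposes RTLD_NEXT on glibc */
#include <dlfcn.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdio.h>

/* Assumption: stand-in for the enum declared in the CUDA runtime headers. */
typedef int cudaError_t;

/* Hypothetical nonzero code returned when no real cudaMalloc is found. */
#define SHIM_ERR_UNRESOLVED 999

typedef cudaError_t (*cudaMalloc_fn)(void **, size_t);

/* Hypothetical counter; atomic because several threads may allocate. */
atomic_ulong shim_call_count;

/* Same name and C linkage as the runtime's cudaMalloc, so a preloaded
 * copy of this function is found first during dynamic symbol lookup. */
cudaError_t cudaMalloc(void **devPtr, size_t size)
{
    atomic_fetch_add(&shim_call_count, 1);

    /* Look up the next definition of cudaMalloc after this object.
     * Done on every call to keep the sketch free of shared mutable state. */
    cudaMalloc_fn real = (cudaMalloc_fn)dlsym(RTLD_NEXT, "cudaMalloc");
    if (real == NULL) {
        fprintf(stderr, "[shim] cudaMalloc(%zu): no underlying symbol\n", size);
        return SHIM_ERR_UNRESOLVED;
    }

    cudaError_t rc = real(devPtr, size);
    fprintf(stderr, "[shim] cudaMalloc(%zu) -> %d, ptr=%p\n", size, (int)rc, *devPtr);
    return rc;
}
```

I build it with something like `gcc -shared -fPIC shim.c -o libshim.so -ldl` and run with `LD_PRELOAD=./libshim.so` set in the environment. As far as I understand, LD_PRELOAD applies to the whole process at load time, so every thread in it shares the same symbol resolution. A shim like this can also only intercept calls that go through the dynamic linker, so it would see nothing if the binary were linked against a static CUDA runtime.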
