A few resources to help enable using Distributed.jl on a SLURM cluster

I created some resources for a few colleagues new to Julia, to enable them to use our SLURM cluster effectively, using existing packages like Distributed.jl and ClusterManagers.jl.

These scripts and resources do nothing novel or imaginative; they are an example of how one can use the existing packages while abstracting away some details. The aim of the scripts is that one can write code in a file using Distributed.jl (pmap, workers, @distributed, etc.) in a way that works locally on one’s own machine, and then use these resources to run that code on a cluster without changing the source code.
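As a rough illustration of the kind of source file this is aimed at (not the code from the gist), here is a minimal sketch that uses only Distributed.jl. The function name `simulate` and the choice of two local workers are assumptions made for the example. The idea is that the file only adds workers itself when none exist, so a cluster launcher that has already added workers (for instance with ClusterManagers.jl) could run the same file unchanged.

```julia
using Distributed

# Locally, start two workers if none exist yet. On a cluster, a launcher script
# would add workers before this file runs (for example with ClusterManagers.jl:
# `addprocs(SlurmManager(n))`), so this branch would be skipped.
if nprocs() == 1
    addprocs(2)
end

# Define the work function on every process (hypothetical example workload).
@everywhere simulate(x) = x^2 + 1

# pmap hands each input to whichever worker is free and collects the results in order.
results = pmap(simulate, 1:8)

# @distributed with a reducer splits the range across workers and sums the pieces.
total = @distributed (+) for i in 1:8
    simulate(i)
end

println("workers: ", nworkers())
println("results: ", results)
println("total:   ", total)
```

Nothing in the body of the script refers to SLURM, which is what lets the worker setup be swapped out separately.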

Additionally, the gist has some examples of how to allocate a GPU to each process, so each worker can run code using CUDA.jl.
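One simple way to think about the GPU allocation is a round-robin mapping from workers to device indices. The sketch below is an assumption about how this could be done, not the approach taken in the gist: `assign_gpus` is a hypothetical helper, and the GPU count is hard-coded so the example runs without CUDA.jl installed. The comment marks where the CUDA.jl call would go.

```julia
using Distributed

# Hypothetical helper: pair each worker id with a zero-based GPU index, round-robin.
function assign_gpus(worker_ids, ngpus)
    ngpus > 0 || error("no GPUs available")
    return [w => (i - 1) % ngpus for (i, w) in enumerate(worker_ids)]
end

# Assumed GPU count for illustration; with CUDA.jl loaded one could use
# `length(CUDA.devices())` instead.
ngpus = 2

assignment = assign_gpus(workers(), ngpus)

for (w, gpu) in assignment
    # With CUDA.jl loaded on every worker (`@everywhere using CUDA`), the device
    # could be selected on worker `w` with something like:
    #   remotecall_wait(() -> CUDA.device!(gpu), w)
    println("worker ", w, " -> GPU ", gpu)
end
```

If there are more workers than GPUs, this mapping shares devices between workers; whether that is acceptable depends on the memory each worker needs.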

All of the files are in a gist here:
gist.github.com/JamieMair/0b1ffbd4ee424c173e6b42fe756e877a



