cluster computing – Is it possible to execute post-script after slurm job execution?

The short answer is that there is no such option in Slurm.

If post-script.sh can run on a compute node, the best option would be

  • if it is short: to add it at the end of the job submission script
  • if it is long: to submit it as its own job and use the --dependency option to make it start at the end of the first job.
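For the second case, a minimal sketch of chaining the two submissions could look like this. The helper name submit_with_post and the file names job.sh and post-script.sh are placeholders, and afterany is only one possible dependency type (afterok would run the post-script only if the first job succeeded).

```shell
#!/bin/bash
# Sketch: submit a main job, then a second job that starts once the first has ended.
# submit_with_post is a hypothetical helper name, not part of Slurm.

submit_with_post() {
    local main_script=$1 post_script=$2 jobid
    # --parsable makes sbatch print only the job ID
    # (followed by ";cluster" on multi-cluster setups).
    jobid=$(sbatch --parsable "$main_script") || return 1
    jobid=${jobid%%;*}   # drop the optional ";cluster" suffix
    # afterany: start when the first job has terminated, whatever its exit status.
    sbatch --parsable --dependency="afterany:${jobid}" "$post_script"
}

# Usage (from the login node):
#   submit_with_post job.sh post-script.sh
```

The function prints the job ID of the dependent job, so it can itself be chained further if needed.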

If you have root privileges, you can use strigger to run post-script.sh after the job has completed. That would run on the slurmctld server.
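As a rough sketch, registering such a trigger for a given job could be wrapped like this. The helper name set_post_trigger and the example path are assumptions for illustration; the strigger options used are --set, --jobid, --fini and --program, and the program has to be given as an absolute path.

```shell
#!/bin/bash
# Sketch: register a trigger that runs a program when a given job finishes.
# set_post_trigger is a hypothetical helper name; run it with sufficient privileges.

set_post_trigger() {
    local jobid=$1 program=$2
    # strigger expects a fully qualified pathname for --program.
    case $program in
        /*) ;;
        *)  echo "program must be an absolute path: $program" >&2
            return 1 ;;
    esac
    strigger --set --jobid="$jobid" --fini --program="$program"
}

# Usage:
#   set_post_trigger 12345 /home/user/post-script.sh
```

Triggers are processed periodically rather than instantly, so expect a short delay between the end of the job and the start of the script.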

If post-script.sh must run on the login node, for external network access for instance, then the options first mentioned would work if you are able/allowed to SSH from a compute node to a login node. This is sometimes prevented/forbidden, but if not, then you can run ssh login.node bash post-script.sh at the end of the submission script or in a job of its own.
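A small sketch of that SSH call, wrapped so it can be dropped at the end of a submission script. The helper name run_on_login_node is made up for this example, and login.node stands for whatever your login host is called. BatchMode=yes is an addition here: it makes ssh fail instead of waiting for a password prompt that nobody can answer inside a batch job, so it assumes key-based authentication is set up.

```shell
#!/bin/bash
# Sketch: run a script on the login node from within a job.
# run_on_login_node is a hypothetical helper name.

run_on_login_node() {
    local host=$1 script=$2
    # BatchMode=yes: never prompt for a password or passphrase, fail instead.
    ssh -o BatchMode=yes "$host" bash "$script"
}

# Usage, as the last line of the submission script:
#   run_on_login_node login.node post-script.sh
```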

If that is not a possibility, then “busy polling” is indeed needed. You can do it in a Bash loop, making sure not to put too large a burden on the Slurm server (every 5 minutes is OK, every 5 seconds is useless and harmful to the system).
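Such a loop could be sketched as follows. The helper name wait_for_job is hypothetical, and the list of states treated as "still active" is an assumption you may want to adjust for your site.

```shell
#!/bin/bash
# Sketch: block on the login node until a job is no longer active.
# wait_for_job is a hypothetical helper name.

wait_for_job() {
    local jobid=$1 interval=${2:-300}   # default: poll every 5 minutes
    # -h suppresses the header, so the output is empty once the job is gone.
    # The state list is explicit so that only active jobs keep the loop going.
    while [ -n "$(squeue -h -j "$jobid" \
                  -t PENDING,CONFIGURING,RUNNING,SUSPENDED,COMPLETING 2>/dev/null)" ]; do
        sleep "$interval"
    done
}

# Usage (from the login node, e.g. inside screen or tmux):
#   wait_for_job 12345 && bash post-script.sh
```

The loop also ends if squeue itself fails (for instance when the job ID is no longer known to the controller), since the error message is discarded and the output is then empty.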

You can also use a dedicated workflow management tool such as Maestro that will allow you to define a job and a dependent task to run on the login node.

See some general information about workflows on HPC systems here.
