cluster computing – Is it possible to execute post-script after slurm job execution?

The short answer is that there is no such option in Slurm.

If the post-script can run on a compute node, the best option would be

  • if it is short: to add it at the end of the job submission script
  • if it is long: to submit it in its own job and use the --dependency option to make it start at the end of the first job.
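
For the long case, the chaining could be sketched roughly as follows. The function name submit_with_post and the file names main_job.sh and post_script.sh are hypothetical, and afterany is only one possible dependency type (afterok would start the post-script only if the first job succeeded).

```shell
#!/bin/bash
# Sketch: submit a main job, then a second job that waits for the first to end.
# submit_with_post, main_job.sh and post_script.sh are hypothetical names.

submit_with_post() {
    local main_script=$1 post_script=$2 main_id
    # --parsable makes sbatch print only the job ID (optionally followed by ";cluster")
    main_id=$(sbatch --parsable "$main_script") || return 1
    main_id=${main_id%%;*}   # drop the ";cluster" suffix if present
    # afterany: eligible to start once the main job has ended, whatever its exit status
    sbatch --parsable --dependency="afterany:${main_id}" "$post_script"
}

# Example call (on a machine where sbatch is available):
#   submit_with_post main_job.sh post_script.sh
```

The function prints the job ID of the dependent job, so further steps can be chained in the same way.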

If you have root privileges, you can use strigger to run the post-script after the job has completed. It would then run on the slurmctld server.
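
As a rough sketch, a trigger on job completion could be registered like this. The function name set_post_trigger, the job ID and the script path are placeholders; whether your account is allowed to set triggers depends on how the cluster is configured.

```shell
#!/bin/bash
# Sketch: register a trigger that runs a program when a given job finishes.
# set_post_trigger is a hypothetical helper; the program should be given
# as an absolute path, since it is executed on the slurmctld host.

set_post_trigger() {
    local job_id=$1 program=$2
    # --fini fires when the specified job completes its execution
    strigger --set --jobid="$job_id" --fini --program="$program"
}

# Example call (job ID and path are placeholders):
#   set_post_trigger 1234 /path/to/post_script.sh
```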

If the post-script must run on the login node, for external network access for instance, then the options mentioned first still work, provided you are able and allowed to SSH from a compute node to a login node. This is sometimes prevented or forbidden, but if it is not, you can run ssh login.node bash at the end of the submission script or in a job of its own.
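
One way to do this, assuming SSH from the compute node works without a password prompt, is to feed the script to a remote bash over stdin. The helper name run_on_login_node and the file name post_script.sh are hypothetical; login.node stands for your actual login host.

```shell
#!/bin/bash
# Sketch: run a local script on the login node from within a job.
# run_on_login_node is a hypothetical helper name.

run_on_login_node() {
    local host=$1 script=$2
    # "bash -s" makes the remote bash read its commands from stdin,
    # so the script does not need to exist on the remote side
    ssh "$host" bash -s < "$script"
}

# Example call, e.g. as the last line of the submission script:
#   run_on_login_node login.node post_script.sh
```

If the home directory is shared between compute and login nodes, passing the script path as an argument to the remote bash would work as well.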

If that is not a possibility, then “busy polling” is indeed needed. You can do it in a Bash loop, making sure not to put too large a burden on the Slurm server (polling every 5 minutes is OK; polling every 5 seconds is useless and harmful to the system).
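
A minimal sketch of such a loop, to be run on the login node, might look like this. The function name wait_for_job and the 300-second default are assumptions. The loop treats empty squeue output as "job gone"; depending on the Slurm configuration, a finished job may still be listed for a short while, which only delays the post-script.

```shell
#!/bin/bash
# Sketch: wait on the login node until a job no longer appears in squeue.
# wait_for_job is a hypothetical helper; the interval is in seconds.

wait_for_job() {
    local job_id=$1 interval=${2:-300}
    # -h suppresses the header, so the output is empty once the job is
    # no longer known to squeue (errors for purged job IDs are discarded)
    while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
        sleep "$interval"
    done
}

# Example call (job ID and script name are placeholders):
#   wait_for_job 1234 && bash post_script.sh
```

Running this inside nohup, screen or tmux keeps it alive after you log out.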

You can also use a dedicated workflow management tool such as Maestro that will allow you to define a job and a dependent task to run on the login node.

See some general information about workflows on HPC systems here.
