I have set RealMemory=58000 (although the instances have 64 GB) in the file slurm_parallelcluster_queue_partition.conf.
When I run the following commands:
sbatch -N 1 -n 1 --mem=40000 --wrap="srun script.sh first_task"
sbatch -N 1 -n 1 --mem=40000 --wrap="srun script.sh second_task"
(and more jobs of the same form)
They are all assigned to a single instance, which leads to 'Cannot allocate memory' errors in Java. I expected them to be assigned to two instances, since the total requested memory (2 x 40000 MB = 80000 MB) already exceeds RealMemory (58000 MB).
Am I misunderstanding something? Or do you have any suggestions about this unexpected behavior?
Also, these tasks are never assigned to a 32 GB instance, so RealMemory and --mem already seem to be taken into account to some extent.
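To make the expectation concrete, here is a minimal sketch of the arithmetic, followed by two checks that might help narrow this down. The variable names are only illustrative. As far as I understand, Slurm only treats memory as a consumable resource when SelectTypeParameters includes a memory option such as CR_CPU_Memory; whether that is the cause here is an assumption I have not confirmed for this cluster.

```shell
#!/bin/sh
# Values taken from the post; both are in megabytes.
real_memory=58000   # RealMemory configured for the node
mem_per_job=40000   # --mem requested by each sbatch job

# Number of such jobs that fit on one node if memory is enforced per node.
jobs_per_node=$((real_memory / mem_per_job))
echo "jobs per node by memory: $jobs_per_node"

# The checks below only run where the Slurm client tools are installed.
if command -v scontrol >/dev/null 2>&1; then
    # Shows whether the select plugin is configured to track memory
    # (for example CR_CPU_Memory or CR_Core_Memory) or only CPUs.
    scontrol show config | grep -i SelectTypeParameters || true
    # Lists each node with its RealMemory and currently allocated memory.
    scontrol show nodes | grep -E 'NodeName=|RealMemory=' || true
fi
```

With the values above, integer division gives one job per node, which is why I expected the second job to land on a different instance.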
Thanks very much.