
Explicit time limit honoured

From -01-18, ECMWF will enforce the wall-clock limit on jobs that request one explicitly: any job submitted with the #SBATCH --time directive or the --time command line option will be killed once it reaches that limit.
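As an illustration, here is a minimal job script sketch that sets an explicit wall-clock limit; the job name, the limit value and the payload command are placeholder assumptions, not taken from this page:

#!/bin/bash
#SBATCH --job-name=timed-job     # hypothetical job name
#SBATCH --time=01:30:00          # explicit limit: the job is killed once 1h30m of wall time is reached

./my_program                     # placeholder for the actual workload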

...

For more details, please refer to HPC2020: Job Runtime Management.

New maximum memory limit per node in parallel jobs

A maximum of 240 GB of memory per node will be enforced from -01-18. This avoids potential out-of-memory situations by ensuring enough memory is left for the Operating System and other critical services running on those nodes.

Any parallel job explicitly asking for more than 240 GB with the #SBATCH --mem directive or the --mem command line option will fail at submission time with the message:

sbatch: error: Memory specification can not be satisfied
sbatch: error: Batch job submission failed: Requested node configuration is not available

Since parallel jobs are assigned nodes exclusively, and can therefore use all the memory available on those nodes, it is usually easier to avoid defining that option altogether in parallel jobs. A sketch of such a job is shown below.
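As a sketch under stated assumptions (the QoS name, node and task counts are illustrative, not prescribed by this page), a parallel job can simply rely on the exclusive node allocation and omit --mem entirely:

#!/bin/bash
#SBATCH --qos=np                 # assumed parallel QoS name; check your site's documentation
#SBATCH --nodes=2                # nodes are allocated exclusively, so all node memory is usable
#SBATCH --ntasks-per-node=128    # illustrative task count
#SBATCH --time=00:30:00
# No --mem directive: requesting more than 240 GB here would fail at submission

srun ./my_parallel_program       # placeholder parallel executable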