Explicit time limit honoured
From 2023-01-18, ECMWF will kill jobs that reach their wall time limit if the #SBATCH --time directive or the --time command line option was provided with the job.
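As an illustration, a minimal job script with an explicit wall time limit might look like the sketch below. The job name, the 30-minute limit and the output file name are placeholder values chosen for this example, not ECMWF recommendations; the time is given in Slurm's hours:minutes:seconds format.

```shell
#!/bin/bash
# Placeholder values throughout: adjust name, limit and output file to the actual job.
#SBATCH --job-name=example_job
#SBATCH --time=00:30:00
#SBATCH --output=example_job.%j.out

# SLURM_JOB_ID is set by Slurm inside a job; the fallback covers runs outside Slurm.
job_id="${SLURM_JOB_ID:-not-in-slurm}"
echo "Running job ${job_id} with an explicit wall time limit"
```

The same limit could instead be passed at submission time, for example with sbatch --time=00:30:00 followed by the script name.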
Alternatively, ECMWF accepts jobs without the #SBATCH --time directive or the --time command line option. In that case ECMWF will instead use the average runtime of previous "similar" jobs, identified by a job tag generated from the user, job name, job geometry and job output path.
For more detail please refer to HPC2020: Job Runtime Management.
New maximum memory limit per node in parallel jobs
A maximum value of 240 GB of memory per node will be enforced from 2023-01-18. This will avoid potential out-of-memory situations, ensuring enough memory is left for the operating system and other critical services running on those nodes.
Any parallel job explicitly asking for more than 240 GB with the #SBATCH --mem directive or the --mem command line option will fail at submission time. Since parallel jobs are assigned nodes exclusively, and can therefore use all the memory available on those nodes, it is usually easier to avoid defining that option altogether in parallel jobs.
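For scripts that still need to set --mem, a simple pre-submission check along the following lines could catch requests above the limit before they reach the scheduler. This is only a sketch: the function name mem_within_limit is hypothetical, and it assumes the request is expressed in whole gigabytes.

```shell
#!/bin/bash
# Per-node limit for parallel jobs, taken from the announcement above.
MAX_MEM_GB=240

# Hypothetical helper: succeeds if the requested per-node memory
# (an integer number of GB) does not exceed the limit.
mem_within_limit() {
    local requested_gb="$1"
    [ "$requested_gb" -le "$MAX_MEM_GB" ]
}

if mem_within_limit 200; then
    echo "200 GB: within the limit"
fi
if ! mem_within_limit 300; then
    echo "300 GB: above the limit, would fail at submission time"
fi
```

For most parallel jobs, leaving out --mem entirely remains the simpler choice, since the nodes are allocated exclusively.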