Slurm is the available batch system. Any script can be submitted as a job unchanged; see Writing SLURM jobs for how to customise it.

To submit a script as a serial job with default options, enter the command:

...

Note

Currently, the "scancel" command must be executed on the login node of the same cluster where the job is running.

See the Slurm documentation for more details on the different commands available to submit, query or cancel jobs.
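
As a rough illustration of the submit, query and cancel cycle, the sketch below wraps the standard Slurm commands `sbatch`, `squeue` and `scancel` from Python. The function names (`submit`, `query`, `cancel`, `parse_job_id`) are hypothetical helpers, not part of Slurm or of this service, and the sketch assumes `sbatch` reports submissions with its usual "Submitted batch job <id>" message.

```python
import re
import subprocess
from typing import Optional

# sbatch normally prints "Submitted batch job <id>", optionally followed
# by " on cluster <name>" when several clusters are configured.
_JOB_ID_PATTERN = re.compile(r"Submitted batch job (\d+)")


def parse_job_id(sbatch_output: str) -> Optional[int]:
    """Extract the job ID from sbatch's confirmation message, if present."""
    match = _JOB_ID_PATTERN.search(sbatch_output)
    return int(match.group(1)) if match else None


def submit(script_path: str) -> Optional[int]:
    """Submit a script with default options and return its job ID."""
    result = subprocess.run(
        ["sbatch", script_path], capture_output=True, text=True, check=True
    )
    return parse_job_id(result.stdout)


def query(job_id: int) -> str:
    """Return the squeue listing for a single job."""
    result = subprocess.run(
        ["squeue", f"--jobs={job_id}"], capture_output=True, text=True, check=True
    )
    return result.stdout


def cancel(job_id: int) -> None:
    """Cancel a job. Per the note above, run this on the login node of
    the cluster where the job is running."""
    subprocess.run(["scancel", str(job_id)], check=True)
```

Only `parse_job_id` can be tried without access to a Slurm installation; the other three functions need the Slurm client commands on the `PATH`.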

...

| QoS name | Type | Suitable for... | Shared nodes | Maximum jobs per user | Default / Max Wall Clock Limit | Default / Max CPUs | Default / Max Memory |
|---|---|---|---|---|---|---|---|
| nf | fractional | serial and small parallel jobs. It is the default | Yes | - | average runtime + standard deviation / 2 days | 1 / 64 | 8 GB / 128 GB |
| ni | interactive | serial and small parallel interactive jobs | Yes | 1 | 12 hours / 7 days | 1 / 32 | 8 GB / 32 GB |
| np | parallel | parallel jobs requiring more than half a node | No | - | average runtime + standard deviation / 2 days | - | 240 GB / 240 GB per node (all usable memory in a node) |
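
To make the limits in the table above concrete, here is a minimal sketch that checks a resource request against them before building an `sbatch` command line. The `QOS_LIMITS` dictionary and `build_sbatch_command` function are hypothetical names introduced for illustration; the numbers are transcribed from the table, and mapping the CPU count to `--cpus-per-task` is a simplifying assumption (parallel jobs on np would normally be described in terms of tasks and nodes).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class QosLimits:
    """Upper limits for one QoS (None means the table lists no limit)."""
    max_cpus: Optional[int]
    max_memory_gb: int
    max_walltime_hours: int


# Values transcribed from the QoS table; the structure is illustrative.
QOS_LIMITS = {
    "nf": QosLimits(max_cpus=64, max_memory_gb=128, max_walltime_hours=48),
    "ni": QosLimits(max_cpus=32, max_memory_gb=32, max_walltime_hours=7 * 24),
    "np": QosLimits(max_cpus=None, max_memory_gb=240, max_walltime_hours=48),
}


def build_sbatch_command(
    script: str,
    qos: str = "nf",
    cpus: int = 1,
    memory_gb: int = 8,
    walltime_hours: Optional[int] = None,
) -> List[str]:
    """Validate a request against the QoS limits and return sbatch arguments."""
    limits = QOS_LIMITS[qos]
    if limits.max_cpus is not None and cpus > limits.max_cpus:
        raise ValueError(f"{qos} allows at most {limits.max_cpus} CPUs")
    if memory_gb > limits.max_memory_gb:
        raise ValueError(f"{qos} allows at most {limits.max_memory_gb} GB")
    if walltime_hours is not None and walltime_hours > limits.max_walltime_hours:
        raise ValueError(f"{qos} allows at most {limits.max_walltime_hours} hours")

    command = [
        "sbatch",
        f"--qos={qos}",
        f"--cpus-per-task={cpus}",
        f"--mem={memory_gb}G",
    ]
    # Leave --time out unless requested, so the default wall clock limit applies.
    if walltime_hours is not None:
        command.append(f"--time={walltime_hours}:00:00")
    command.append(script)
    return command
```

Calling `build_sbatch_command("job.sh", qos="ni", cpus=64)` raises `ValueError`, since the table caps ni at 32 CPUs. The script name `job.sh` is a placeholder.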



GPU special Partition

On the AC complex there is also the ng queue that gives access to the special partition with GPU-enabled nodes. See HPC2020: GPU usage for AI and Machine Learning for all the details on how to make use of those special resources.


| QoS name | Type | Suitable for... | Shared nodes | Maximum jobs per user | Default / Max Wall Clock Limit | Default / Max CPUs | Default / Max Memory per node |
|---|---|---|---|---|---|---|---|
| ng | GPU | serial and small parallel jobs. It is the default | Yes | - | average runtime + standard deviation / 2 days | 1 / - | 8 GB / 500 GB |



ECS

For those using ECS, these are the different QoS (or queues) available for standard users of this service:

...