Slurm is the available batch system. Any script can be submitted as a job without changes, but see Writing SLURM jobs to learn how to customise it.

To submit a script as a serial job with default options, enter the command:

...

Note
Currently, the "scancel" command must be run on a login node of the same cluster where the job is running.

See the Slurm documentation for more details on the different commands available to submit, query or cancel jobs.
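
As a rough illustration of the submit, query and cancel cycle, the sketch below uses the standard Slurm commands sbatch, squeue and scancel. The file name hello_job.sh is hypothetical, and the Slurm commands only run if sbatch is found on the current host.

```shell
#!/bin/bash
# Write a tiny placeholder job script (hypothetical name: hello_job.sh).
cat > hello_job.sh <<'EOF'
#!/bin/bash
echo "Hello from $(hostname)"
EOF

if command -v sbatch >/dev/null 2>&1; then
    # --parsable makes sbatch print only the job ID (optionally followed by ";cluster").
    jobid=$(sbatch --parsable hello_job.sh | cut -d';' -f1)
    # Query the state of the job just submitted.
    squeue -j "$jobid"
    # Cancel the job; run this on a login node of the cluster where the job is running.
    scancel "$jobid"
else
    echo "sbatch not found: run this sketch on a login node"
fi
```

Without any options, the job is submitted with the defaults of the default QoS.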

...

QoS name	Type	Suitable for...	Shared nodes	Maximum jobs per user	Default / Max Wall Clock Limit	Default / Max CPUs	Default / Max Memory
nf	fractional	serial and small parallel jobs (default QoS)	Yes	-	average runtime + standard deviation / 2 days	1 / 64	8 GB / 128 GB
ni	interactive	serial and small parallel interactive jobs	Yes	1	12 hours / 7 days	1 / 32	8 GB / 32 GB
np	parallel	parallel jobs requiring more than half a node	No	-	average runtime + standard deviation / 2 days	-	240 GB / 240 GB per node (all usable memory in a node)
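
A minimal sketch of a job script that selects a QoS and requests resources within the limits in the table above. The file name, job name and requested values are illustrative assumptions; the #SBATCH options themselves (--qos, --time, --cpus-per-task, --mem, --output) are standard sbatch options.

```shell
#!/bin/bash
# Create an example job script (hypothetical name: nf_job.sh).
cat > nf_job.sh <<'EOF'
#!/bin/bash
# Directives must appear before the first executable command.
#SBATCH --job-name=nf-example
#SBATCH --qos=nf
#SBATCH --time=01:00:00
#SBATCH --cpus-per-task=4
#SBATCH --mem=16G
#SBATCH --output=nf-example.%j.out

echo "Running on $(hostname) with ${SLURM_CPUS_PER_TASK:-1} CPUs"
EOF

# List the directives that sbatch would read from the script.
grep '^#SBATCH' nf_job.sh
```

The values chosen here (1 hour, 4 CPUs, 16 GB) sit inside the nf limits of 2 days, 64 CPUs and 128 GB. In the output file name, %j is replaced by the job ID.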

GPU special Partition

On the AC complex, the ng queue also gives access to the special partition with GPU-enabled nodes. See HPC2020: GPU usage for AI and Machine Learning for details on how to use those resources.

QoS name	Type	Suitable for...	Shared nodes	Maximum jobs per user	Default / Max Wall Clock Limit	Default / Max CPUs	Default / Max Memory per node
ng	GPU	serial and small parallel jobs	Yes	-	average runtime + standard deviation / 2 days	1 / -	8 GB / 500 GB
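
A hedged sketch of what a job script for the ng QoS might look like. The file name and requested values are illustrative, and requesting the GPU with the generic Slurm option --gpus is an assumption; the page referenced above is the authority on how GPUs should be requested on this system.

```shell
#!/bin/bash
# Create an example GPU job script (hypothetical name: ng_job.sh).
cat > ng_job.sh <<'EOF'
#!/bin/bash
#SBATCH --qos=ng
# Assumption: one GPU requested through the generic --gpus option.
#SBATCH --gpus=1
#SBATCH --time=00:30:00
#SBATCH --mem=32G

echo "Running on $(hostname), allocated GPUs: ${SLURM_JOB_GPUS:-unknown}"
EOF

# List the directives that sbatch would read from the script.
grep '^#SBATCH' ng_job.sh
```

The 32 GB memory request is within the 500 GB per-node maximum listed for ng.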

ECS

For those using ECS, these are the different QoS (or queues) available for standard users of this service:

...
