Slurm is the batch system available. Any script can be submitted as a job without changes, but see Writing SLURM jobs if you want to customise it.

To submit a script as a serial job with default options, enter the command:

sbatch yourscript.sh
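As a minimal sketch of the submission step, the following writes a trivial job script and defines a helper that submits it and prints only the job ID. The file name hello.sh and the function name submit_job are hypothetical and used for illustration only. The --parsable option makes sbatch print just the job ID, optionally followed by ";" and the cluster name.

```shell
#!/bin/bash
# Write a trivial job script (hello.sh is a hypothetical name).
cat > hello.sh <<'EOF'
#!/bin/bash
echo "Running on $(hostname)"
EOF

# submit_job: hypothetical helper that submits a script with default options
# and prints only the job ID.
submit_job() {
    local out
    # --parsable prints "jobid" or "jobid;cluster" instead of the usual message
    out=$(sbatch --parsable "$@") || return 1
    # Strip the optional ";cluster" suffix
    printf '%s\n' "${out%%;*}"
}

# Usage (on a node where sbatch is available):
#   jobid=$(submit_job hello.sh)
```

Keeping the job ID in a variable makes it easy to pass to later query or cancel commands.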

You may query the queues to see the jobs currently running or pending with:

squeue
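By default squeue lists jobs from all users. A small sketch of how the output might be narrowed down follows; the helper name job_state is hypothetical, while the -u, -h, -j and -o options are standard squeue options.

```shell
#!/bin/bash
# job_state: hypothetical helper printing the state of a single job
# (for example PENDING or RUNNING). Prints nothing once the job has left the queue.
job_state() {
    # -h: no header, -j: select one job, -o %T: print only the job state
    squeue -h -j "$1" -o %T 2>/dev/null
}

# Usage:
#   squeue -u "$USER"     # list only your own jobs
#   job_state <jobid>     # print the state of one job
```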

You can cancel a job with:

scancel <jobid>

The "scancel" command should be executed on a login node on the same cluster as the job.
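Besides cancelling by job ID, scancel can select jobs by user and state. The sketch below cancels all of your own jobs that are still pending; the function name cancel_pending is hypothetical.

```shell
#!/bin/bash
# cancel_pending: hypothetical helper cancelling all of your own jobs
# that have not started yet. Running jobs are left untouched.
cancel_pending() {
    # -u: restrict to one user, --state: restrict to jobs in that state
    scancel -u "${USER:-$(id -un)}" --state=PENDING
}

# Usage (on a login node of the cluster where the jobs were submitted):
#   cancel_pending
```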

See the Slurm documentation for more details on the different commands available to submit, query or cancel jobs.

QoS available

These are the different QoS (or queues) available for standard users on the four complexes:

| QoS name | Type | Suitable for... | Shared nodes | Maximum jobs per user | Maximum nodes per user | Default / Max Wall Clock Limit | Default / Max CPUs | Default / Max Memory per node |
|---|---|---|---|---|---|---|---|---|
| ng | GPU | Serial and small parallel jobs with GPU. This is the default QoS. | Yes | - | 4 | average runtime + standard deviation / 2 days | 1 / - | 8 GB / 500 GB |
| dg | GPU | Short debug jobs requiring GPU | Yes | 1 | 2 | average runtime + standard deviation / 30 min | 1 / - | 8 GB / 500 GB |
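A QoS other than the default is selected with the --qos option of sbatch, either on the command line or as a directive in the job script. The following sketch writes a hypothetical debug job script, debug_gpu.sh, that requests the dg QoS with an explicit wall clock limit below its 30 min maximum. Requesting the GPU with --gpus is an assumption about how this system is configured; some sites use --gres instead.

```shell
#!/bin/bash
# Write a hypothetical debug job script requesting the dg QoS.
cat > debug_gpu.sh <<'EOF'
#!/bin/bash
# Short debug QoS
#SBATCH --qos=dg
# Explicit wall clock limit of 10 minutes (dg allows at most 30 min)
#SBATCH --time=00:10:00
# Assumption: GPUs are requested with --gpus on this system
#SBATCH --gpus=1
echo "Debug run on $(hostname)"
EOF

# Submit with:
#   sbatch debug_gpu.sh
# or override the QoS on the command line:
#   sbatch --qos=ng debug_gpu.sh
```

Options given on the sbatch command line take precedence over the directives in the script.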

Time limit management

See AG: Job Runtime Management for more information on how the default Wall Clock Time limit is calculated.

Limits are not set in stone

Different limits on the different QoSs may be introduced or changed as the system evolves.

Checking QoS setup

If you want to get all the details of a particular QoS on the system, you may run, for example:

sacctmgr list qos names=ng
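The default output of sacctmgr is wide. As a sketch, the format option can restrict it to a few fields; the helper name qos_limits and the particular choice of fields are illustrative assumptions.

```shell
#!/bin/bash
# qos_limits: hypothetical helper printing selected limits of one QoS.
qos_limits() {
    # -n: no header, -P: pipe-separated output that is easy to parse
    # Fields: QoS name, max wall clock per job, max jobs per user, max resources per user
    sacctmgr -n -P list qos names="$1" format=Name,MaxWall,MaxJobsPU,MaxTRESPU
}

# Usage:
#   qos_limits ng
```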

Submitting jobs remotely

If you are submitting jobs from a different platform via ssh, please use the dedicated ag-batch node instead of the *-login equivalent:

ssh ag-batch "sbatch myjob.sh"
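In the command above, myjob.sh must exist on the remote side. If the script only exists on the local machine, one possible approach is to send it over standard input, since sbatch reads the job script from standard input when no file name is given. This is a sketch under that assumption; the helper name remote_submit is hypothetical.

```shell
#!/bin/bash
# remote_submit: hypothetical helper that submits a local script through ag-batch.
# The local file is passed on standard input, so it need not exist remotely.
remote_submit() {
    # --parsable makes sbatch print only the job ID (optionally ";cluster")
    ssh ag-batch "sbatch --parsable" < "$1"
}

# Usage (from the remote platform):
#   jobid=$(remote_submit myjob.sh)
```

The job then starts in the default remote working directory, so any paths used inside the script should be valid on the cluster.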
