...
Create a new job script sleepy.sh with the contents below:

    #!/bin/bash
    sleep 120

Submit sleepy.sh to the batch system and check its status. Once it is running, cancel it and inspect the output.

Solution: You can submit your job with:

    sbatch sleepy.sh

You can then check the state of your job with squeue:

    squeue -j <jobid>

if you use the <jobid> of the job you just submitted, or just:

    squeue --me

to list all your jobs.
To cancel your job, just run scancel:
    scancel <jobid>
If you inspect the output file from your last job, you will see a message like the following:
    slurmstepd: error: *** JOB 64281137 ON ab6-202 CANCELLED AT 2023-10-25T15:40:51 ***
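
The submit, check and cancel steps above can also be chained in a small script. The following is only a minimal sketch, not a site-recommended procedure: it assumes sleepy.sh is in the current directory, relies on `sbatch --parsable` so that only the job ID is printed, and introduces a hypothetical helper `jobid_from_sbatch` purely for illustration.

```bash
#!/bin/bash
# Sketch: submit a job, wait until it leaves the queue of pending jobs, then cancel it.
# jobid_from_sbatch is a hypothetical helper, not part of Slurm.

# "sbatch --parsable" prints "<jobid>" or "<jobid>;<cluster>";
# keep only the part before the first ';'.
jobid_from_sbatch() {
    printf '%s\n' "${1%%;*}"
}

# Only talk to the batch system if the Slurm client commands are available.
if command -v sbatch >/dev/null 2>&1 && [ -f sleepy.sh ]; then
    jobid=$(jobid_from_sbatch "$(sbatch --parsable sleepy.sh)")
    echo "Submitted job ${jobid}"

    # Poll the job state (no header, state column only) while it is still pending.
    while [ "$(squeue -h -j "${jobid}" -o %T)" = "PENDING" ]; do
        sleep 5
    done

    squeue -j "${jobid}"      # show the state of this job only
    scancel "${jobid}"        # cancel it
fi
```

Capturing the job ID at submission time avoids having to copy it by hand from the sbatch message into the squeue and scancel commands.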
Can you get information about the jobs you have run so far today, including those that have finished already?
Solution: When jobs finish, they will not appear in the squeue output any longer. You can then check the accounting database with sacct:

    sacct

With no arguments, this command will show you the list of all jobs run by you on this day.
In the output you may see 3 or more entries such as:

    JobID        JobName          QOS       State      ExitCode Elapsed    NNodes   NodeList
    ------------ ---------------- --------- ---------- -------- ---------- -------- --------------------
    ...
    64281137     sleepy.sh        ef        CANCELLED+ 0:0      00:00:16   1        ab6-202
    64281137.ba+ batch                      CANCELLED  0:15     00:00:17   1        ab6-202
    64281137.ex+ extern                     COMPLETED  0:0      00:00:16   1        ab6-202

The first one corresponds to the job itself. The second one (always named batch) corresponds to the actual job script and the third (named extern) corresponds to the external step used to generate the end of job information. You may have more lines if your job contains more steps, which typically correspond to srun parallel executions.
If you want to list just the entry for the job itself, you can do:
    sacct -X
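
To illustrate the difference between the job entry and its steps, the sketch below separates them by looking at the JobID column: steps carry a suffix after a dot (such as `.batch` or `.extern`), while the job itself does not. The helper `only_allocations` is hypothetical, and the sample text is modelled on the example above in the `|`-delimited form that `sacct -P -o jobid,jobname` would print. It only approximates what `-X` does for you.

```bash
# Hypothetical helper: keep the header line and every line whose JobID
# (first '|'-separated field) has no ".<step>" suffix.
only_allocations() {
    awk -F'|' 'NR == 1 || $1 !~ /\./'
}

# Sample input modelled on the job shown above (untruncated step names).
sample='JobID|JobName
64281137|sleepy.sh
64281137.batch|batch
64281137.extern|extern'

# Prints the header and the single line for the job itself.
printf '%s\n' "$sample" | only_allocations
```

In practice `sacct -X` is the simpler choice; the filter is shown only to make the structure of the output explicit.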
Can you get information about all the jobs you ran today that were cancelled?
Solution: You can filter jobs by state with the -s option. But if you run it naively:

    sacct -X -s CANCELLED

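You will get no output. That is because, when filtering by state, sacct no longer defaults to the whole of the current day (the start time defaults to now), so you must also specify the start and end times of your query period. You can then do something like: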
    sacct -X -s CANCELLED -S $(date +%Y-%m-%d) -E $(date +%Y-%m-%dT%H:%M:%S)
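
The two `date` substitutions in the command above expand to today's date and to the current date and time. As a small sketch, with hypothetical variable names `start` and `end`, the query window can be made explicit before it is passed to sacct:

```bash
# Start of the query window: today's date (a date without a time is taken as midnight).
start=$(date +%Y-%m-%d)
# End of the query window: the current date and time.
end=$(date +%Y-%m-%dT%H:%M:%S)

echo "Querying cancelled jobs from ${start} to ${end}"

# Run the query only where the Slurm client tools are installed.
if command -v sacct >/dev/null 2>&1; then
    sacct -X -s CANCELLED -S "${start}" -E "${end}"
fi
```

Printing the window first is a cheap way to confirm the period being queried if the result is unexpectedly empty.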
The default information shown on the screen when querying past jobs is limited. Can you extract the submit, start, and end times of the jobs you cancelled today? What about their output and error paths? Hint: check the corresponding man page for all the options.
Solution: You can use the following command to see all the possible output fields you can query for:

    sacct -e

While there are dedicated fields for the job submit, start and end times, there is none for the output and error paths. However, the AdminComment field is used to carry that information. Since it is a long field, you may want to append a width to the field name (for example AdminComment%150) to avoid truncation:
    sacct -X -s CANCELLED -S $(date +%Y-%m-%d) -E $(date +%Y-%m-%dT%H:%M:%S) -o jobid,jobname,state,submit,start,end,AdminComment%150
or you can also ask for a parsable output:
    sacct -X -s CANCELLED -S $(date +%Y-%m-%d) -E $(date +%Y-%m-%dT%H:%M:%S) -o jobid,jobname,state,submit,start,end,AdminComment -p
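
The parsable output is `|`-delimited, with a trailing `|` on every line when `-p` is used, which makes it easy to post-process. As a sketch, the hypothetical helper `sacct_column` below prints one column selected by its header name. The sample is invented and simplified for illustration (shaped like the output of `sacct -p -o jobid,jobname,state`); the actual content of fields such as AdminComment depends on the site.

```bash
# Hypothetical helper: print the column whose header is $1
# from '|'-delimited sacct output read on stdin.
sacct_column() {
    awk -F'|' -v name="$1" '
        # Header line: remember the position of the requested column.
        NR == 1 { for (i = 1; i <= NF; i++) if ($i == name) col = i; next }
        # Data lines: print that column if it was found.
        col     { print $col }
    '
}

# Invented, simplified sample with the trailing '|' that -p produces.
sample='JobID|JobName|State|
64281137|sleepy.sh|CANCELLED|'

printf '%s\n' "$sample" | sacct_column State
```

On a real system the same helper could read directly from the sacct command above through a pipe, using the header names exactly as sacct prints them.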
...