You are viewing an old version of this page. View the current version.

Compare with Current View Page History

« Previous Version 2 Next »

Reference Documentation

Let's explore how to use the Slurm Batch System to the ATOS HPCF or ECS.

Basic Job submission

Access the default login node of the ATOS HPCF or ECS.

  1. Create a directory for this tutorial so all the exercises and outputs are contained inside:

    mkdir ~/batch_tutorial
    cd ~/batch_tutorial
  2. Create and submit a job called simplest.sh with just default settings that runs the command hostname. Can you find the output and inspect it? Where did your job run?

    Using your favourite editor, create a file called simplest.sh with the following content

    simplest.sh
    #!/bin/bash
    hostname

    You can submit it with:

    sbatch simplest.sh

    The job should be run shortly. When finished, a new file called slurm-<jobid>.out should appear in the same directory. You can check the output with:

    $ cat $(ls -r1 slurm-*.out | head -n1)
    ab6-202.bullx
    [ECMWF-INFO -ecepilog] ----------------------------------------------------------------------------------------------------
    [ECMWF-INFO -ecepilog] This is the ECMWF job Epilogue
    [ECMWF-INFO -ecepilog] +++ Please report issues using the Support portal +++
    [ECMWF-INFO -ecepilog] +++ https://support.ecmwf.int                     +++
    [ECMWF-INFO -ecepilog] ----------------------------------------------------------------------------------------------------
    [ECMWF-INFO -ecepilog] Run at 2023-10-25T11:31:53 on ecs
    [ECMWF-INFO -ecepilog] JobName                   : simplest.sh
    [ECMWF-INFO -ecepilog] JobID                     : 64273363
    [ECMWF-INFO -ecepilog] Submit                    : 2023-10-25T11:31:36
    [ECMWF-INFO -ecepilog] Start                     : 2023-10-25T11:31:51
    [ECMWF-INFO -ecepilog] End                       : 2023-10-25T11:31:53
    [ECMWF-INFO -ecepilog] QueuedTime                : 15.0
    [ECMWF-INFO -ecepilog] ElapsedRaw                : 2
    [ECMWF-INFO -ecepilog] ExitCode                  : 0:0
    [ECMWF-INFO -ecepilog] DerivedExitCode           : 0:0
    [ECMWF-INFO -ecepilog] State                     : COMPLETED
    [ECMWF-INFO -ecepilog] Account                   : myaccount
    [ECMWF-INFO -ecepilog] QOS                       : ef
    [ECMWF-INFO -ecepilog] User                      : user
    [ECMWF-INFO -ecepilog] StdOut                    : /etc/ecmwf/nfs/dh1_home_a/user/slurm-64273363.out
    [ECMWF-INFO -ecepilog] StdErr                    : /etc/ecmwf/nfs/dh1_home_a/user/slurm-64273363.out
    [ECMWF-INFO -ecepilog] NNodes                    : 1
    [ECMWF-INFO -ecepilog] NCPUS                     : 2
    [ECMWF-INFO -ecepilog] SBU                       : 0.011
    [ECMWF-INFO -ecepilog] ----------------------------------------------------------------------------------------------------
  3. Configure your simplest.sh job to direct the output to simplest-<jobid>.out, the error to simplest-<jobid>.err both in the same directory, and the job name to just "simplest". Note you will need to use a special placeholder for the -<jobid>.

    Using your favourite editor, open the simplest.sh job script and add the relevant #SBATCH directives:

    simplest.sh
    #!/bin/bash
    #SBATCH --job-name=simplest
    #SBATCH --output=simplest-%j.out
    #SBATCH --output=simplest-%j.err
    hostname

    You can submit it again with:

    sbatch simplest.sh

    After a few moments, you should see the new files appear in your directory (job id will be different than the one displayed here):

    $ ls simplest-*.*
    simplest-64274497.err  simplest-64274497.out
  • No labels