...


This document describes the service that allows users to automatically submit jobs to be run when certain points in the daily ECMWF operational forecast suites have been reached. The main purpose is to ensure that certain data is available before, for example, submitting a MARS request. This facility runs within the ECaccess environment. It is available either through the Web interface of ECaccess or with the ECaccess Web Toolkit, which is available on the Atos HPC or can be installed locally. This service is monitored by the operators at ECMWF.

...

The jobs to be attached to the ECMWF operational suite will have to be submitted through ECaccess. Batch job submission is available from the ECaccess web interface or through the ECaccess Web Toolkit. We will first look at the Web interface, then at the Web Toolkit.

...

Figure 1: Job submission - Upper part

The part to include the job script has not changed. You can either type in your job script in the script window provided, copy and paste it from another window or upload it from a local file. One important addition to make to your jobs is to add the "set -e" command or, alternatively, to manage the errors in your jobs and exit accordingly - see Job status, below, for more details.

...

The lower part of the job submission window (see Figure 2) - called subscription - allows you to attach your job to the different events available to you. Simply tick the boxes corresponding to the event(s) when you want to run your job.

...

No Format
set -e
mkdir $SCRATCHDIR/data

Of course, in this particular example, you could also use the mkdir "-p" option:

Code Block
mkdir -p $SCRATCH/data 

Please note that you can submit your job to ECaccess without setting up what is suggested above. Your jobs will run normally but, without this job control, the ECMWF operators will not notice any errors with your jobs and ECaccess will fail to resubmit your jobs, even if you requested some retries.
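As an alternative to "set -e", errors can be managed explicitly so that the job exits with the status of the step that failed. The following is only a minimal sketch: the helper name run_step and the fallback to a temporary directory when SCRATCH is not set are assumptions made for illustration, not part of ECaccess.

```shell
#!/bin/bash
# Hypothetical helper: run one command and, if it fails, report it
# and exit the job with the same non-zero status.
run_step() {
    "$@" || {
        status=$?
        echo "step failed (status $status): $*" >&2
        exit "$status"
    }
}

# Assumption: fall back to a temporary directory when SCRATCH is not set.
WORKDIR="${SCRATCH:-${TMPDIR:-/tmp}}/data"

run_step mkdir -p "$WORKDIR"
run_step test -d "$WORKDIR"
echo "all steps completed"
```

Because the job then ends with a non-zero exit status whenever a step fails, the failure remains visible outside the job in the same way as with "set -e".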

...

will retry your job on failure 3 times with 15 minutes (900 seconds) between retries. This will sometimes allow the job to complete successfully if the initial failure was caused by a temporary issue.
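To make the retry settings concrete, the sketch below converts the 15 minute interval into seconds and shows, as a local illustration only, what "retry 3 times with a fixed delay" means. The retry function is hypothetical; it is not how ECaccess implements retries, and it assumes that the first run is not counted as a retry.

```shell
#!/bin/bash
# Hypothetical illustration: rerun a command up to max_retries times,
# waiting delay seconds between attempts.
retry() {
    max_retries=$1
    delay=$2
    shift 2
    attempt=0
    until "$@"; do
        if [ "$attempt" -ge "$max_retries" ]; then
            echo "giving up after $attempt retries" >&2
            return 1
        fi
        attempt=$((attempt + 1))
        echo "retry $attempt of $max_retries in $delay seconds" >&2
        sleep "$delay"
    done
}

# 15 minutes expressed in seconds.
RETRY_DELAY=$((15 * 60))
echo "$RETRY_DELAY"
```

With these definitions, a call such as retry 3 "$RETRY_DELAY" followed by a command of your own would run that command once and then up to three more times, 900 seconds apart.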

...

If you have to make some changes to any of your ECaccess Time Critical jobs, you will have to cancel the existing job in standby mode and submit the new version of the job. The job name shown with "ecaccess-job-list" or through the web interface should help you to identify the correct job to delete. Similarly, to remove a job from the system, you will have to remove the job in standby (STDBY) mode.
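With the Web Toolkit, finding the job in standby mode could look like the sketch below. It assumes that the status appears as a separate word (STDBY) on each line printed by "ecaccess-job-list"; the exact column layout is not described here, so the filter deliberately does not depend on it. The function name standby_jobs is hypothetical.

```shell
#!/bin/bash
# Keep only the lines that mention the STDBY status.
# Assumption: the status is printed as a separate word on each job's line.
standby_jobs() {
    grep -w 'STDBY'
}

if command -v ecaccess-job-list >/dev/null 2>&1; then
    ecaccess-job-list | standby_jobs || echo "no job in standby mode"
else
    echo "ecaccess-job-list not found; the ECaccess Web Toolkit is required" >&2
fi
```

Once the correct job has been identified by its name, it can be removed with the toolkit's "ecaccess-job-delete" command, followed by its job identifier, before the new version is submitted.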

Job examples

An example of a Time-critical batch job for submission on the Atos to create various types of ENS meteogram plots is provided by realtime_metgram.sh.

...