Comment: Atos update - remove references to ecgate etc

...

Within this Framework, Member State users can use the Atos HPCF and ECGATE services (HPCF resources are needed to use the HPCF / HPC service). In general, users should minimise the number of systems they use. For example, they should use the ECGATE (ECS) service only if they need to post-process data in a way that is not excessively computationally intensive. Similarly, they should only use the HPCF (HPC) if they need to run computationally intensive work (e.g. a numerical model) and do not need to post-process their output graphically before it is transferred to their Member State. Member State time-critical work may also need to use additional systems outside ECMWF after having done some processing at ECMWF, for example to run other models using data produced by their work at ECMWF. It is not the purpose of this document to provide guidelines on how to run work which does not make use of ECMWF computing systems.

...

Every registered user of ECMWF computing systems is allowed to run work using "option 1" of this Framework and no formal request is required. Note that access to our real-time operational data is restricted. Users interested in running this kind of work should refer to the document entitled "Simple time-critical jobs - ECaccess". See http://software.ecmwf.int/wiki/display/USS/Simple+time-critical+jobs. To run work using "option 2" or "option 3" you will need to submit an official request to the Director of Forecasting Department at ECMWF, signed by the TAC representative of your Member State. Before submitting a request we advise you to discuss the time-critical work you intend to run at ECMWF with your User Support contact point. Your official request will need to provide the following information:

...

As this work will be monitored by ECMWF staff (User Support during the development phase; the operators, once your work is fully implemented), the only practical option is to implement time-critical work using a suite under ecFlow, ECMWF's monitoring and scheduling software package. Given that SMS will gradually be phased out, we ask new developers of Option 2 activities to use ecFlow, see Bologna - New Data Centre. We will therefore only refer to ecFlow in the remaining part of this document. The suite must be developed according to the technical guidelines provided in this document. General documentation, training course material, etc. on ecFlow can be found at ecflow home. No on-call support will be provided by ECMWF staff, but the ECMWF operators can contact the relevant Member State suite support person if this is clearly requested in the suite man pages.

...

In this case, the Member State ecFlow suite will be managed by ECMWF. The suite will usually be based on either a suite previously developed in the framework of "option 2" or on a similar suite already used to run ECMWF operational work. The suite will be run using the ECMWF operational userid and will be managed by staff in the production section at ECMWF. The suite will generally be developed following similar guidelines to "option 2". The main technical differences are that "option 3" will have higher batch scheduling priority than "option 2" work and that the ECPDS system (ECmwf Product Dissemination System) will normally be used to transfer products obtained by "option 3" work. With "option 3", your time-critical work will also benefit from first level on call support from the ECMWF Production Section staff.

...

A specific UID will be created to run a particular suite under "option 2". This UID will be set up as an "application identifier": such UIDs start with a "z", followed by two or three characters. No password will be assigned and access to the UID will be allowed using a strong authentication token (ActivIdentity token). A person responsible will be nominated for every "application identifier" UID. A limited number of other registered users can also be authorised to access this UID and a mechanism to allow such access under strict control will be available. The person associated with the UID and other authorised users have responsibility for all changes made to the files owned by the UID. The UID will be registered with a specific "policy" ("timecrit") which allows access to restricted batch classes and restricted file systems.

2.4  General ecFlow suite guidelines

As mentioned earlier, "option 2" work must be implemented, unless otherwise previously agreed, by developing an ecFlow suite. The ecFlow environment is not set up by default for users on the Atos HPCF and ECGATE systems. Users will have to load the ecFlow environment with a module: module load ecflow.  ECMWF will create a ready-to-go ecFlow server running on an independent Virtual Machine outside the HPCF. See also Using ecFlow for further information about using ecFlow on the Atos HPCF and ECGATE systems.

2.4.1  Port number and ecFlow server

The ecFlow port number for the suite has the format "1000+<UID>", where <UID> is the numeric UID number of the userid used to run the work. The script to start the ecFlow server is available on ecgate and is called "ecflow_start.sh". A second ecFlow server can be started up for backup or development purposes. This second ecFlow server will be started with the '-b' option and will use the port number "500+<UID>". The syntax of the ecflow_start.sh command is:

Usage: /usr/local/apps/ecflow/4.0.6/bin/ecflow_start.sh [-b] [-d ecf_home directory] [-f] [-h]
           -b start ECF for backup server or e-suite
           -d <dir> specify the ECF_HOME directory - default /home/us/usl/ecflow_server
           -f forces the ECF to be restarted
           -v verbose mode
           -h print this help page
           -p <num> specify server port number(ECF_PORT number) - default 1000+<UID> - 500+<UID> for backup server

Note that the port number allocation convention doesn't guarantee that the two numbers associated with your UID are free. A port number may already be used by another user for ecFlow or it may be used by another application. If 'your' default port number is not free, you will have to start the ecFlow server by specifying your own port number, using the option '-p'. Authorised port numbers are between 1024 and 65535. We advise you to choose higher numbers. The ecFlow server will run on ecgate and can be started at system boot time. Please ask User Support at ECMWF if you want us to start your ecFlow server at boot time. A cron job which regularly checks the presence of the ecFlow server process should also be implemented. The above script ecflow_start.sh can also be used to run this check under cron, for example:

5,20,35,50 * * * *  $HOME/cronrun.ksh ecflow_start.sh 1> $HOME/ecFlow_start.out 2>&1

with the script $HOME/cronrun.ksh containing:

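#!/bin/ksh
# Make sure /usr/local/bin is searched first
export PATH=/usr/local/bin:$PATH
# Source the login environment, which cron does not load
. ~/.profile
. ~/.kshrc
module load ecflow
# Run the command given as arguments
"$@"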

Depending on your activity with ecFlow, the ecFlow log file (~/ecflow_server/ecgb.*.log) will grow steadily. We recommend that you install either a cron job or an administration task in your suite to clean these ecFlow log files. This can be achieved with the command ecflow_client: ecflow_client -port

2.4.2  Access to the job output files

We recommend the usage of the simple log server (Perl script) to access the output files of jobs running on the HPCF. This log server requires another port number, which will have the format "35000+<UID>", where <UID> is the numeric uid of the userid used to run the work. The log server will run on the HPCF and can be started after system boot. The script /usr/local/bin/start_logserver should be used to start the log server on the HPCF. The syntax of the command start_logserver is:

Usage: /usr/local/bin/start_logserver [-d <dir>] [-m <map>] [-h]
            -d <dir> specify the directory name where files will be served from - default is $HOME
            -m <map> give mapping between local directory and directory where ecFlow server runs - default is <dir>:<dir>
            -h print this help page

The mapping can consist of a succession of mappings. Each individual mapping will first give the directory name on the ecFlow server, followed by the directory name on the HPC system, like in the following example:

-m <dir_ecgate>:<dir1_hpc>:<dir_ecgate>:<dir2_hpc>

We recommend that you implement a cron job or define an administration task in your suite to check the presence of the log server process. The above script /usr/local/bin/start_logserver can be used for this purpose. Note that the job output files of running jobs on HPC are kept on a local spool, which is not visible from the interactive nodes (cca and ccb). In order to see the job output files of running jobs, you will therefore need to start the logserver on cca-log and ccb-log. See further for more details.

2.4.3  Managing ecFlow tasks

EcFlow will manage your jobs. Three main actions on the ecFlow tasks are required: one to submit, one to kill and one to check a task. These three actions are respectively defined through the ecFlow variables ECF_JOB_CMD, ECF_KILL_CMD and ECF_STATUS_CMD. You can use any script to take these actions on your tasks. We recommend that you use the commands provided by ECMWF with the schedule module, which is available on ecgate. To activate the module, run: module load schedule. The command called 'schedule' can then be used to submit, check or kill a task:

Usage: /usr/local/apps/schedule/1.4/bin/schedule <user> <host> [<requestid>] <jobfile> <joboutput> [kill|status]
 Command used to schedule some tasks to sms or ecflow
        <user>:         %USER%
        <host>:         %REMOTE_HOST%, %SCHOST%, %WSHOST%
        <requestid>:    %ECF_RID% or %SMSRID% (only needed for [kill|status])
        <jobfile>:      %ECF_JOB% or %SMSJOB%
        <joboutput>:    %ECF_JOBOUT% or %SMSJOBOUT%

...

The ecFlow port number will be the default 3141 and the host will be the Virtual Machine, which will have a hostname of the form ecflow-tc2-<UID>-001. The ecFlow server will run on the Virtual Machine and will be started at system boot time. You should not need to SSH into the server unless there is a problem. If the server dies for some reason, it should be restarted automatically, but if it is not, you may restart it manually with:

$ ssh $ECF_HOST sudo systemctl restart ecflow-server
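
The manual restart above could be wrapped in a small check, for instance one run from cron. The following is a minimal sketch, not an ECMWF-provided script: the function name ensure_ecflow_server and the overridable ECFLOW_CLIENT and SSH variables are illustrative assumptions, and it assumes that ECF_HOST and ECF_PORT are set in the environment so that "ecflow_client --ping" reaches your server.

```shell
#!/bin/sh
# Sketch only: ping the ecFlow server and restart it if it does not answer.
# ECFLOW_CLIENT and SSH are hypothetical overrides so the logic can be tried safely.
ECFLOW_CLIENT=${ECFLOW_CLIENT:-ecflow_client}
SSH=${SSH:-ssh}

ensure_ecflow_server() {
    # ecflow_client reads ECF_HOST and ECF_PORT from the environment.
    if $ECFLOW_CLIENT --ping >/dev/null 2>&1; then
        return 0    # server answered, nothing to do
    fi
    # Same restart command as in the text, run on the server's Virtual Machine.
    $SSH "$ECF_HOST" sudo systemctl restart ecflow-server
}
```

Setting SSH=echo prints the restart command instead of running it, which is a cautious way to try the check first.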

Depending on your activity with ecFlow, the ecFlow log file (/home/$USER/ecflow_server/ecflow-tc2-$USER-001.log) will grow steadily. We recommend that you install either a cron job or an administration task in your suite to clean these ecFlow log files. This can be achieved with the ecflow_client command:

$ ecflow_client --help log  

log
---

Get,clear,flush or create a new log file.
The user must ensure that a valid path is specified.
Specifying '--log=get' with a large number of lines from the server,
can consume a lot of **memory**. The log file can be a very large file,
hence we use a default of 100 lines, optionally the number of lines can be specified.
 arg1 = [ get | clear | flush | new | path ]
  get -   Outputs the log file to standard out.
          defaults to return the last 100 lines
          The second argument can specify how many lines to return
  clear - Clear the log file of its contents.
  flush - Flush and close the log file. (only temporary) next time
          server writes to log, it will be opened again. Hence it best
          to halt the server first
  new -   Flush and close the existing log file, and start using the
          the path defined for ECF_LOG. By changing this variable
          a new log file path can be used
          Alternatively an explicit path can also be provided
          in which case ECF_LOG is also updated
  path -  Returns the path name to the existing log file
 arg2 = [ new_path | optional last n lines ]
         if get specified can specify lines to get. Value must be convertible to an integer
         Otherwise if arg1 is 'new' then the second argument must be a path
Usage:
  --log=get                        # Write the last 100 lines of the log file to standard out
  --log=get 200                    # Write the last 200 lines of the log file to standard out
  --log=clear                      # Clear the log file. The log is now empty
  --log=flush                      # Flush and close log file, next request will re-open log file
  --log=new /path/to/new/log/file  # Close and flush log file, and create a new log file, updates ECF_LOG
  --log=new                        # Close and flush log file, and create a new log file using ECF_LOG variable


The client reads in the following environment variables. These are read by user and child command

|----------|----------|------------|-------------------------------------------------------------------|
| Name     |  Type    | Required   | Description                                                       |
|----------|----------|------------|-------------------------------------------------------------------|
| ECF_HOST | <string> | Mandatory* | The host name of the main server. defaults to 'localhost'         |
| ECF_PORT |  <int>   | Mandatory* | The TCP/IP port to call on the server. Must be unique to a server |
| ECF_SSL  |  <any>   | Optional*  | Enable encrypted comms with SSL enabled server.                   |
|----------|----------|------------|-------------------------------------------------------------------|

* The host and port must be specified in order for the client to communicate with the server, this can 
  be done by setting ECF_HOST, ECF_PORT or by specifying --host=<host> --port=<int> on the command line

For example, to empty the log file, use:

ecflow_client --port=%ECF_PORT% --host=%ECF_HOST% --log=clear
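
Outside an ecFlow task the %ECF_PORT% and %ECF_HOST% variables are not substituted, so a cron script would rely on the ECF_HOST and ECF_PORT environment variables instead. The sketch below, with the hypothetical function name clean_ecflow_log, saves the last lines of the log before clearing it. It uses only the "--log=get" and "--log=clear" forms shown in the help text above; the archive location is an assumption to adapt.

```shell
#!/bin/sh
# Sketch only: keep the tail of the ecFlow log, then clear it.
# ECFLOW_CLIENT is a hypothetical override (e.g. "echo") for a dry run.
ECFLOW_CLIENT=${ECFLOW_CLIENT:-ecflow_client}

clean_ecflow_log() {
    keep_lines=${1:-200}       # number of trailing log lines to save
    archive=${2:-/dev/null}    # file the saved lines are appended to
    # Save the last lines first; stop if the server cannot be reached.
    $ECFLOW_CLIENT --log=get "$keep_lines" >> "$archive" || return 1
    # Then empty the log file on the server.
    $ECFLOW_CLIENT --log=clear
}
```

For example, clean_ecflow_log 500 $HOME/ecflow_log.tail would keep the last 500 lines in a file of your choice before clearing the log.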

For information about using crontabs on the Atos HPCF and ECGATE service, please see Cron service. Note in particular that your crontab should be installed on either ecs-cron or hpc-cron.

2.4.2  Access to the job output files

We recommend that the job output files are stored on the Lustre $TCWORK file system.  As this file system cannot be directly accessed from the ecFlow Virtual Machine, we recommend the usage of the simple log server (Perl script) to access the output files of jobs running on the HPCF. This log server requires another port number, which will have the format "35000+<UID>", where <UID> is the numeric uid of the userid used to run the work. The log server will run on the hpc-log node of the Atos HPCF. The ecflow_logserver.sh command should be used to start the log server on the HPCF. The syntax of the command ecflow_logserver.sh is:

$ ecflow_logserver.sh -h
Usage: /usr/bin/ecflow_logserver.sh [-d <dir>] [-m <map>] [-l <logfile>] [-h]
       -d <dir>     specify the directory name where files will be served
                    from - default is $HOME
       -m <map>     gives mapping between local directory and directory
                    where ecflow server runs - default is <dir>:<dir>
       -l <logfile> logserver log file - default is $SCRATCH/log/logfile
       -h           print this help page
Example:
       start_logserver.sh -d %ECF_OUT% -m %ECF_HOME%:%ECF_OUT% -l logserver.log

The mapping can consist of a succession of mappings. Each individual mapping will first give the directory name on the ecFlow server, followed by the directory name on the HPC system, like in the following example:

-m <dir_ecflow_vm>:<dir1_hpc>:<dir_ecflow_vm>:<dir2_hpc>
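
When several directories are mapped, the value of -m can be assembled from pairs rather than typed by hand. This is a minimal sketch; the function name build_logserver_map is hypothetical and the directory names in the usage note are placeholders.

```shell
#!/bin/sh
# Sketch only: join <dir on ecFlow VM> <dir on HPC> pairs into one -m value.
build_logserver_map() {
    # An even, non-zero number of arguments is required (pairs of directories).
    if [ $# -eq 0 ] || [ $(( $# % 2 )) -ne 0 ]; then
        echo "usage: build_logserver_map <vm_dir> <hpc_dir> [...]" >&2
        return 2
    fi
    map=""
    while [ $# -gt 0 ]; do
        # Append "vm_dir:hpc_dir", separated from earlier pairs by ":".
        map="${map:+$map:}$1:$2"
        shift 2
    done
    printf '%s\n' "$map"
}
```

The result could then be passed on, for example as -m "$(build_logserver_map <dir_ecflow_vm> <dir1_hpc> <dir_ecflow_vm> <dir2_hpc>)".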

We recommend that you implement a cron job or define an administration task in your suite to check the presence of the log server process. The above script ecflow_logserver.sh can be used for this purpose. The logserver should be started on hpc-log.

2.4.3  Managing ecFlow tasks

EcFlow will manage your jobs. Three main actions on the ecFlow tasks are required: one to submit, one to kill and one to check a task. These three actions are respectively defined through the ecFlow variables ECF_JOB_CMD, ECF_KILL_CMD and ECF_STATUS_CMD. You can use any script to take these actions on your tasks. We recommend that you use the troika tool provided by ECMWF. The "troika" command is installed on your ecFlow Virtual Machine and can be used to submit, check or kill a task:

$ troika -h
usage: troika [-h] [-V] [-v] [-q] [-l LOGFILE] [-c CONFIG] [-n] action ...

Submit, monitor and kill jobs on remote systems

positional arguments:
  action                perform this action, see `troika <action> --help` for details
    submit              submit a new job
    monitor             monitor a submitted job
    kill                kill a submitted job
    check-connection    check whether the connection works
    list-sites          list available sites

optional arguments:
  -h, --help            show this help message and exit
  -V, --version         show program's version number and exit
  -v, --verbose         increase verbosity level (can be repeated)
  -q, --quiet           decrease verbosity level (can be repeated)
  -l LOGFILE, --logfile LOGFILE
                        save log output to this file
  -c CONFIG, --config CONFIG
                        path to the configuration file
  -n, --dryrun          if true, do not execute, just report

environment variables:
  TROIKA_CONFIG_FILE    path to the default configuration file

We recommend you set your ecFlow variables to use troika as follows:

ECF_JOB_CMD="troika -vv submit -u %USER% -o %ECF_JOBOUT% %HOST% %ECF_JOB%"

ECF_KILL_CMD="troika kill -u %USER% %HOST% %ECF_JOB%"

ECF_STATUS_CMD="troika monitor -u %USER% %HOST% %ECF_JOB%"
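
Before wiring these commands into a suite, it may help to run them by hand with the dry-run option (-n) listed in the troika help above. The sketch below is only an illustration: the function name troika_dryrun and the overridable TROIKA variable are hypothetical, and the argument order simply mirrors the ecFlow variable settings shown above.

```shell
#!/bin/sh
# Sketch only: run the troika commands by hand in dry-run mode (-n).
# TROIKA is a hypothetical override, e.g. TROIKA="echo troika" to print only.
TROIKA=${TROIKA:-troika}

troika_dryrun() {
    action=$1
    user=$2
    host=$3
    job=$4
    out=${5:-}    # job output file, only used by "submit"
    case $action in
        submit)       $TROIKA -n submit -u "$user" -o "$out" "$host" "$job" ;;
        kill|monitor) $TROIKA -n "$action" -u "$user" "$host" "$job" ;;
        *) echo "unknown action: $action" >&2; return 2 ;;
    esac
}
```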

2.4.4 EcFlow access protection

...

  1. The suite should easily run in a different configuration. It is therefore vital to allow for easy changes of configuration. Possible changes could include:
    1. Running on a different HPCF system.
    2. Running the main task on fewer or more CPUs, with fewer or more threads (if relevant).
    3. Using a different file system.
    4. Using a different data set, e.g. ECMWF e-suite or own e-suite.
    5. Using a different ”model” version.
    6. Using a different EcFlow server (while only ecgate is available to you, this is not relevant).
    7. Using a different UID and different queues, e.g. for testing and development purposes.

      The worst that could happen is that you lose everything and need to restart from scratch. Although this is very unlikely, you should keep safe copies of your libraries, executables and other constant data files. To achieve flexibility in the configuration of your suite, we recommend that you have one core suite and define ecFlow variables for all those changes of configuration you want to cater for. See variable definitions in suite definition file ~usx/time_critical/sample_suite.def.

  2. It is also important to clearly document the procedures for any changes to the configuration, if these may need to be run by, for example, the operators at ECMWF.

  3. All tasks that are part of the critical path, i.e. that will produce the final ”products” to be used by you, have to run in the safest environment:

    1. If possible, your time-critical tasks should run on the HPCF system. If this is impossible and your task runs on ecgate, be aware that this may block your time-critical activity, as currently there is no backup for this system.
    2. Your time-critical tasks should not use the Data Handling System (DHS), including ECFS and MARS. The data should be available online, on the HPCF, either in a private file system or in the MARS Fields Data Base (FDB). If some data must be stored in MARS or ECFS, do not make time-critical tasks dependent on these archive tasks, but keep them independent. See the sample ecFlow definition in ~usx/time_critical/sample_suite.def.
    3. Do not use cross-mounted file systems. Always use local file systems.
    4. To exchange data between remote systems, we recommend the use of rsync.

  4. The manual pages should include specific and clear instructions for the operators at ECMWF. An example man page is available from ~usx/time_critical/suite/man_page. Man pages should include the following information:
    1. A description of the task.
    2. The dependencies on other tasks.
    3. What to do in case of failure.
    4. Whom to contact in case of failure, how and when.

  5. The ecFlow functionality of ”late tasks” is useful to draw the ECMWF operators’ attention to possible problems in the running of your suite. Try to set the functionality for a few key tasks only, with appropriately selected warning thresholds. If the functionality is used too frequently or if an alarm is triggered every day, it is likely that no one will pay attention to it.

  6. The suite should be self-cleaning. Disk management should be very strict and is your responsibility. All data no longer needed should be removed. The ecFlow jobs and job output files, if kept, should be stored (in ECFS), then removed from local disks.

  7. Your suite definition will loop over many dates, e.g. to cover one year. Depending on the relation between your suite and the operational activity at ECMWF, you will trigger (start) your suite in one of the following ways:
    1. If your suite depends on the ECMWF operational suite, you will set up a time-critical job under ECaccess (see option 1) which will simply set a first dummy task in your suite to complete. Alternatively, you could resume the suite, which would be reset to ”suspended” after completing a cycle. See sample job in ~usx/time_critical/suite/trigger_suite.cmd.
    2. If your suite has no dependencies on the ECMWF operational activity, we suggest that you define a time in your suite definition file at which to start the first task in your suite.
    3. If your suite has no dependencies on the ECMWF operational activity, but has dependencies on external events, we suggest that you also define a time when to start the first task in your suite, and that you check for your external dependency in this first task.
    4. The cycling from day to day will usually happen by defining a time when the last task in the suite will run. This last task should run sufficiently long in advance before the next run will start. Setting up this time will allow you to watch the previous run of the suite up until the last task has run. See the sample suite definition in ~usx/time_critical/sample_suite.def.
      Note that if one task of your suite remains in aborted status, this will NOT prevent the last task from running at the given time, but your suite will not be able to cycle through to the next run, e.g. for the next day. Different options are available to you to overcome this problem. If the task that failed is not in the critical path, you can give instructions to the operators to set the aborted task to complete. Another option would be to build an administrative task that checks before each run that all tasks are set to complete, and therefore forces your suite to cycle through to the next run.
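
The self-cleaning recommendation in the list above could be implemented by a small administration task. The following is a minimal sketch under stated assumptions: the function name purge_old_files is hypothetical, the retention period is yours to choose, and any files that must be kept are assumed to have been stored (for example in ECFS) by a separate, non-critical task beforehand.

```shell
#!/bin/sh
# Sketch only: delete regular files older than <days> days below <dir>.
# Archiving of files that must be kept is assumed to happen elsewhere.
purge_old_files() {
    dir=$1
    days=$2
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    # -mtime +N matches files last modified more than N days ago.
    find "$dir" -type f -mtime +"$days" -exec rm -f {} +
}
```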

One key point in the successful communication between the jobs running on the HPCF systems and your ecFlow server is the error handling. We recommend the use of a trap, as illustrated in the sample suite in ~usx/time_critical/include/head.h. The shell script run by your batch job should also use the ”set -ue” options.
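
To illustrate the idea of a trap combined with ”set -ue”, here is a minimal sketch. It is not the content of the sample head.h: the function name run_task and the NOTIFY variable are hypothetical, and in a real job the ecflow_client child commands also need the environment that ecFlow passes to the job (for example ECF_NAME and ECF_PASS).

```shell
#!/bin/sh
# Sketch only: report task start, success or failure to the ecFlow server.
# NOTIFY is a hypothetical override; in a real job it would be ecflow_client.
NOTIFY=${NOTIFY:-ecflow_client}

run_task() {
    (
        set -ue                      # stop on errors and on unset variables
        on_exit() {
            rc=$?                    # exit status that triggered the trap
            trap - EXIT              # avoid re-entering the handler
            if [ "$rc" -ne 0 ]; then
                $NOTIFY --abort="exit status $rc"
            else
                $NOTIFY --complete
            fi
            exit "$rc"
        }
        trap on_exit EXIT
        trap 'exit 1' HUP INT TERM   # signals also end up in on_exit
        $NOTIFY --init=$$            # tell the server the task has started
        "$@"                         # the actual work of the task
    )
}
```

Because of ”set -e”, the first failing command ends the subshell and the handler reports the abort, so the server does not keep showing the task as active.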

...

A sample suite illustrating the previous recommendation is available in  ~usx/time_critical/sample_suite.def.


2.5  File systems

...

File systems have been set up on the HPCF for the UID which will be used to run the time-critical applications: they are called /ec/ws1 and /ec/ws2 on the Atos HPC system. These file systems are quota controlled and therefore you will need to provide User Support with an estimate of the total size and number of files which you need to keep on this file system.

...

If there is a need for a file system with different characteristics (e.g. to hold safely on line files for several days), these requirements can be discussed with User Support and a file system with the required functionalities can be made available.


2.6  Batch job classes / queues

A specific batch job queue has been set up on ecgate: it is called ”timecrit” and access is restricted to the UIDs authorised to run ”option 2” work only. This is the class/queue you should use to run any time-critical work on ecgate. If there are any non time-critical tasks in your suite (e.g. archiving tasks), these can use the other classes/queues normally available to users.

Specific batch job queues have been set up on the HPCF clusters with access restricted to the UIDs authorised to run ”option 2” work only. They are called ”ts”, ”tf” and ”tp”, respectively for sequential work, fractional work (work using less than half of one node) and parallel work (work using more than 1 node). These are the queues you should use to run any time-critical work on the Atos HPCF. If there are any non time-critical tasks in your suite (e.g. archiving tasks), these can use the other queues normally available to users. Archiving tasks should always use the nf queue and not be included as part of your parallel work.

When you develop or test a new version of your time-critical suite, we advise you to use the standard classes or queues available to all users. In this way, your time-critical activity will not be delayed by this testing or development.

...

  1. Your work requires input data which is produced by any of the ECMWF models. In such a case it is possible to set up a specific dissemination stream which will send the required data to either ecgate or the HPCF depending on the requirements of your suite. ECPDS has also been enhanced to allow for the dissemination to a specific User ID (the UIDs used to run time-critical work) so that only this recipient User ID can see the data. With this enhanced system, the recipient User ID will also become responsible for the regular clean-up of the received data. This will make the ”local” dissemination option similar to the standard dissemination to remote sites. This is the option we recommend.

    If produced by ECMWF, your required data will also be available in the FDB as soon as the relevant model has produced them and will remain online for a limited (variable depending on the model) amount of time. You can access these data using the usual ”mars” command. If your suite requires access to data which may no longer be contained in the FDB (e.g. EPS model level data from previous EPS runs) then your suite needs to access these data before they are removed from the FDB and temporarily store them in one of your disk storage areas.

    For no reason should any of your time-critical suite tasks depend on data only available from the Data Handling System (MARS archive or ECFS). Beware that the usage of the parameter ALL in any mars request will automatically redirect it to the MARS archive (DHS). Note also that we recommend you do not use abbreviations for a verb, parameter or value in your mars requests. If too short, these abbreviations may become ambiguous if a new verb, parameter or value name is added to the mars language.

  2. Your work requires input data which is available at ECMWF but not produced by an ECMWF model. For example, your work requires observations normally available on the GTS, e.g. if you are interested in running some assimilation work at ECMWF. In such a case you can obtain the required observations from /volec/msbackup/ on the Atos HPCF where they are stored by a regular extraction task running as part of the ECMWF operational suite. For any other data you may need for your time-critical activity and which is available at ECMWF, please contact User Support.

  3. Your work requires input data which is neither produced by any of the ECMWF models nor available at ECMWF. You will then be responsible for setting up the required ”acquisition” tasks and establishing their level of time criticality. For example, your suite may need some additional observations which improve the quality of your assimilation but your work can also run without them in case there is a delay/problem in their arrival at ECMWF. Please see the section ”Data transfers” for advice on how to transfer incoming data.

...

2.8.3 Transferring data between systems at ECMWF

The ECGATE (ECS) and the HPCF (HPC) systems share the same file systems so there should be no need to transfer data between them. If you need to transfer data to other systems at ECMWF then we recommend that you use rsync. We remind you not to use the DHS (MARS or ECFS) for any tasks in the critical path.

...

The UIDs authorised to run ”option 2” work have access to all Atos HPCF complexes and are advised to implement their suite so that it is ready to run on the complex they do not normally use if the primary one is unavailable for an extended period of time.

The two separate HPCF environments (currently only the /ec/ws1 and /ec/ws2 file systems) should be kept regularly synchronised using utilities such as ”rsync”.
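
A regular synchronisation could be run from an administration task or from cron. The sketch below is one possible way to call rsync. The function name sync_tree and the overridable RSYNC variable are hypothetical, and the choice of "-a --delete" (mirror the source exactly, removing files that no longer exist there) is an assumption that you should check against your own needs.

```shell
#!/bin/sh
# Sketch only: mirror one directory tree onto another with rsync.
# RSYNC is a hypothetical override, e.g. RSYNC=echo to print the call only.
RSYNC=${RSYNC:-rsync}

sync_tree() {
    src=${1:-}
    dst=${2:-}
    if [ -z "$src" ] || [ -z "$dst" ]; then
        echo "usage: sync_tree <source_dir> <destination_dir>" >&2
        return 2
    fi
    # Strip one trailing slash, then add it back so that the contents of
    # the source directory are copied into the destination directory.
    src=${src%/}
    dst=${dst%/}
    $RSYNC -a --delete "$src/" "$dst/"
}
```

Because --delete removes files at the destination, trying the call first with RSYNC=echo is a cautious way to check the arguments.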

...

Once your option 2 suite is declared to be running in time-critical mode, we recommend that you no longer touch this suite for new developments. We recommend that you define a similar suite in parallel to the time-critical one and that you first test any changes under this suite. When you make important changes to your suite, we recommend that you inform ECMWF via the Support Portal.

At ECMWF, we will set up the appropriate information channels to keep you aware of changes that may affect your time-critical activity. The most appropriate tool is a mailing list.

...