
If you wish to use ecFlow to run your workloads, ECMWF will provide you with a ready-to-go ecFlow server running on an independent virtual machine outside the HPCF. These servers take care of orchestrating your workflow, while all tasks in your suites are submitted to and run on the HPCF. Because each machine is dedicated to a single ecFlow server, there are no CPU time restrictions and no possibility of interference from other users.

Info: Housekeeping

Please avoid running the ecFlow servers yourself on HPCF nodes. If you still have one, please get in touch with us through the ECMWF support portal to discuss your options.

We may also remove any server that has been inactive for more than 6 months. You will need to request a new one if you wish to use ecFlow again afterwards.




Info: prepIFS and ecFlow

If you are running prepIFS experiments, you do not need to set up a personal ecFlow server. Those experiments will appear on preconfigured, dedicated servers for IFS experiments. Please refer to Migrating from Reading to Bologna for IFS users for more information.


Getting started

If you don't have a server yet, please raise an issue through the ECMWF support portal requesting one.

...

If you decide to store job standard output and error on a filesystem mounted only on the HPCF (such as SCRATCH or HPCPERM), an ecFlow UI running outside the HPCF, for example on your VDI, will not be able to access the output of those jobs out of the box. In that case you need to start a log server on the Atos HPCF so your client can access those outputs. The log server must run on the hpc-log node, and if you need a crontab to make sure it is running, you should place it on hpc-cron:

  1. Create a file 

Trapping of errors

It is crucial that ecFlow knows when a task has failed so it can accurately report the state of all the tasks in your suites. This is why you need to make sure error trapping is done properly. It is typically handled in one of your ecFlow headers, an example of which is available in the ECMWF git repository.
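As an illustration of the idea, here is a minimal sketch of trap-based error reporting in shell. It is not the header from the ECMWF repository. The function names run_task, on_error and report_to_ecflow are hypothetical, and the sketch assumes that a real job would call the ecflow_client child commands (--init, --abort, --complete) with the usual ECF_* variables exported. Outside an ecFlow job it only prints what it would send.

```shell
#!/usr/bin/env bash
# Sketch only: trap-based error reporting for an ecFlow job.
# report_to_ecflow, on_error and run_task are hypothetical names.

report_to_ecflow() {
    # Assumption: inside a real job ecflow_client is on PATH and ECF_NAME is set.
    if command -v ecflow_client >/dev/null 2>&1 && [ -n "${ECF_NAME:-}" ]; then
        ecflow_client "$@"
    else
        echo "ecflow_client $*"      # dry run outside an ecFlow job
    fi
}

run_task() {
    # Run the task body "$@" in a subshell so traps do not leak to the caller.
    (
        set -eu                      # stop on errors and on unset variables

        on_error() {
            set +e                   # do not abort while reporting the failure
            wait                     # wait for any background processes
            report_to_ecflow --abort=trap
            trap - EXIT              # disarm the exit trap before leaving
            exit 1
        }

        trap on_error EXIT           # any early exit counts as a failure
        trap 'echo "Killed by a signal"; on_error' HUP INT QUIT TERM

        report_to_ecflow --init=$$   # tell the server the job has started

        # Run the task body. The explicit "|| exit" keeps this sketch reliable
        # even if the caller invokes run_task where errexit is suppressed
        # (for example inside an "if" condition).
        "$@" || exit "$?"

        trap - EXIT                  # success: disarm the failure trap
        report_to_ecflow --complete
    )
}
```

For example, `run_task false` reports an abort and returns a non-zero status, while `run_task true` reports completion. In a real header the task body would normally be the rest of the job script rather than a command passed as arguments.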

...

Of course, you may change the queue to np if you are running bigger parallel jobs, or SCHOST to run on complexes other than aa. This can be specified in the configuration file, such as this one:

Bitbucket file: troika.yml (repository ecflow_include, project USS, branch master)
 

To use a custom troika executable or personal configuration file with Troika, ecFlow variables should be defined like this:

...