
Task wrapper files normally need few changes if the task designer has followed the KISS principle and focused on the functional aspect of the task.

  • In some situations, it may be enough to define the SMS variables as references to ecFlow variables on the relevant node (top server node, suite node, or family node)
    • a variable may refer to another
      vars="SMSRID SMSTRYNO SMSNAME SMSSCRIPT SMSJOB SMSJOBOUT SMSDATE SMSTIME SMSCLOCK SMSKILLCMD SMSURLCMD SMSURLBASE SMSURL SMSPASS SMSNODE SMSCMD SMSKILL SMSCHECK SMSCHECKOLD SMSSTATUSCMD SMSCHECKCMD SMSOUT SMSTRIES"
      for var in $vars; do
      case $var in
      # names that do not follow the plain SMS -> ECF_ prefix rule
      SMSCMD) ecf=ECF_JOB_CMD;;
      SMSKILL*) ecf=ECF_KILL_CMD;;
      SMSSTATUS*) ecf=ECF_STATUS_CMD;;
      SMSCHECKCMD) ecf=ECF_CHECK_CMD;;
      SMSURL*CMD*) ecf=ECF_URL_CMD;;
      # default: replace the SMS prefix with ECF_
      *) ecf=$(echo $var | sed -e 's:SMS:ECF_:');;
      esac
      node=/ # node=path_to_suite_or_family
      add=add # add=change
      ecflow_client --alter $add variable $var "%$ecf%" $node
      done
  • The file name is changed, ending with .ecf instead of .sms.

  • simply copy or link the original file from .sms into .ecf

  • alternatively, define a variable ECF_EXTN in the definition file:

    edit ECF_EXTN .sms

    This requests that the ecFlow server uses .sms wrappers as the task template. In some cases, no files will need translation (no SMS variables, no CDP calls)

  • smsmicro is replaced with ecf_micro, when needed

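    ==========  ==========  ===============
    SMS         ecFlow      location
    ==========  ==========  ===============
    SMSMICRO    ECF_MICRO   definition file
    %smsmicro   %ecf_micro  script .ecf .h
    ==========  ==========  ===============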

  • In ECMWF Operations, in the main branch, amongst 1394 files, only 43 use SMS system variables, i.e. variables whose name starts with SMS. Among all the suites MetApps is in charge of, amongst 3738 files, 216 are affected. Extracting these variables, we have:

    ============
    %SMS in .sms
    ============
    SMSCHECK
    SMSCHECKOLD
    SMSDATE
    SMSFILES
    SMSHOME
    SMSHOST
    SMSINCLUDE
    SMSJOBOUT
    SMSLOG
    SMSNAME
    SMSNODE
    SMSTRYNO
    SMSURLBASE
    SMS_PROG
    ============

    Similarly, we can identify all scripts that call the CDP text client.
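
    Both inventories can be produced with standard tools. The following is a minimal sketch, not the procedure used to build the table above: the function names list_sms_vars and list_cdp_callers are hypothetical, and it assumes a grep that supports the -o option (GNU and BSD grep do).

    ```shell
    # list_sms_vars DIR
    # Print each distinct SMS system variable referenced as %SMS...% in the
    # *.sms wrappers found under DIR, one name per line.
    list_sms_vars() {
      find "$1" -type f -name '*.sms' -exec grep -oh '%SMS[A-Z_]*' {} + |
        tr -d '%' | LC_ALL=C sort -u
    }

    # list_cdp_callers DIR
    # Print the *.sms wrappers under DIR that mention the cdp text client.
    list_cdp_callers() {
      find "$1" -type f -name '*.sms' -exec grep -lw 'cdp' {} + | LC_ALL=C sort
    }
    ```

    Running list_sms_vars . from the directory holding the wrappers gives the list of variables to translate; list_cdp_callers . gives the wrappers that need manual review.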

    It is a good design principle to create tasks that are independent of SMS system variables. Only the tasks in charge of “advanced use” are concerned: SMSTRYNO was used to make a job aware of its instance number, enabling verbose output in case of rerun.

    A one-step translation consists of running the scripts through a filter that can be used both for expanded SMS definition files and for task wrappers:

    > sed -f sms2ecf-min.sed X.sms > X.ecf

    where sms2ecf-min.sed contains:

    #!/bin/sed -f
    /^  *action  */d 
    /^  *edit ECF_DATE  */d
    s:SMSNAME:ECF_NAME:g
    s:SMSNODE:ECF_NODE:g
    s:SMSPASS:ECF_PASS:g
    s:SMS_PROG:ECF_PORT:g
    s:SMSINCLUDE:ECF_INCLUDE:g
    s:SMSFILES:ECF_FILES:g
    s:SMSTRYNO:ECF_TRYNO:g
    s:SMSTRIES:ECF_TRIES:g
    s:SMSHOME:ECF_HOME:g
    s:SMSRID:ECF_RID:g
    s:SMSJOB:ECF_JOB:g
    s:SMSJOBOUT:ECF_JOBOUT:g
    s:SMSOUT:ECF_OUT:g
    s:SMSCHECKOLD:ECF_CHECKOLD:g
    s:SMSCHECK:ECF_CHECK:g
    s:SMSLOG:ECF_LOG:g
    s:SMSLISTS:ECF_LISTS:g
    s:SMSPASSWD:ECF_PASSWD:g
    s:SMSSERVERS:ECF_SERVERS:g
    s:SMSMICRO:ECF_MICRO:g
    s:SMSPID:ECF_PID:g
    s:SMSHOST:ECF_HOST:g
    s:SMSDATE:ECF_DATE:g
    # SMSURLCMD must be handled before SMSURL, otherwise it becomes ECF_URLCMD
    s:SMSURLCMD:ECF_URL_CMD:g
    s:SMSURLBASE:ECF_URLBASE:g
    s:SMSURL:ECF_URL:g
    s:SMSCMD:ECF_JOB_CMD:g
    s:SMSKILL:ECF_KILL_CMD:g
    s:SMSSTATUSCMD:ECF_STATUS_CMD:g
    s:SMSWEBACCESS:ECF_WEBACCESS:g
    s:SMS_VERS:ECF_VERS:g
    s:SMS_VERSION:ECF_VERSION:g
    /edit ECF_INCLUDE/ {
    s:/include:/include_ecf:g
    }
    /edit ECF_INCLUDE/ {
    s:_prod:_prod_ecf:g
    }
    /edit ECF_FILES/ {
    s:_prod:_prod_ecf:g
    }
    s:smshostfile:ecf_hostfile:g
    s:sms_hosts:ecf_hosts:g
    
    

    Applying such a filter to all SMS task wrappers can be automated:

    #!/bin/ksh
    files=$(find . -type f -name "*.sms")  ## all sms wrappers
    for f in $files ; do
      ecf=$(basename $f .sms).ecf          ## ecf task name
      sed -f sms2ecf-min.sed $f > $ecf     ## translate
      ## no difference: replace the copy with a link to the original
      diff $f $ecf > /dev/null && rm $ecf && ln -sf $f $ecf
    done

    Symbolic links between SMS wrappers can be preserved:

    #!/bin/ksh
    files=$(find . -type l -name "*.sms")
    for f in $files ; do
    ecf=$(basename $f .sms).ecf        ## ecf task name
    link=$(readlink $f)
    dir=$(dirname $f); cd $dir
    ln -sf $link $ecf
    cd -
    done

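    Special attention is needed for the variables whose new name is not obtained by a simple prefix change:

    =============  ===============
    SMS            ecFlow
    =============  ===============
    SMSCMD         ECF_JOB_CMD
    SMSKILL        ECF_KILL_CMD
    SMSSTATUSCMD   ECF_STATUS_CMD
    SMSURLCMD      ECF_URL_CMD
    =============  ===============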

    It is not a good idea to systematically replace SMS with ECF_. For example, we use the variables NO_SMS and LSMSSIG, which are not related to SMS and would be corrupted by such a replacement.
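
    The effect can be seen on a one-line sample. This is only an illustration: blind and targeted are hypothetical helper names, and the targeted list is deliberately cut down to two variables.

    ```shell
    # Blind prefix replacement: rewrites every occurrence of SMS (to be avoided).
    blind()    { sed -e 's:SMS:ECF_:g'; }

    # Targeted replacement: only rewrites names taken from an explicit list.
    targeted() { sed -e 's:SMSTRYNO:ECF_TRYNO:g' -e 's:SMSNAME:ECF_NAME:g'; }

    sample='NO_SMS=1 LSMSSIG=2 try=%SMSTRYNO%'
    echo "$sample" | blind      # NO_ECF_=1 LECF_SIG=2 try=%ECF_TRYNO%
    echo "$sample" | targeted   # NO_SMS=1 LSMSSIG=2 try=%ECF_TRYNO%
    ```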

  • If we want to run the same job under both SMS and ecFlow, %SMSXXX% may be replaced with shell variables ECF_XXX. Then, in a header file, we define ECF_XXX=%SMSXXX:0% for SMS mode and ECF_XXX=%ECF_XXX:0% for ecFlow mode.

  • All tasks calling CDP directly must be treated carefully and text client commands replaced with their ecFlow counterpart. They may force complete a family or a task, requeue a job or change a variable value:

    #!/usr/bin/env cdp
    cdp << EOF
    define ERROR {
      if(rc==0) then exit 1; endif
    }
    
    set SMS_PROG %SMS_PROG%
    login %SMSNODE% %USER% 1 ; ERROR
    suites -s %SUITE%
    loop task ( $missing ) do
      force -r complete /%SUITE%/%FAMILY%/tc\$task ; ERROR
    endloop
    exit
    EOF

    The ECF_PORT variable gives us the ability to discriminate between jobs under ecFlow control or not:

    #!/bin/ksh
    
    if [ %ECF_PORT:0% -gt 0 ] ; then
      for task in $missing; do
        ecflow_client --force complete recursive /%SUITE%/%FAMILY%/tc$task
      done
    else
    
    cdp << EOF
    define ERROR {
      if(rc==0) then exit 1; endif
    }
    
    set SMS_PROG %SMS_PROG%
    login %SMSNODE% %USER% 1 ; ERROR
    suites -s %SUITE%
    loop task ( $missing ) do
      force -r complete /%SUITE%/%FAMILY%/tc\$task ; ERROR
    endloop
    exit
    EOF
    fi
    
  • SMS child commands may also be called in a few SMS task wrappers. These should again be replaced with their ecFlow equivalents.
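
For the child commands, a filter in the same spirit as the variable renaming can do most of the work. The sketch below is an assumption-laden example rather than an official tool: sms2ecf_child is a hypothetical name, only six child commands are covered, and it only rewrites a command that starts a line (after optional indentation), so anything more unusual still needs a manual check.

```shell
# sms2ecf_child: read a wrapper on stdin and write it to stdout with SMS
# child commands rewritten as ecflow_client calls.
sms2ecf_child() {
  sed \
    -e 's/^\([[:space:]]*\)smsinit[[:space:]][[:space:]]*/\1ecflow_client --init=/' \
    -e 's/^\([[:space:]]*\)smsevent[[:space:]][[:space:]]*/\1ecflow_client --event=/' \
    -e 's/^\([[:space:]]*\)smsmeter[[:space:]][[:space:]]*/\1ecflow_client --meter=/' \
    -e 's/^\([[:space:]]*\)smslabel[[:space:]][[:space:]]*/\1ecflow_client --label=/' \
    -e 's/^\([[:space:]]*\)smscomplete[[:space:]]*$/\1ecflow_client --complete/' \
    -e 's/^\([[:space:]]*\)smsabort[[:space:]]*$/\1ecflow_client --abort/'
}
```

It can be chained with the variable filter, for example sed -f sms2ecf-min.sed X.sms | sms2ecf_child > X.ecf.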

 

There is no single right choice of language for task templates. It is simple to design a task written in pure Python or pure Perl. We tend to use ksh scripting for task templates for the following reasons:

  • trap ERROR 0: to catch any exit from the script and call the ERROR function, so that an early exit does not go unreported
  • set -e: to raise an error as soon as a command exits with a non-zero status
  • set -u: to treat the use of an undefined variable as an error
  • set -x: to display each command before execution
  • PS4 variable: to time-stamp the trace and evaluate each line's run time
  • trap: to redirect internal/external signals to the ERROR function

Task headers can be used to make common what can be shared among multiple tasks (head.h, tail.h, trap.h, rcp.h, qsub.h).
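
The options listed above can be combined into a preamble along the following lines. This is a sketch rather than the content of any particular head.h: report is a hypothetical stand-in for the real child commands (such as ecflow_client --abort and ecflow_client --complete), and run_task is a hypothetical wrapper that keeps the options and traps local to a subshell so that the example can be tried from an interactive shell.

```shell
# Hypothetical stand-in for the child command that informs the server.
report() { echo "child command: $*"; }

# run_task CMD [ARGS...]: run CMD with the error handling of a task wrapper.
# The body is a subshell, so options and traps do not leak to the caller.
run_task() (
  ERROR() {
    set +e          # do not fail again while reporting
    report abort    # tell the server that the task went wrong
    trap - 0        # remove the exit trap before leaving
    exit 0
  }
  set -e                      # stop on the first failing command
  set -u                      # stop on undefined variables
  PS4='+ ${SECONDS:-}s: '     # prefix each traced line with elapsed seconds
  set -x                      # trace commands on stderr
  trap ERROR 0                # any exit before the end goes through ERROR
  trap ERROR 1 2 3 15         # so do the usual termination signals
  "$@"                        # the functional part of the task
  trap - 0                    # normal end: disarm the exit trap
  report complete
)
```

Calling run_task true reports complete, while run_task exit 3 leaves early and is caught by the exit trap, which reports abort. In a real template the first part would typically live in head.h and the last two lines in tail.h.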