Manual

When zombies arise they can be handled manually with ecflowview (see Zombie) or via the command line interface:
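As a minimal sketch of the command line route: the helper below wraps the ecflow_client zombie commands (--zombie_get, --zombie_fob, --zombie_fail, --zombie_adopt, --zombie_block, --zombie_remove, --zombie_kill). The function names handle_zombie and list_zombies and the task path in the usage note are hypothetical. The sketch assumes ecflow_client is on the PATH and that the server is selected through ECF_HOST and ECF_PORT.

```shell
# Sketch: apply one manual zombie action to a task path.
# handle_zombie and list_zombies are hypothetical helper names, not part of ecFlow.
handle_zombie() {
    action=$1
    task_path=$2
    case "$action" in
        fob|fail|adopt|block|remove|kill) ;;   # actions offered by ecflow_client
        *) echo "unknown zombie action: $action" >&2; return 1 ;;
    esac
    # Builds e.g. --zombie_fob=/path/to/task
    ecflow_client "--zombie_${action}=${task_path}"
}

# List the zombies currently held by the server.
list_zombies() {
    ecflow_client --zombie_get
}
```

For instance, list_zombies followed by handle_zombie fob /s1/f1/t1 would first show the zombies and then let the (hypothetical) task /s1/f1/t1 continue without blocking.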

Automated

It is also possible to ask ecflow_server to make the same response in an automated fashion. However, this needs very careful consideration beforehand, otherwise it could mask a serious underlying problem.

The automated response can be defined with:

The zombie attribute is inherited in the same manner as variables.

Example: For tasks under suite “s1”, add a zombie attribute such that the child label command (i.e. ecflow_client --label) never blocks the job (not strictly needed, as this is the default behaviour from release 4.0.5 onwards):

Example: For tasks under suite “s1”, add a zombie attribute such that a job issuing the child commands event, meter or label never blocks (not strictly needed, as this is the default behaviour from release 4.0.5 onwards):

Example: For all tasks under family “critical”, if any zombies arise then fail the job:
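The three examples above could be expressed with --alter roughly as sketched below. The attribute string is assumed to have the form type:action:children:lifetime, matching strings such as "ecf:kill:init,complete:" used elsewhere on this page. The helper names zombie_attr and add_zombie are hypothetical, the list of accepted actions is an assumption to check against your ecFlow release, and /s1/critical in the usage note is an assumed path for the family "critical".

```shell
# zombie_attr TYPE ACTION [CHILDREN] [LIFETIME]
# Prints a zombie attribute string, e.g. "ecf:fob:label:".
# Hypothetical helper; the accepted values below are assumptions.
zombie_attr() {
    ztype=$1
    action=$2
    children=${3:-}     # comma-separated child commands; empty means all
    lifetime=${4:-}     # seconds; empty leaves the server default
    case "$ztype" in
        ecf|path|user) ;;
        *) echo "unknown zombie type: $ztype" >&2; return 1 ;;
    esac
    case "$action" in
        fob|fail|adopt|block|remove|kill) ;;
        *) echo "unknown zombie action: $action" >&2; return 1 ;;
    esac
    printf '%s:%s:%s:%s\n' "$ztype" "$action" "$children" "$lifetime"
}

# add_zombie NODE_PATH TYPE ACTION [CHILDREN] [LIFETIME]
# Adds the attribute to a node with ecflow_client --alter.
add_zombie() {
    node_path=$1
    shift
    attr=$(zombie_attr "$@") || return 1
    ecflow_client --alter add zombie "$attr" "$node_path"
}
```

Under these assumptions the three examples become:

       add_zombie /s1 ecf fob label
       add_zombie /s1 ecf fob event,meter,label
       add_zombie /s1/critical ecf fail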

Here are some further examples of using --alter:

You can only add one zombie attribute of each type (ecf, path, user).

To delete a zombie attribute, please use one of:
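A hedged sketch of the deletion: it assumes that --alter delete zombie takes the zombie type (ecf, path or user) as the attribute name, followed by the node path. Please confirm this against the built-in help for --alter in your release. The helper name delete_zombie_attr is hypothetical.

```shell
# delete_zombie_attr TYPE NODE_PATH
# Hypothetical helper: removes the zombie attribute of the given type from a node.
# Assumes the form: ecflow_client --alter delete zombie <type> <path>
delete_zombie_attr() {
    ztype=$1
    node_path=$2
    case "$ztype" in
        ecf|path|user) ;;
        *) echo "unknown zombie type: $ztype" >&2; return 1 ;;
    esac
    ecflow_client --alter delete zombie "$ztype" "$node_path"
}
```

For example, delete_zombie_attr ecf /suiteZ would remove the ecf zombie attribute from /suiteZ.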

Here are some more examples:

       ecflow_client --alter add zombie "ecf:kill:init,complete:" /suiteZ

       ecflow_client --alter add zombie "user:kill::" /suiteZ

       ecflow_client --alter add zombie "ecf:adopt:complete:" /suiteZ

Semi-Automated

Sometimes zombies can arise for more obscure reasons. For example, the job sends an --init message to the server while the server is busy (e.g. processing jobs). By the time the server makes the task active and replies to the job, the ecflow_client has timed out. This causes the ecflow_client to send the same message again. This time, however, the server treats the command as a zombie, since the task is already active.

These scenarios are very rare, but tend to happen in the following situations:

To diagnose these cases, look at the server log file. Typically you will see two or more --init or --complete commands for the same task, where the second is treated as a zombie.

To get round this issue you can add the variable ECF_NONSTRICT_ZOMBIES, which will reduce these false zombies.

       ecflow_client --alter add variable ECF_NONSTRICT_ZOMBIES 1 /          # adds the variable at the root/server level, and hence affects all suites on the server

       ecflow_client --alter add variable ECF_NONSTRICT_ZOMBIES 1 /suiteX    # adds the variable at the suite level, and hence only affects this suite