...

  • have learnt about how OpenIFS is installed and organised,
  • have run it at T21 resolution and know how to run it serially (1 process) and in parallel with both MPI & OpenMP,
  • know how to use grib tools to look at model output.

Connecting to the ECMWF HPC Facilities

In this tutorial, and throughout the entire workshop, you will use one of the Cray XC40 computers that are part of ECMWF's high-performance computing facility (HPCF).

...

Tasks - Connect to HPCF
  1. Start the Mobaxterm application and open a local terminal
  2. Login to the RACC using  ssh -X cluster.act.rdg.ac.uk
  3. On the cluster type the command:  ssh troifsX@ecaccess.ecmwf.int      Note:  Instead of troifsX you should use your ECMWF training user-ID

  4. You will be prompted for the hostname with a choice between ecgate, cca and ccb.  You should select ccb.
  5. When you have completed your work you can disconnect from ccb by typing exit at the command prompt.

(warning)  The following tasks need to be carried out only once in order to set up the training accounts and the OpenIFS 43r3 model.

Tasks - Set up training account - do this only once after the first login
  1. After the first login type the following command:  /home/ectrain/troifs0/setup-cray/my43
  2. Log out of the training account by typing exit
  3. Login once more using  ssh troifsX@ecaccess.ecmwf.int

  4. Type the following command:  get43

The actions above will ensure that your training account receives required scripts and shell configuration files.

The last command will copy the model binaries and input files to your account for version 43r3.


Instead of using the login nodes of the HPCF we will use an interactive session to request computing resources and fast temporary disk space. This also allows parallel jobs to be run in the terminal window.

...

Tasks - Start interactive session

% qsub -I -q df -l EC_total_tasks=6 -l EC_job_tmpdir=10G -l EC_memory_per_task=2G
qsub: waiting for job 7215630.ccbpar to start
qsub: job 7215630.ccbpar ready

The changed command line prompt indicates that we have switched from the login node to a pre/post-processing node.

(warning)  Important:  After completing your work you need to close the interactive session by typing exit, which will bring you back to the login node.

On the ECMWF HPCF an interactive session will last for 48 hrs by default, unless the walltime is specified using an additional directive.
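
As an illustration of requesting a shorter session, the sketch below only prints the interactive-session command from the task above with an optional walltime appended. It assumes the standard PBS resource name `-l walltime=HH:MM:SS`; the directive expected on the ECMWF HPCF may differ, and the helper name `interactive_cmd` is hypothetical.

```shell
# Print (do not run) the interactive-session request used in this tutorial,
# optionally with a walltime limit appended.
# ASSUMPTION: "-l walltime=HH:MM:SS" is the generic PBS resource name;
# check the local HPCF documentation before relying on it.
interactive_cmd() {
    cmd="qsub -I -q df -l EC_total_tasks=6 -l EC_job_tmpdir=10G -l EC_memory_per_task=2G"
    if [ -n "${1:-}" ]; then
        cmd="$cmd -l walltime=$1"
    fi
    echo "$cmd"
}
```

Calling `interactive_cmd 02:00:00` prints the command with a two-hour limit so that it can be reviewed before being typed at the prompt.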

OpenIFS directories

In this section we:

...

Tasks - Set OpenIFS environment
  1. Carry out the tasks above to connect to ccb and start an interactive session
  2. Type the command:  setup-43r3
  3. Change into the model's main directory:  cd perm/oifs43r3
  4. Type the command:  source ./oifs-config.ccb.sh

The setup-43r3 script sets a number of Unix shell environment variables which define the type of OpenIFS compiled installation and the location of files. These settings are specific to version 43r3.

...

  • OpenIFS has been precompiled on the HPCF.
  • All source code has been removed due to licensing restrictions.
  • OpenIFS builds 'out-of-source'; this means object (.o) files and executables (binary files) are not mixed with the source code.
  • The README file contains information about software requirements, setting up the local compilation environment, and where to get help and support.

OpenIFS T21 test forecasts

...

The directory t21test contains a number of files:

Code Block
% ls t21test
ICMGGepc8INIT   ICMSHepc8INIT  ifsdata  namelists     ref_021_0144
ICMGGepc8INIUA  README         job      ref_021_0072  run.ppn

Files beginning with 'ICM'.
These are the input files for this T21 experiment. They are in GRIB format. Do not move them from this directory. OpenIFS expects to find its input files in the same directory as the main executable.

epc8              - this is the Experiment ID. Experiment IDs are used at ECMWF, and initial conditions provided by ECMWF will always have an experiment ID.
ICMGGepc8  - 'GG' indicates these contain gridpoint fields.
ICMSHepc8  - 'SH' indicates these contain spherical harmonic fields.
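
The naming convention above can be sketched as a small shell helper that splits a file name into its parts. This is only an illustration: the function name `describe_icm` is hypothetical, and it assumes the experiment ID is always the four characters that follow the 'GG' or 'SH' marker.

```shell
# Describe an OpenIFS input file from its name, e.g. ICMGGepc8INIT.
# ASSUMPTION: the experiment ID is the 4 characters after "ICMGG"/"ICMSH".
describe_icm() {
    name=$1
    case $name in
        ICMGG*) kind="gridpoint" ;;
        ICMSH*) kind="spherical harmonic" ;;
        *) echo "$name: not an ICM file"; return 1 ;;
    esac
    rest=${name#ICM??}                          # drop "ICM" and the 2-letter type
    expid=$(printf '%s' "$rest" | cut -c1-4)    # first 4 characters
    echo "$name: $kind fields, experiment ID $expid"
}
```

For example, `describe_icm ICMSHepc8INIT` reports spherical harmonic fields for experiment epc8.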

job
Shell script to run the model. Described in more detail below.

run.ppn
Simple shell script which calls job in an interactive shell environment.

ifsdata
Climate data fields used for T21 test integration. You should not move or rename this directory as the model will expect to find the climate files it needs in a directory of this name.

...

Tasks - Run model

 Run the model:

% ./run.ppn

What happens?

The model fails. Look at the standard output (or in the NODE_001.01 file when it is created) and find the subroutine traceback. Near the top of the traceback you will find the error messages.

Whenever the model fails, it will produce this traceback (controlled by DR_HOOK=1 in the job file).
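
To locate the failure quickly in a long output file, a search such as the one sketched below may help. It assumes the abort message contains the text "ABORT" or the routine name "ABOR1", which is not shown in this tutorial, so adjust the pattern to what your output actually contains. The function name `find_abort` is hypothetical.

```shell
# Show lines that look like an abort message, with line numbers and a few
# lines of following context.
# Usage: find_abort FILE [CONTEXT_LINES]
# ASSUMPTION: the failure is reported with "ABORT" or "ABOR1" in the text.
find_abort() {
    grep -n -i -E -A "${2:-5}" 'abor1|abort' "$1" | head -n 40
}
```

A typical call would be `find_abort NODE_001.01`, or `find_abort NODE_001.01 10` for more context after each match.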

Single process test

...

Code Block
...
  17:09:08 STEP    1 H=   0:10 +CPU=  0.276
           STEP    1 :## EC_MEMINFO    1 ccbppn01     309     206       0     16530     190    16080    1188     35807   61171      0.2   0.0 s/p
  17:09:08 STEP    2 H=   0:20 +CPU=  0.280
           STEP    2 :## EC_MEMINFO    1 ccbppn01     309     206       0     16530     190    16011    1188     35740   61170      0.2   0.0 s/p
  17:09:08 STEP    3 H=   0:30 +CPU=  0.268
           STEP    3 :## EC_MEMINFO    1 ccbppn01     309     206       0     16530     190    16008    1188     35734   61170      0.2   0.0 s/p
  17:09:09 STEP    4 H=   0:40 +CPU=  0.264
           STEP    4 :## EC_MEMINFO    1 ccbppn01     309     206       0     16530     190    15966    1188     35695   61170      0.2   0.0 s/p
  17:09:09 STEP    5 H=   0:50 +CPU=  0.268
           STEP    5 :## EC_MEMINFO    1 ccbppn01     309     206       0     16530     190    16008    1188     35734   61171      0.2   0.0 s/p
  17:09:09 STEP    6 H=   1:00 +CPU=  0.004
           STEP    6 :## EC_MEMINFO    1 ccbppn01     309     206       0     16530     190    16008    1188     35734   61171      0.2   0.0 s/p

This test runs only a small number of timesteps.
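
The timestep lines shown above can be summarised with a short awk filter. The sketch below assumes the line layout is exactly as in the listing (time, STEP, step number, H=, forecast time, +CPU=, seconds); the function name `summarise_steps` is hypothetical and the log text is read from standard input.

```shell
# Summarise timestep lines of the form
#   17:09:08 STEP    1 H=   0:10 +CPU=  0.276
# The EC_MEMINFO lines are skipped because their second field is not "STEP".
summarise_steps() {
    awk '$2 == "STEP" && $6 == "+CPU=" {
             n++; cpu += $7; last = $5
         }
         END {
             printf "steps=%d last_forecast_time=%s total_cpu=%.3f\n", n, last, cpu
         }'
}
```

Redirect the file that holds the model's step output into the function, for example `summarise_steps < NODE_001.01`, to get the number of steps, the last forecast time reached and the summed CPU time.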

Model output

The model writes its output to several files.

...

The model can also be run with NPROC=2 and NTHREADS=2, that is 2 MPI processes with 2 OpenMP threads each, using 4 cores in total. This requires a computer with at least 4 cores.
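
The core requirement is simply the product of the two settings. A minimal sketch of that check follows; the helper names `cores_needed` and `fits_on_cores` are hypothetical and are not part of the OpenIFS scripts.

```shell
# Cores required for a hybrid run: MPI tasks multiplied by threads per task.
# Usage: cores_needed NPROC NTHREADS
cores_needed() {
    echo $(( $1 * $2 ))
}

# Succeeds if the run fits on the given number of cores.
# Usage: fits_on_cores NPROC NTHREADS AVAILABLE_CORES
fits_on_cores() {
    [ "$(cores_needed "$1" "$2")" -le "$3" ]
}
```

For instance, `fits_on_cores 2 2 "$(getconf _NPROCESSORS_ONLN)" && echo ok` checks the 2x2 configuration against the cores visible on the current node.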

Acceptance testing

The final step is to check that the model produces numerical answers within acceptable limits, even if it runs the short tests above without failing. OpenIFS includes code that computes internal statistical norms and compares them against numbers supplied by ECMWF. The file ref_021_0144 in the t21test directory contains statistical norms computed by the model run at ECMWF.

Task - run acceptance test

 To do the acceptance test, edit the namelists in fort.4 and look for the NAMCT0 namelist:

Code Block
&NAMCT0
 LREFOUT=false,
 NSTOP=6,

Change the number of timesteps to 72 to run the model for 12 hours (assuming you have not changed the default timestep of 10 minutes at T21) and set the variable LREFOUT to true:

Code Block
&NAMCT0
 LREFOUT=true,
 NSTOP=72,
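
The same edit can be scripted rather than made by hand. The sketch below first computes the number of timesteps for a given forecast length and then rewrites the two namelist lines with sed. The function names are hypothetical, and the sed expressions assume LREFOUT and NSTOP each sit on their own line, as in the excerpt above, and occur only once in the file.

```shell
# Number of timesteps needed to cover a forecast length.
# Usage: nstop_for_hours HOURS TIMESTEP_MINUTES
nstop_for_hours() {
    echo $(( $1 * 60 / $2 ))
}

# Set LREFOUT=true and NSTOP=<value> in a namelist file.
# A backup of the original is kept as FILE.bak.
# ASSUMPTION: each variable is on its own line and appears only once.
# Usage: set_acceptance_namelist FILE NSTOP
set_acceptance_namelist() {
    sed -i.bak \
        -e 's/^\([[:space:]]*LREFOUT=\).*/\1true,/' \
        -e "s/^\([[:space:]]*NSTOP=\).*/\1$2,/" \
        "$1"
}
```

With the 10 minute timestep, `set_acceptance_namelist fort.4 "$(nstop_for_hours 12 10)"` writes NSTOP=72 and leaves the previous version in fort.4.bak.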



With LREFOUT=true, at the last timestep OpenIFS will read the ref_021_0144 file and produce a new file: res_021_0144 (note the similar filenames!). The contents of the file should be similar to:

...

As long as the model reports 'calculations are correct' and the error is less than a few percent then the model is behaving satisfactorily in your compilation and run environment.

How to control model output

...

In this example, the model will write 3 separate output files at the first timestep (0hrs), 3hrs and 9hrs and then no more regardless of how long the model runs for.

How to change the output variables and post-processing

...