...


OpenIFS can fail with Intel compiler at -O2

There is an issue with OpenIFS when compiling with the Intel compiler at optimisation level -O2 or above on processors that support SSE4.1 and AVX instructions. At -O2, Intel compilers generally optimise more aggressively than other compilers.

Users will see a failure in the T21 test job similar to the following:

Code Block
Sample failure message
signal_harakiri(SIGALRM=14): New handler installed at 0x432110; old preserved at 0x0
 ***Received signal = 8 and ActivatED SIGALRM=14 and calling alarm(10), time = 6.18
 myproc#1,tid#1,pid#27600,signal#8(SIGFPE): Received signal :: 123MB (heap), 125MB (rss), 0MB (stack), 0 (paging), nsigs 1, time 6.18
 tid#1 starting drhook traceback, time = 6.18
 myproc#1,tid#1,pid#27600: MASTER 
 myproc#1,tid#1,pid#27600: CNT0<1> 
 myproc#1,tid#1,pid#27600: CNT1 
 myproc#1,tid#1,pid#27600: CNT2 
 myproc#1,tid#1,pid#27600: CNT3 
 myproc#1,tid#1,pid#27600: CNT4 
 myproc#1,tid#1,pid#27600: STEPO 
 myproc#1,tid#1,pid#27600: SCAN2H 
 myproc#1,tid#1,pid#27600: SCAN2M 
 myproc#1,tid#1,pid#27600: GP_MODEL 
 myproc#1,tid#1,pid#27600: EC_PHYS_DRV 
 myproc#1,tid#1,pid#27600: >OMP-PHYSICS CLDPP T/S (1002) 
 myproc#1,tid#1,pid#27600: EC_PHYS 
 myproc#1,tid#1,pid#27600: CALLPAR 
 myproc#1,tid#1,pid#27600: SLTEND 

The failure arises because the Intel compiler uses 2-way vectorization for IF statements, which evaluates both branches. If a divide by zero is possible in the branch that would not otherwise be executed, this can generate a floating point exception, and the model aborts when the IFS internal signal handler (DRHOOK) is enabled.

There are several possible workarounds:

  1. Compile the routines that cause the problem at a lower optimisation level, -O1. The routines affected are: sltend.F90, vsurf_mod.F90, vdfmain.F90, vdfhghtn.F90.
  2. Run with the environment variable DR_HOOK_IGNORE_SIGNALS=8 set, which disables trapping of floating point exception signals (SIGFPE) by the model. This is not ideal, as the model will then not catch other causes of floating point exceptions either.
  3. Edit the code and insert the line:

    Code Block
     !DEC$ OPTIMIZE:1

    directly after the SUBROUTINE statement in each of the routines: sltend.F90, vsurf_mod.F90, vdfmain.F90, vdfhghtn.F90.

  4. Edit the intel-*.cfg configuration files in make/cfg and add lines to change the compile options specifically for these files.
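When applying workaround 3, it can help to confirm that each affected routine actually contains the directive. The following is a minimal sketch, not part of OpenIFS: the function name check_optimize_directive is hypothetical, and the file paths are passed as arguments because the location of the routines in the source tree is not assumed here.

```shell
#!/bin/bash
# Report which of the given Fortran source files contain the Intel
# directive from workaround 3. Hypothetical helper, not part of OpenIFS.
# Returns 0 if every file contains the directive, 1 otherwise.
# A file that cannot be read is reported as "missing".
check_optimize_directive() {
    local missing=0
    local f
    for f in "$@"; do
        # -F: match the text literally, -i: ignore case, -q: no output
        if grep -qiF '!DEC$ OPTIMIZE:1' "$f"; then
            echo "ok:      $f"
        else
            echo "missing: $f"
            missing=1
        fi
    done
    return "$missing"
}
```

Call it with the paths to sltend.F90, vsurf_mod.F90, vdfmain.F90 and vdfhghtn.F90 in your source tree. It only checks that the directive is present somewhere in each file, not that it sits directly after the SUBROUTINE statement.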

OpenIFS uses a default of -O1 in the configuration files. If you increase the optimisation level, please be aware of this issue.
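To see whether a configuration file has been changed from the -O1 default, one option is to search it for higher optimisation flags. This is a rough sketch only: the function name find_high_opt is hypothetical, and it matches the flag text anywhere on a line without parsing the configuration file format, so commented-out lines are reported too.

```shell
#!/bin/bash
# List lines in the given configuration files that mention -O2 or -O3.
# Hypothetical helper, not part of OpenIFS.
# Exit status is 0 if at least one such line is found, 1 if none.
find_high_opt() {
    # -n: show line numbers, -H: show file names, -E: extended regex.
    # The flag must not be embedded in a longer alphanumeric word.
    grep -nHE -e '(^|[^[:alnum:]])-O[23]([^[:alnum:]]|$)' "$@"
}
```

For example, from the top-level OpenIFS directory: find_high_opt make/cfg/intel-*.cfg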

For more help with this issue, please contact openifs-support@ecmwf.int.

...


GNU compilers


OpenIFS 38r1 fails with gfortran version 5 compiler

OpenIFS 38r1 is known to fail when using the gfortran/gcc version 5.2 compiler. The error is:

Code Block
SUDIM1; after call to read(namgfl), nmfdiaglev =            0
Error in `../make/gnu-noopt/oifs/bin/master.exe': double free or corruption (out): 0x0000000009fafd90 ***

If this occurs, we recommend using version 4.8.1 of the GNU compilers. There is currently no fix for this issue in OpenIFS releases based on 38r1.
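A build script could warn about an affected compiler before compilation starts. The sketch below is an illustration only: the function name is_gfortran_5 is hypothetical, and it treats any 5.x release as affected, following the heading above, although the failure is only confirmed here for 5.2.

```shell
#!/bin/bash
# Return 0 if the given version string has major version 5, else 1.
# Hypothetical helper, not part of OpenIFS.
is_gfortran_5() {
    local version="$1"
    local major="${version%%.*}"   # text before the first '.'
    [ "$major" = "5" ]
}

# Query the compiler only if gfortran is available on PATH.
if command -v gfortran >/dev/null 2>&1; then
    ver="$(gfortran -dumpversion)"
    if is_gfortran_5 "$ver"; then
        echo "Warning: gfortran $ver detected; OpenIFS 38r1 is known to fail with gfortran 5.2." >&2
    fi
fi
```

The warning goes to standard error so that it does not mix with normal build output.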

Later versions of OpenIFS (40r1+) do not fail.

 

 


Cray


Cray ATP does not work

This is caused by IFS installing its own signal handler, which takes over the signals that ATP needs to catch. To enable Cray ATP, set:

Code Block
export DR_HOOK_IGNORE_SIGNALS=-1

in the job script to completely disable any signal trapping by the IFS signal handler code 'DrHook'.
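In a job script this setting might sit alongside the switch that turns ATP on. The fragment below is a sketch under assumptions: ATP_ENABLED=1 is the usual way to enable ATP on Cray systems but is not taken from this page, and the launch line is a placeholder, so check your site documentation for both.

```shell
#!/bin/bash
# Disable all signal trapping by DrHook so that Cray ATP can handle signals.
export DR_HOOK_IGNORE_SIGNALS=-1

# Assumption: ATP is switched on with ATP_ENABLED=1 (not stated on this page).
export ATP_ENABLED=1

# Placeholder launch line (hypothetical); replace with your site's
# launcher and the path to your executable.
# aprun -n 1 ./master.exe
```

With DrHook signal trapping disabled, the DrHook traceback is no longer produced on failure; the ATP output takes its place.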

Contact openifs-support@ecmwf.int for assistance.

...