
Heikki Järvinen, Sami Saarinen, Per Undén

This guide describes the blacklist language and its usage in the IFS.

1 Introduction

 

In the operational suite on the Cray computer, the blacklist was essentially a list of undesired stations to be excluded from the analysis in operations (and usually in prepan experiments, too), based on monthly monitoring by the Operations Department. The blacklisting technique has been streamlined as part of the migration of operational codes from Cray to Fujitsu.

A new blacklist format has been introduced that allows considerably more flexibility in deciding how observations are used. The blacklist now consists of two parts: a data selection part and a monthly monitoring part. The data selection part contains information about which variables will be used in the assimilation, and it should be amended only rarely, except in experimentation. The monthly monitoring part, on the other hand, will be updated fairly frequently as a result of data monitoring. The former automatic ship blacklist is no longer supported.

This guide comprehensively describes the format of the blacklist language developed at ECMWF during the migration project in 1995-96, based on an initial idea by Mats Hamrud.

2 The Blacklist language

 

Blacklisting in the IFS context now works as follows. One edits a blacklist file, which is written in a specific format. That file is then converted into a subroutine (in the C language) using the blacklist compiler. The subroutine is then compiled and linked into the executable. This external routine is called from the IFS with a list of arguments in the observation screening run. The IFS then receives a few flags telling whether to reject or accept this station or variable for assimilation. The following example clarifies the concepts used in blacklisting.

 

if (OBSTYP = synop) then
if VARIAB in (u10m, v10m)
and LSMASK = land
and abs(LAT) < 25 then
fail(constant);
endif
endif;

 

There are several patterns in this single blacklisting rule and in the following they will be called:

 

  • variables, like OBSTYP, VARIAB, LSMASK, LAT (see 2.1)
  • keywords, like synop, u10m, v10m, land, constant (see 2.2)
  • statements, like if-then-elif-endif-block (see 2.3.1)
  • operators, like and, in, =, < (see 2.3.2)
  • built-in functions, like abs (see 2.4)
  • actions, like fail (see 2.5)

 

Variables get their values from IFS. These are compared against the keywords or values given in the blacklist. If the blacklist rule is true, the fail-function takes action: it activates the blacklisting flags and returns control to the calling routine in IFS. Note that the blacklist language is case insensitive and no column orientation is required.
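
To make the evaluation order concrete, the example rule can be mirrored in Python. This is only an illustrative sketch: the numeric keyword codes are those shown in the converted example of section 5.1, while the function name screen and its plain integer return value are hypothetical and not part of the blacklist software.

```python
# Keyword codes as they appear in the converted example of section 5.1.
SYNOP = 1            # OBSTYP keyword
U10M, V10M = 41, 42  # VARIAB keywords
LAND = 1             # LSMASK keyword
CONSTANT = 2         # fail() code for constant blacklisting


def screen(obstyp, variab, lsmask, lat):
    """Hypothetical stand-in for the compiled rule; returns the fail code or 0."""
    if obstyp == SYNOP:
        if variab in (U10M, V10M) and lsmask == LAND and abs(lat) < 25:
            return CONSTANT  # fail(constant): control returns to the caller
    return 0  # no rule fired, so the datum is not blacklisted


if __name__ == "__main__":
    print(screen(SYNOP, U10M, LAND, 10.0))  # tropical land SYNOP 10 m wind
    print(screen(SYNOP, U10M, LAND, 40.0))  # same variable outside the tropics
```

In the real system the values of OBSTYP, VARIAB, LSMASK and LAT are supplied by IFS; here they are passed in as ordinary function arguments.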

 

2.1 Variables

A list of variables that are currently defined in IFS is given below. For adding new variables, see 5.3.
 

2.1.1 Report characteristics

An up-to-date list of variables related to the observation header and model fields can be found on the HPCF in the external file of the blacklist (for instance /home/rd/rdx/data/37r3/an/external_bl_mon_monit.b for CY37R3).

 Variable Meaning Possible values
 obstyp observation type Keyword (as listed below)
 statid station id Right justified 8 character string
 codtyp code type Integer value as defined in IFS
 instrm instrument type Integer value as defined in IFS
 date date Packed integer YYMMDD
 time time Packed integer HHMMSS
 lat latitude Real value in degrees (-90<=LAT<=90)
 lon longitude Real value in degrees (-180<LON<=180)
 stalt station altitude Real value in metres
 line_sat line position atovs Integer value
 retr_type retrieval type Integer value
 qi_fc EUMETSAT Quality Indicators: with forecast dependence 
 rff CIMSS Quality Indicator: Recursive Filter Flag 
 qi_nofc EUMETSAT Quality Indicators: without forecast dependence 
 sensor satellite sensor indicator (for RTTOV) Integer value
 fov field of view number Integer value
 satza satellite zenith angle Real value (in degrees)
 nandat analysis date Packed integer YYMMDD
 nantim analysis time Packed integer HHMMSS
 soe solar elevation Real value
 qr quality of retrieval 
 clc cloud cover 
 cp cloud top pressure 
 pt product type Integer value
 sonde_type sonde type Integer value
 specific amsua=clwp on sea 
 gen_centre Generating centre Integer value (WMO defined)
 gen_subcentre Generating sub-centre Integer value (WMO defined)
 datastream Data stream (see datastream in odb) Integer value
 ifs_cycle six digit IFS-cycle f.ex 331001 for CY33R1.001 6 digit integer value
 retrsource retrieval source Integer value
 surftype surface type indicator 
 sza solar zenith angle Real Value
 reportype MARS reportype Integer value for MARS archiving
 solar_hour solar hour Real value
 satellite_identifier satellite identifier Integer value
 station_identifier station identifier (for some conventional only) Integer value (similar to statid but for integer values only)
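
Since statid is described as a right-justified 8-character string, comparisons against it need the same padding. The following sketch shows one way to build such a string; the helper name pad_statid is hypothetical, and padding with blanks is an assumption based on the description in the table.

```python
def pad_statid(ident, width=8):
    """Right-justify a station identifier to `width` characters (hypothetical helper)."""
    if len(ident) > width:
        raise ValueError("identifier longer than %d characters" % width)
    # Assumption: the padding character is a blank.
    return ident.rjust(width)


if __name__ == "__main__":
    print(repr(pad_statid("3ELC")))
```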

 

2.1.2 Model/first guess characteristics

 Variable Meaning Possible values
 modps model surface pressure Real Value
 modts model surface temperature Real Value
 modt2m model 2 metre temperature Real Value
 modtop  model top level pressure (hPa) Real Value
 sea_ice model sea-ice fraction  Real Value

 

2.1.3 Observation characteristics

External variables (SPECIAL, i.e. related to obs. body entry only)

 Variable Meaning Possible values
 variab variable name (varno in ODB)  Integer value
 vert_co  type of vert. coord.  Integer value
 press pressure (hPa)  Real value
 press_rl ref. level press. (hPa)  Real value
 ppcode  synop press. code  Integer value
 obs_value observed value  Real value
 fg_departure  first guess depart.  Real value
 obs_error observation error  Real value
 fg_error first guess error  Real value
 winchan_dep window chan dep  Real value
 obs_t Obs temperature at same level, for R/S only.  Real value
 elevation Radar elevation  Real value
 winchan_dep2 alternative window chan dep  Real value
 tausfc Surface transmittance for AIRS screening.  Real value
 csr_pclear  percentage of clear pixel (GEOS) Real value

 

2.2 Keywords


Keywords are fixed values against which certain variables are compared. They should be consistent with the IFS definitions. The keywords currently defined in the blacklist (in the external file of the blacklist) are listed below. Adding new keywords is straightforward.

 Variable Keyword
 OBSTYP

 synop, airep, satob, dribu, temp, pilot, satem, paob, scatt, limb, gbrad

(or integer values as defined in IFS)

 CODTYP

rtovs, tovs, ssmi, meris, am_profiler, jp_profiler, eu_profiler, templand, tempship, dropsonde, reo3, metar, pgps, radar_rr, rad1c, satem500, satem250

(or integer values as defined in IFS)

 SENSOR hirs, msu, ssu, amsua, amsub, ssmi_sensor, vtpr1, vtpr2, tmi, ssmis, airs, mhs, iasi, amsre, meteosat, msg, geosimg, mtsatimg, windsat, mwts, iras, mwri, envisat
 INSTRM mipas, gome, gomos, sciamachy, seviry, gome2, omi, toms, sbuv, auramls, iasi_reo3, modis_sensor, mopitt
 VARIAB u,v,z, z, dz, rh, q, pwc, rh2m, t, td, t2m, td2m, ts, ptend, w,ww, vv, ch, cm, cl, nh, nn, hshs, c, ns, s, e, tgtg, spsp1, spsp2, rs, eses, is, trtr, rr,jj,vs,ds, hwhw, dwdw, gclg, rhlc, rhmc, n, snra, ps, dd, ff, rawbt, rawra, satcl, scatss, du, dv, u10m, v10m, rhlay, auxil, cllqw, ambigv, ambigu, apdss, ro_bangk, rrefl, o3, hlos, no2, so2, co, hcho, go3, co2, ch4, aod, rao, od, rfltnc, lnprc
 LSMASK sea, land
 RLMASK tovsland
 PPCODE psealev, pstalev, g850hpa, g700hpa, p500gpm, p1000gpm, p2000gpm, p3000gpm, p4000gpm, g900hpa, g500hpa
 VERT_CO pressure, height, tovs_cha, sca
 RETR_TYP for TOVS cloudy, partly_cloudy, clear
 RETR_TYP for Satob wvcl, ir, vis, wv, comb_spec_channels, wvmw, wvcl1, wvcl2, wvcl3, ir1, ir2, ir3, vis1, vis2, vis3, wvmw1, wvmw2, wvmw3
 SONDE_TYPE for radiosondes st_avk_mrz, st_rs80_usa, st_rs80, st_rs90, st_viz
 DATASTREAM ears, pacrars, dbmodis
 ODB constants rmdi, ndmi (real values as defined in ODB)

 

 

 

 

 


 

 


 

2.3 Statements and operators

 

2.3.1 IF-statement syntax


The IF-statement syntax (note the semicolon (;) after each statement):

 Syntax Meaning

 

 if (condition) then
      statement_1;
      statement_2;
      etc.

 elif (condition) then
   statement_1;
   statement_2;
   etc.

 else
   statement_1;
   statement_2;
   etc.

endif 

IF-test with optional ELIF/ELSE-blocks.

Nested IF-tests are valid in every statement. Every IF-THEN or IF-THEN-ELSE must be matched by an ENDIF.

The condition can be any logical or arithmetic expression.

2.3.2 List of the simple operators

 

A list of operators that are currently defined in the Blacklist-language:

 


 

2.3.3 List of more complex operators

Somewhat more complex operators can also be used to simplify coding. For example the compound AND-operators below:


 

2.4 Built-in functions


The Blacklist-language also contains some built-in functions. They are listed below:


 

 


 

In addition, there is one special function to study whether a point is within a circular area on the Earth (e.g. to blacklist Meteosat SATOBs if they are too far away):

if (not (rad (0, 0, 45, LAT, LON))) then fail(monthly); endif;

The function is called rad() and requires five (5) arguments. It returns one (1) if the observation is within the circle, otherwise zero (0). The usage is

rad(reflat, reflon, refdeg, LAT, LON)

where refdeg is the radius of the circle on the Earth with (reflat, reflon) as its center point. (LAT, LON) is the position of the observation to be checked, i.e. the LAT and LON of the report. All values are given in degrees. See also Figure 2.1.

 

  
Figure 2.1: Schematic view of the rad()-function parameters.

 

The following arithmetic is performed in the function rad():

  1. Convert all degrees to radians
  2. Calculate angle distance (in radians) relative to the center point
    obsdeg = acos( cos(reflat) cos(LAT) cos(LON-reflon) + 
    sin(reflat) sin(LAT) )
  3. Return one from rad, if obsdeg ≤ refdeg, otherwise zero.
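
The three steps can be sketched in Python as follows. This is an illustration of the arithmetic only, not the library's actual C implementation; the clamping of the cosine before acos is an addition of this sketch to guard against rounding error.

```python
import math


def rad(reflat, reflon, refdeg, lat, lon):
    """Return 1 if (lat, lon) lies within refdeg degrees of (reflat, reflon), else 0."""
    # Step 1: convert all degrees to radians.
    rlat0, rlon0, rref = (math.radians(x) for x in (reflat, reflon, refdeg))
    rlat, rlon = math.radians(lat), math.radians(lon)
    # Step 2: angular distance (in radians) relative to the center point.
    cosd = (math.cos(rlat0) * math.cos(rlat) * math.cos(rlon - rlon0)
            + math.sin(rlat0) * math.sin(rlat))
    # Clamp to [-1, 1] before acos (an assumption of this sketch, see lead-in).
    obsdeg = math.acos(max(-1.0, min(1.0, cosd)))
    # Step 3: return one if the distance does not exceed the radius, else zero.
    return 1 if obsdeg <= rref else 0


if __name__ == "__main__":
    # Mirrors the rule "if (not (rad (0, 0, 45, LAT, LON))) then fail(monthly); endif;"
    for lat, lon in [(0.0, 0.0), (30.0, 30.0), (0.0, 90.0)]:
        print(lat, lon, "inside" if rad(0, 0, 45, lat, lon) else "outside")
```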

2.5 Actions


Finally, perhaps the most important function is fail(). It returns information back to the application.

The fail()-function takes a variable number of arguments. If no arguments are given, the first argument is assumed to be the keyword monthly, i.e. rejection occurs in the monthly monitoring part of the blacklist-file. If the second argument (the seriousness of the blacklisting) is omitted, the seriousness is assumed to be equal to one.

Arguments in the fail(arg1, arg2)-function are:

 Argument#1 (arg1) Meaning
 monthly monthly monitoring (default)
 constant constant blacklisting
 experimental experimental blacklisting
 use_emiskf_only emiskf blacklisting

 Argument#2 (arg2) Meaning
 level Level of seriousness of blacklisting. Range is [0..1]. Default = 1

When a call to the fail()-function occurs, the control is returned immediately to the calling application. Normally the application is the IFS, which will get the following (Fortran) variables updated:

 Variable Type Meaning
 NCMBLI Integer Blacklisting indicator:
   0 = not blacklisted (default)
   1 = monthly monitoring
   2 = constant monitoring
   3 = experimental
   4 = use for emiskf only
 ZCMCCC Real Seriousness of the blacklisting:
   0 = default if not blacklisted
   1 = default if blacklisted (i.e. NCMBLI > 0)
   [0.01...0.99] for non-complete blacklisting (optional)
 FEEDBACK Integer Feedback vector telling which variable(s) caused the blacklisting to occur:
   0 = blacklist line number where the fail()-function took action
   1-N = pointers to the variable indices that help locate the responsible variables

ZCMCCC can take a range of values: a value less than one, taken together with other information in the quality control, may still lead to the use of this variable in the assimilation. This option of non-strict blacklisting increases the flexibility in the use of observations.
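
The argument defaults of fail() and the values handed back to the application can be summarised in a small Python model. It is a sketch for illustration only: the function returns an (NCMBLI, ZCMCCC) pair instead of updating Fortran variables, the FEEDBACK vector is left out, and the range check on the level is an assumption drawn from the stated range [0..1].

```python
# Blacklisting indicator values as listed for NCMBLI.
MONTHLY, CONSTANT, EXPERIMENTAL, USE_EMISKF_ONLY = 1, 2, 3, 4

# What the application sees when no rule calls fail().
NOT_BLACKLISTED = (0, 0.0)


def fail(code=MONTHLY, level=1.0):
    """Hypothetical model of fail(arg1, arg2); returns (NCMBLI, ZCMCCC)."""
    if not 0.0 <= level <= 1.0:
        raise ValueError("level must lie in [0, 1]")  # assumed check
    return code, level


if __name__ == "__main__":
    print(fail())                   # both arguments defaulted
    print(fail(CONSTANT))           # seriousness defaulted
    print(fail(EXPERIMENTAL, 0.5))  # non-complete blacklisting
```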

2.6 Variable declaration

Variables must be declared if data is to be passed from an application (like IFS) into the blacklist. This is normally done through an external declaration (see 4.2 or 5.1). Selected variables can also be protected by defining them as constants.

Additional or local variables can be defined anywhere in the code, even within an IF-THEN-ELSE-ENDIF block (except in the IF-condition). However, any attempt to use undeclared or uninitialized variables will cause the Blacklist-compilation to fail.

The simplest variable declaration is an assignment operation.


 

3 Operational and experimental use of blacklist

3.1 Location of blacklist files

3.2 Some guidelines

Please do not place any station identifiers into the data selection part of the blacklist. Instead, put them in the monthly monitoring part. This keeps changes to the data selection part to a minimum and makes e.g. re-analysis much easier.

After any modifications to the blacklist, please remember to recompile (preferably on a workstation) to check for syntax errors.

4 Creating new blacklist file

Blacklist compilation is fully controlled by the script called blcomp. It has the following capabilities:

  • Optionally convert from an old ASCII blacklist format to a new format
  • Check the syntax of a given blacklist
  • Create a C-language file (C_code.c) tailored for observation processing
  • C-compile the C-file to create a linkable object

4.1 Usage of the blcomp

The blcomp-script has the following usage:

blcomp [-aAcCdDefiILmMnoOpSx8] blacklist_file.b (or blacklist_file.B)

where the flags are as follows:

 


 

The new BLACKLIST-file must have either the suffix ".b" or ".B". In the latter case the C-preprocessor /lib/cpp is run ahead of the BL-compiler, mainly to resolve any #include-statements.

For pure syntax checking of the new BLACKLIST-file, give:

blcomp blacklist_file.b
or
blcomp blacklist_file.B

Running blcomp without arguments prints the usage. If this does not work, check the setting of your PATH environment variable.

4.2 Conversion from old to new blacklist


Conversion from the old to the new format and syntax checking of the new BLACKLIST-file can be accomplished in the following way:

blcomp -o old_text_blacklist_file newfile.b
or
blcomp -o old_text_blacklist_file newfile.B

 

Here, the input file is old_text_blacklist_file, and output file is newfile.b (or newfile.B) in the new blacklist format.

While converting from the old to the new format, the suffix (.b or .B) of the new blacklist file plays an important role. First of all, there MUST always be a suffix. When the suffix is .b, a single blacklist file (here: newfile.b) is created with all external declarations (e.g. variable declarations) and monthly monitoring rules (the portion of the blacklist that normally does not change within a month) inlined.

If the suffix .B was used, then the following three (3) files are generated:

  • master file ( newfile.B)
  • include-file no. 1 for externals ( external_newfile.b)
  • include-file no. 2 for monthly part ( monthly_newfile.b)

The contents of the master file are simply the following two lines:

#include "external_newfile.b"
#include "monthly_newfile.b"

One way to bring in your own modifications is to create a new master-file, for example:

#include "external_newfile.b"
#include "my_own_file"
#include "monthly_newfile.b"

 

This is exactly how the data selection part comes into the production run, where the data selection part takes the place of my_own_file.
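
As an illustration of the master-file layout, the short Python sketch below assembles the #include lines for a given file stem. The helper name master_file is hypothetical; blcomp itself generates these files, and the sketch only reproduces the naming pattern described in this section.

```python
def master_file(stem, own_file=None):
    """Return the text of a master file for `stem` (hypothetical helper).

    If `own_file` is given, it is included between the externals and the
    monthly part, which is where the data selection part goes.
    """
    lines = ['#include "external_%s.b"' % stem]
    if own_file is not None:
        lines.append('#include "%s"' % own_file)
    lines.append('#include "monthly_%s.b"' % stem)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(master_file("newfile", own_file="my_own_file"), end="")
```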

4.3 C-code generation

To enable fast blacklist handling, the blacklist file is always converted into an object file (.o) meant to be linked with the (Fortran) application (like IFS) in conjunction with the blacklist object library (normally libbl95.a).

Once a blacklist file (either with .b or .B suffix) is available, it can be converted to C-language file C_code.c and compiled to an object for maximum performance. This can be done as follows:

blcomp -c blacklist_file.b
or
blcomp -c blacklist_file.B

 

4.4 Linking with an application

A Fortran-application (IFS) interfaces the blacklist via two subroutines:

  • BLACKBOX_INIT
  • BLACKBOX

The former is responsible for initializing the list of variables made active by the application. The latter handles the entire burden of interfacing with the blacklist file.

To link an application with the blacklist software, one needs not only the C_code.o object file but also the blacklist library libbl95.a. The linking command is normally:

linker application.o C_code.o /bl95path/libbl95.a other_libs

 

The exact location of the blacklist library can be found via the command:

blcomp -L

4.5 Combining conversion and object generation

If no data selection part is needed, one can combine conversion from old to new blacklist and object code generation described above:

blcomp -c -o old_text_blacklist_file newfile.b  
or
blcomp -c -o old_text_blacklist_file newfile.B

4.6 User interface

It is always recommended to (cold-)compile a modified blacklist on a workstation to check for syntax errors. If any errors are detected, the blcomp-command attempts to open an editor session and jump directly to the line where the (first) error occurred.

Sometimes this facility is not desirable; it can be disabled with the -i flag of the blcomp-command.

5 Examples

The blacklist file is normally about 1000 lines long. To avoid confusing readers, we explain here with very short examples what can be done with the blacklist.

5.1 A simple example

A fraction of an old blacklist (old) looks as follows:

     3ELC  1 3
ELBX3 1 333
N503US 2 00030
UAL... 2 00030
024 3 33000000 033333
0// 3 33000000 033333
46527 4 33300
ERES 5 000003
08221 6 0330
201 7 33300000 00333

 

When converted with blcomp -o old new.b, we get a new file new.b. Its local constant declaration section looks as follows:

!
! Written by an automatic conversion program, version 3
!
!
! File converted from the file "old"
!

! FAILCODE :
const monthly = 1;
const constant = 2;
const experimental = 3;
const whitelist = 4;

! OBSTYP :
const synop = 1;
const airep = 2;
const satob = 3;
const dribu = 4;
const temp = 5;
const pilot = 6;
const satem = 7;
const paob = 8;
const scatt = 9;

! CODTYP : none

! INSTRM : none

! VARIAB :
const u = 3;
const v = 4;
const z = 1;
const dz = 57;
const rh = 29;
const q = 7;
const pwc = 9;
const rh2m = 58;
const t = 2;
const td = 59;
const t2m = 39;
const td2m = 40;
const ts = 11;
const ptend = 30;
const w = 60;
const ww = 61;
const vv = 62;
const ch = 63;
const cm = 64;
const cl = 65;
const nh = 66;
const nn = 67;
const hshs = 68;
const c = 69;
const ns = 70;
const s = 71;
const e = 72;
const tgtg = 73;
const spsp1 = 74;
const spsp2 = 75;
const rs = 76;
const eses = 77;
const is = 78;
const trtr = 79;
const rr = 80;
const jj = 81;
const vs = 82;
const ds = 83;
const hwhw = 84;
const pwpw = 85;
const dwdw = 86;
const gclg = 87;
const rhlc = 88;
const rhmc = 89;
const rhhc = 90;
const n = 91;
const snra = 92;
const ps = 110;
const dd = 111;
const ff = 112;
const rawbt = 119;
const rawra = 120;
const satcl = 121;
const scatss = 122;
const du = 5;
const dv = 6;
const u10m = 41;
const v10m = 42;
const rhlay = 19;
const auxil = 200;
const cllqw = 123;
const scatdd = 124;
const scatff = 125;

! LSMASK :
const sea = 0;
const land = 1;

! PPCODE :
const psealev = 0;
const pstalev = 1;
const g850hpa = 2;
const g700hpa = 3;
const p500gpm = 4;
const p1000gpm = 5;
const p2000gpm = 6;
const p3000gpm = 7;
const p4000gpm = 8;
const g900hpa = 9;
const g1000hpa = 10;
const g500hpa = 11;

! VERT_CO:
const pressure = 1;
const height = 2;
const tovs_cha = 3;
const scat_cha = 4;

 

The external variable definition section looks as follows:

! External variables (non-special):
external obstyp;
external_CHAR statid;
external codtyp;
external instrm;
external date;
external time;
external lat;
external lon;
external stalt;
external modoro;
external lsmask;
external rad;

! External variables (SPECIAL):
external variab is SPECIAL;
external vert_co is SPECIAL;
external press is SPECIAL;
external press_rl is SPECIAL;
external ppcode is SPECIAL;
external obs_value is SPECIAL;
external obs_departure is SPECIAL;
external modps is SPECIAL;

 

And finally the actual monthly monitoring rules in a new blacklist format:

if ( OBSTYP = synop ) then
if VARIAB in ( z, ps )
and STATID = " 3ELC"
then fail(); endif;

if VARIAB in ( z, ps, u10m, v10m )
and STATID = " ELBX3"
then fail(); endif;

return; endif;

if ( OBSTYP = airep ) then
if (VARIAB = t)
and STATID in ( " N503US", " UAL...")
then fail(); endif;

return; endif;

if ( OBSTYP = satob ) then
if STATID in ( " 0//", " 024")
then fail(); endif;

return; endif;

if ( OBSTYP = dribu ) then
if VARIAB in ( z, ps, u, v )
and STATID = " 46527"
then fail(); endif;

return; endif;

if ( OBSTYP = temp ) then
if (VARIAB = z)
and STATID = " ERES"
then fail(); endif;

return; endif;

if ( OBSTYP = pilot ) then
if VARIAB in ( u, v )
and STATID = " 08221"
then fail(); endif;

return; endif;

if ( OBSTYP = satem ) then
if STATID = " 201"
then fail(); endif;

return; endif;

5.2 A more complex example

The Blacklist compiler will generate quite a compact and readable code from the following excerpt:

 

     ATQM  1 3
ATRK 1 3
ATSR 1 3
C6BB 1 3
C6QK 1 3
AN... 2 33333 50 10
NWA74 2 33333 -90 90 -40 -80
035 3 33000000 033333 -50 50 -50 50 1000 401
104 3 33000000 033333 -50 50 90 -170
20674 5 000003 100 10 11 13
40179 5 033000 05 07
40179 6 0330 05 07

 

The constant definitions do not differ from the previous example. The monthly monitoring rules in the new blacklist format become:

 

if ( OBSTYP = synop ) then
if VARIAB in ( z, ps )
and STATID in ( " ATQM", " ATRK", " ATSR", " C6BB", " C6QK")
then fail(); endif;

return; endif;

if ( OBSTYP = airep ) then
if ( 50 >= PRESS >= 10 )
and STATID = " AN..."
then fail(); endif;

if ( ( LAT < -90 or LAT > 90 ) or ( -80 < LON < -40 ) )
and STATID = " NWA74"
then fail(); endif;

return; endif;

if ( OBSTYP = satob ) then
if ( ( LAT < -50 or LAT > 50 ) or ( -170 < LON < 90 ) )
and STATID = " 104"
then fail(); endif;

if ( ( LAT < -50 or LAT > 50 ) or ( LON < -50 or LON > 50 ) )
and ( 1000 >= PRESS >= 401 )
and STATID = " 035"
then fail(); endif;

return; endif;

if ( OBSTYP = temp ) then
if (VARIAB = z)
and ( 100 >= PRESS >= 10 )
and ( 110000 <= TIME <= 130000 )
and STATID = " 20674"
then fail(); endif;

if VARIAB in ( u, v )
and ( 50000 <= TIME <= 70000 )
and STATID = " 40179"
then fail(); endif;

return; endif;

if ( OBSTYP = pilot ) then
if VARIAB in ( u, v )
and ( 50000 <= TIME <= 70000 )
and STATID = " 40179"
then fail(); endif;

return; endif;
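
The compound range tests in these rules behave like chained comparisons. As a hedged illustration, the TEMP rule for station 20674 can be written in Python as below. The helper pack_hour is hypothetical and simply reproduces how the hours 11 and 13 of the old-format line appear as 110000 and 130000 in the new rule; comparing the station identifier after stripping blanks is an assumption of this sketch, and the check on OBSTYP is omitted for brevity.

```python
Z = 1  # VARIAB keyword code for z, as in the constants of section 5.1


def pack_hour(hour):
    """Pack a whole hour into the HHMMSS integer form (hypothetical helper)."""
    return hour * 10000


def rule_20674(variab, press, time, statid):
    """Return True where the rule for station 20674 would call fail()."""
    return (variab == Z
            and 100 >= press >= 10                       # ( 100 >= PRESS >= 10 )
            and pack_hour(11) <= time <= pack_hour(13)   # ( 110000 <= TIME <= 130000 )
            and statid.strip() == "20674")               # assumed blank-insensitive match


if __name__ == "__main__":
    print(rule_20674(Z, 50.0, 120000, "   20674"))
    print(rule_20674(Z, 50.0, 140000, "   20674"))
```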

 

 

5.3 Adding completely new variable to the system


The current definition of variables can be checked in the IFS source code in obs_preproc/blinit.F90. Adding new variables requires the following steps:

  1. Never remove or redefine existing variables. Doing so would make re-running earlier cases virtually impossible.
  2. Add the new variable in the SQL requests black_rob*.sql. If the new variable is not in hdr or body but in some data-specific tables (e.g. sat, or conv), you need to modify *only* those requests that are relevant for those data and have access to these tables.
  3. Add a variable to the IFS source code in obs_preproc/blinit.F90.
  4. Increase the number of defined variables in obs_preproc/blinit.F90.
  5. Declare the variable as external in the external-file.
  6. Before starting to use the new variable, initialize it properly in obs_preproc/black.F90. If the new variable is not in hdr or body but in some data-specific tables (e.g. sat, or conv):
    • make sure the variable is always initialized, and
    • put some logic in place (e.g. IF (IOBTYP == NSYNOP)...) in order to populate, only when appropriate, the variable with values from the sql.
  7. The new variable can now be added to the blacklist. If keywords are associated with it, declare them in the external-file as well.

 

 
