Data format
VBar-delimited format
The data file consists of score values and their corresponding metadata in ASCII format.
The first line contains a tag setting the version of the file format:
#version=1.0
There is currently only one version of the format: 1.0.
Ensuing lines contain data records. Each record has the following format:
centre | model_id | yyyymm | time | forecast_step | station_id | latitude | longitude | station_elevation | model_orography_elevation | parameter | score | event | sample_size | score_mean_value |
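As an illustration, a minimal Python sketch of reading this layout follows. It assumes that fields are separated by vertical bars with optional surrounding spaces, that a trailing bar may or may not be present, and that the event field is left empty for scores other than ct. The function name parse_vbar and the list name FIELDS are hypothetical and not part of the format definition.

```python
# Field names in the order given by the record layout above.
FIELDS = [
    "centre", "model_id", "yyyymm", "time", "forecast_step", "station_id",
    "latitude", "longitude", "station_elevation",
    "model_orography_elevation", "parameter", "score", "event",
    "sample_size", "score_mean_value",
]


def parse_vbar(text):
    """Parse a VBar-delimited file (given as one string) into a list of dicts."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    # The first line must carry the version tag; only 1.0 exists so far.
    if not lines or lines[0] != "#version=1.0":
        raise ValueError("missing or unsupported #version tag")

    records = []
    for ln in lines[1:]:
        parts = [p.strip() for p in ln.split("|")]
        # Assumption: tolerate an optional trailing delimiter.
        if parts and parts[-1] == "":
            parts.pop()
        if len(parts) != len(FIELDS):
            raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}: {ln!r}")
        rec = dict(zip(FIELDS, parts))
        rec["sample_size"] = int(rec["sample_size"])
        # A contingency table carries 4 comma-delimited values; other scores carry 1.
        rec["score_mean_value"] = [float(v) for v in rec["score_mean_value"].split(",")]
        records.append(rec)
    return records
```

All other fields are kept as strings here; converting latitude, longitude, and the elevations to numbers is left to the caller.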
Record format
The data file consists of score values and their corresponding metadata in ASCII format.
Every score value is described by a full set of key attributes, such as its parameter, station ID, month, and step. The attributes describing one score value at one station are organised into a record, so each record corresponds to one score value. A record is a collection of key=value pairs separated by commas, and it spans one line. A key that is not given in the current record inherits its value from the previous record, except for the value key v, which must be present in every record.
Each record has the following format:
centre=centre, model=model_id, d=yyyymm, t=time, st=station_id, lat=latitude, lon=longitude, se=station_elevation, me=model_orography_elevation, par=parameter, sc=score, ev=event, n=sample_size, v=mean_value |
If a value is not available, either omit the record entirely or set the value to NIL (v=nil). Every record must contain the key v, because its value is not inherited from the previous record.
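The inheritance rule can be sketched in Python as follows. This is a minimal illustration under several assumptions: a comma-separated token without "=" is treated as a continuation of the preceding value (so that the 4 contingency-table values can follow v), nil is matched case-insensitively, and a trailing vertical bar is tolerated. The function name parse_records is hypothetical.

```python
def parse_records(lines):
    """Expand key=value records, inheriting missing keys from the previous record."""
    current = {}   # running state: last seen value of every key
    expanded = []
    for line in lines:
        # Assumption: ignore blank lines and an optional trailing vertical bar.
        line = line.strip().rstrip("|").strip()
        if not line:
            continue

        pairs = {}
        key = None
        for token in line.split(","):
            if "=" in token:
                # Split on the first "=" only, so that ev=val<=2 keeps its operator.
                key, value = token.split("=", 1)
                key = key.strip()
                pairs[key] = value.strip()
            elif key is not None:
                # Assumption: continuation of a comma-delimited value, e.g. v=3,12,14,2.
                pairs[key] += "," + token.strip()
            else:
                raise ValueError(f"token without a key: {token!r}")

        # v is never inherited, so it must appear in every record.
        if "v" not in pairs:
            raise ValueError(f"record has no v key: {line!r}")

        current.update(pairs)
        record = dict(current)
        raw = record["v"]
        record["v"] = None if raw.lower() == "nil" else [float(x) for x in raw.split(",")]
        expanded.append(record)
    return expanded
```

With this approach, a record such as "sc=rmse, v=1.8" that follows a fully specified record picks up the centre, model, date, station, and parameter of its predecessor.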
Parameter keys
- centre (4-character string) is the WMO identifier of the originating centre (ammc, cwao, ecmf, edzw, egrr, kwbc, lfpw, rjtd, rksl, rums, etc.);
- model_id (a string not containing a comma or vertical bar) is a free-form model identifier assigned by the originating centre (to distinguish between potentially different models provided by the centre);
- yyyymm is the month of the mean, where yyyy is the year and mm is the month (01-12);
- time is the validity time (in hours UTC) of the forecasts verified;
- forecast_step is the length of the forecast (in hours);
- station_id (a number) is the WMO ID of the observation station verifying the forecasts;
- latitude is the latitude of the observation station verifying the forecasts;
- longitude is the longitude of the observation station verifying the forecasts;
- station_elevation is the elevation of the observation station above the mean sea level in meters;
- model_orography_elevation is the elevation of the model orography at the observation location;
- parameter is the verified model output parameter:
parameter | description | unit |
---|---|---|
t2m | air temperature at 2 meters above the model orography, corrected to the actual station elevation using the constant lapse rate of 6.5 K per 1000 m | K |
td2m | dewpoint at 2 meters above the model orography | K |
rh2m | relative humidity at 2 meters above the model orography | % |
tp06 | total precipitation accumulated over previous 6 hours | mm |
tp24 | total precipitation accumulated over previous 24 hours | mm |
ff10m | wind speed at 10 meters above the model orography | m/s |
dd10m | wind direction at 10 meters above the model orography | deg |
tcc | total cloud cover | 0-1 (convert to okta for contingency table) |
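Two of the table entries imply a small calculation: the elevation correction for t2m and the conversion of tcc to okta. The sketch below shows one plausible reading of each. It assumes that temperature decreases with height at the stated lapse rate, so a station below the model orography receives a warmer corrected value, and that okta are obtained by rounding the cloud fraction times 8 to the nearest integer. The function names are hypothetical, and operational conventions for okta rounding may differ.

```python
LAPSE_RATE = 6.5 / 1000.0  # K per metre, the constant lapse rate from the table


def correct_t2m(t2m_model, model_orography_elevation, station_elevation):
    """Adjust model 2 m temperature (K) from model orography height to station height.

    Assumption: temperature falls by LAPSE_RATE per metre of ascent, so moving
    down from the model orography to a lower station adds to the temperature.
    """
    return t2m_model + LAPSE_RATE * (model_orography_elevation - station_elevation)


def tcc_to_okta(tcc):
    """Convert total cloud cover given as a fraction (0-1) to okta (0-8).

    Assumption: round half up to the nearest whole okta.
    """
    if not 0.0 <= tcc <= 1.0:
        raise ValueError("tcc must lie between 0 and 1")
    return int(tcc * 8 + 0.5)
```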
- score is the name of the verification score or statistic:
score | description |
---|---|
me | mean error (bias) |
mae | mean absolute error |
rmse | root mean square error |
ct | contingency table; the 4 values are misses, hits, correct non-events, and false alarms (in this order) |
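For reference, the three continuous scores can be computed from paired forecasts and observations as in the following sketch. It assumes the usual convention that the error is forecast minus observation; the function name continuous_scores is hypothetical. The sketch does not handle the wrap-around of wind direction (dd10m), which would need an angular difference instead of a plain subtraction.

```python
import math


def continuous_scores(forecasts, observations):
    """Return me, mae, rmse and the sample size for paired forecast/observation values."""
    if len(forecasts) != len(observations) or not forecasts:
        raise ValueError("need two non-empty sequences of equal length")

    # Assumption: error is defined as forecast minus observation.
    errors = [f - o for f, o in zip(forecasts, observations)]
    n = len(errors)
    return {
        "me": sum(errors) / n,                              # mean error (bias)
        "mae": sum(abs(e) for e in errors) / n,             # mean absolute error
        "rmse": math.sqrt(sum(e * e for e in errors) / n),  # root mean square error
        "n": n,                                             # sample size
    }
```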
- event is the name of the event (used for contingency tables):
event | description |
---|---|
val>thr | forecast/observed value greater than a threshold value of the forecast parameter, e.g. for wind speed >15 m/s: val>15 |
val<=thr | forecast/observed value smaller than or equal to a threshold value of the forecast parameter, e.g. for cloudiness of 0-2 okta: val<=2 |
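The event names and the value order of the contingency table can be tied together as in the sketch below. It assumes that the same threshold test is applied to both the forecast and the observed value, and that the threshold is written as a plain decimal number. The function names parse_event and contingency_table are hypothetical.

```python
import re


def parse_event(event):
    """Turn an event name such as 'val>15' or 'val<=2' into a predicate on a value."""
    match = re.fullmatch(r"val(>|<=)(-?\d+(?:\.\d+)?)", event.strip())
    if match is None:
        raise ValueError(f"unrecognised event: {event!r}")
    operator, threshold = match.group(1), float(match.group(2))
    if operator == ">":
        return lambda value: value > threshold
    return lambda value: value <= threshold


def contingency_table(forecasts, observations, event):
    """Count [misses, hits, correct non-events, false alarms], in that order."""
    occurs = parse_event(event)
    misses = hits = correct_non_events = false_alarms = 0
    for forecast, observation in zip(forecasts, observations):
        forecast_yes = occurs(forecast)
        observed_yes = occurs(observation)
        if observed_yes and not forecast_yes:
            misses += 1
        elif observed_yes and forecast_yes:
            hits += 1
        elif not observed_yes and not forecast_yes:
            correct_non_events += 1
        else:
            false_alarms += 1
    return [misses, hits, correct_non_events, false_alarms]
```

For cloud cover, the tcc values would first be converted to okta before an event such as val<=2 is applied.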
- sample_size is the number of observations used to compute the monthly mean at the given station;
- score_mean_value is the mean value of the score; for a contingency table, it is the 4 values delimited by commas.