Contributors: Else van den Besselaar (KNMI)

Issued by: Else van den Besselaar (KNMI)

Issued Date: 16/05/2022

Acronyms

Acronym

Description

ECA&D

European Climate Assessment & Dataset (https://www.ecad.eu)

E-OBS

Gridded dataset based on the station time series from ECA&D

EEA

European Environment Agency

EUMETNET

Grouping of European National Meteorological Services.

NMHS

National Meteorological and Hydrological Service

RA

Regional Association

WMO

World Meteorological Organization

TN

Daily minimum temperature

TG

Daily mean temperature

TX

Daily maximum temperature

RR

Daily precipitation amount

Introduction

Executive Summary

Indices of extremes are derived indicators that can be used to monitor the climate. Examples of these indices are the number of frost days in winter or the annual number of rainy days. Some indices have a fixed threshold and are therefore only useful in certain areas or at certain times, such as the number of ice days, which counts the days when the daily maximum temperature is below 0°C. Other indices are calculated with respect to the local climate, such as the number of very heavy precipitation days, calculated as the number of days when the daily precipitation is higher than the 95th percentile for a reference period at that specific location. A large number of indices use only temperature or only precipitation as input, while a few others require more than one variable as input. The indices dataset presented here uses the E-OBS gridded dataset as the input from which the indices are derived.
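As a minimal sketch of the two kinds of index described above, the following Python fragment counts days against a fixed threshold and against a percentile of a reference sample. It assumes daily values for a single grid box held in NumPy arrays. The function names and the synthetic values are illustrative only and are not part of the dataset or its production code. The percentile is taken over the whole reference sample, which is a simplification of the procedure used for the actual indices.

```python
import numpy as np


def count_ice_days(tx):
    """Count days with daily maximum temperature below 0 degC (fixed threshold)."""
    tx = np.asarray(tx, dtype=float)
    return int(np.sum(tx < 0.0))


def count_days_above_percentile(rr, rr_reference, q=95.0):
    """Count days with precipitation above the q-th percentile of a reference sample.

    Assumption: one percentile for the whole reference sample at this location.
    """
    threshold = np.percentile(np.asarray(rr_reference, dtype=float), q)
    return int(np.sum(np.asarray(rr, dtype=float) > threshold))


# Synthetic daily maximum temperatures (degC), for illustration only.
tx_winter = np.array([-3.2, -0.1, 0.0, 1.5, -4.0, 2.3])
print(count_ice_days(tx_winter))  # 3
```

The fixed-threshold count is only informative where sub-zero maxima occur, whereas the percentile-based count adapts to the local reference climate.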

Scope of Documentation

This Product User Guide describes the indices dataset derived from E-OBS. Background information is given on why certain choices have been made and how the dataset might be affected by that. The E-OBS dataset is available from the CDS: https://cds.climate.copernicus.eu/cdsapp#!/dataset/insitu-gridded-observations-europe?tab=overview

Version History

The version of the indices based on E-OBSv23.1e is the first version available from the CDS. Differences between subsequent E-OBS versions, due to increases in the number of stations and the density of the network, will propagate into this indices dataset as well. The number of indices is too large to give a complete overview of what has changed between the indices versions, but comparisons between subsequent E-OBS versions are available. The PUG for E-OBS will give general information on what changes might be expected for the indices with a new version. However, the main change between versions of the indices datasets will usually be the length of the dataset.

Data access information

Product Description

Product Target Requirements

The indices dataset will be updated once per year using E-OBS versions that cover complete calendar years.

Product Overview

Data Description

Figure 1: Anomaly in the number of ice days (days with maximum temperature below 0°C) in winter 2020 with respect to the corresponding climatology for 1981-2010. For this 'best-estimate', the ensemble median derived from 20 ensemble members is used. This figure features in the European State of the Climate 2020 report (https://climate.copernicus.eu/esotc/2020).

Table 1: Overview of key characteristics of the indices derived from E-OBS

Data Description


Dataset title

Indices derived from E-OBS

Data type

Indicators derived from gridded observations

Topic category

Climate Monitoring

Sector

Applicable to various sectors

Keyword

Climate indices

Dataset language

eng

Domain

Europe

Horizontal resolution

0.1° x 0.1°

Temporal coverage

1950-01-01 to 2020-12-31

Temporal resolution

Monthly / seasonal / half-yearly / annual / daily (depending on the index)

Update frequency

Annual

Version

v23.1e (based on E-OBSv23.1e)

Provider

Royal Netherlands Meteorological Institute (KNMI)

Terms of Use

The E-OBS-based Climate Indices as a derived dataset can be provided under the Copernicus licence.

Variable Description

Table 2: Overview and description of variables.

Variables

Long Name

Short Name

Unit

Type

Description

Number of Frost Days

FD

Days

Cold

The number of days in a given period when the daily minimum temperature is less than 0°C.

Maximum number of consecutive frost days

CFD

Days

Cold

The largest number of consecutive days in a given period when the daily minimum temperature is less than 0°C.

Number of Ice Days

ID

Days

Cold

The number of days in a given period when the daily maximum temperature is less than 0°C.

Heating degree days

HDD

°C

Cold

The heating degree days index accumulates the temperature difference between daily mean temperature TG and the threshold of 17°C (17°C−TG), for the days when the daily mean temperature drops below 17°C, for a given period.

Growing season length

GSL

Days

Cold

The number of days between the first occurrence of at least 6 consecutive days when the daily mean temperature exceeds 5°C and the first occurrence after 1 July of at least 6 consecutive days when the daily mean temperature drops below 5°C.

Minimum value of daily maximum temperature

TXn

°C

Cold

The minimum value of the daily maximum temperatures in a given period.

Minimum value of daily minimum temperature

TNn

°C

Cold

The minimum value of the daily minimum temperatures in a given period.

Percentage of cold nights

TN10p

%

Cold

The percentage of days in a given period when the daily minimum temperature is less than the 10th percentile of daily minimum temperatures for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990 (see Zhang et al. 2005 for further details).


Percentage of cold day-times

TX10p

%

Cold

The percentage of days in a given period when the daily maximum temperature is less than the 10th percentile of daily maximum temperatures for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990 (see Zhang et al. 2005 for further details).

Cold-spell duration index

CSDI*

Days

Cold

The largest number of consecutive days in a given period when the daily minimum temperature is less than the 10th percentile of daily minimum temperatures for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990, in intervals of at least six consecutive days.

Number of Summer days

SU

Days

Heat

The number of days in a given period when the daily maximum temperature is higher than 25°C.

Maximum number of consecutive summer days

CSU

Days

Heat

The largest number of consecutive days in a given period when the daily maximum temperature is higher than 25°C.

Number of Tropical nights

TR

Days

Heat

The number of days in a given period when the daily minimum temperature is higher than 20°C.

Maximum value of daily maximum temperature

TXx

°C

Heat

The maximum value of the daily maximum temperatures in a given period.

Maximum value of daily minimum temperature

TNx

°C

Heat

The maximum value of the daily minimum temperatures in a given period.

Percentage of warm nights

TN90p

%

Heat

The percentage of days in a given period when the daily minimum temperature is higher than the 90th percentile of daily minimum temperatures for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990 (see Zhang et al. 2005 for further details).


Percentage of warm day-times

TX90p

%

Heat

The percentage of days in a given period when the daily maximum temperature is higher than the 90th percentile of daily maximum temperatures for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990 (see Zhang et al. 2005 for further details).

Warm-spell duration index

WSDI*

Days

Heat

The largest number of consecutive days in a given period when the daily maximum temperature is higher than the 90th percentile of daily maximum temperatures for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990, in intervals of at least six consecutive days.

Mean of diurnal temperature range

DTR

°C

Multi

The mean value in a given period of the difference between the daily maximum temperature and daily minimum temperature (daily maximum - daily minimum).

Maximum number of consecutive dry days (daily precipitation < 1 mm)

CDD*

Days

Drought

The largest number of consecutive days in a given period when the daily precipitation is less than 1 mm.

Huglin Index for grape growth suitability

HI

[-]

Multi

The Huglin index accumulates the mean of the temperature difference between the daily mean temperature (TG) and the threshold of 10°C (TG - 10°C), and the temperature difference between the daily maximum temperature (TX) and the threshold of 10°C (TX - 10°C), for the days when the daily mean temperature exceeds 10°C. The accumulation is multiplied by a latitudinal coefficient for day length, K. The period of accumulation is 1 April to 30 September.

The value of K is determined using:

Latitude     K
≤ 40°N       1.00
40°N-42°N    1.02
42°N-44°N    1.03
44°N-46°N    1.04
46°N-48°N    1.05
48°N-50°N    1.06
>50°N        1.00

Each specific grape variety has a specific range of Huglin Index values for which it thrives (https://en.wikipedia.org/wiki/Huglin_index).

3-Month Standardized Precipitation Index

SPI-3

[-]

Drought

SPI is a probability index based on precipitation. It is designed to be a spatially invariant indicator of drought. SPI-3 refers to precipitation in the previous 3-month period (positive values indicate a wet period; negative values indicate a dry period).
For details including the algorithm, see: Guttman, N.B. (1999).

6-Month Standardized Precipitation Index

SPI-6

[-]

Drought

SPI is a probability index based on precipitation. It is designed to be a spatially invariant indicator of drought. SPI-6 refers to precipitation in the previous 6-month period (positive values indicate a wet period; negative values indicate a dry period).
For details including the algorithm, see: Guttman, N.B. (1999).

Highest 1-day precipitation amount

RX1day

mm

Rain

The maximum value of the one-day precipitation amount in a given period.

Highest 5-day precipitation amount

RX5day

mm

Rain

The maximum value of the consecutive five-day precipitation amount in a given period.

Simple daily intensity index

SDII

mm/day 

Rain

The mean value of daily precipitation amount on wet days (when the daily precipitation is equal to or larger than 1 mm) in a given period.

Number of Wet days 

R1mm

Days

Rain

The number of days in a given period when the daily precipitation amount is equal to or larger than 1 mm.

Number of Heavy precipitation days

R10mm

Days

Rain

The number of days in a given period when the daily precipitation amount is equal to or larger than 10 mm.

Number of Very heavy precipitation days 

R20mm

Days

Rain

The number of days in a given period when the daily precipitation amount is equal to or larger than 20 mm.

Maximum number of consecutive wet days

CWD*

Days

Rain

The largest number of consecutive days in a given period when the daily precipitation amount is equal to or larger than 1 mm.

Precipitation fraction due to moderate wet days

R75pFRAC

%

Rain

The percentage of days in a given period when the daily precipitation amount is higher than the 75th percentile of the daily precipitation amounts for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990.

Precipitation fraction due to very wet days 

R95pFRAC

%

Rain

The percentage of days in a given period when the daily precipitation amount is higher than the 95th percentile of the daily precipitation amounts for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990.

Precipitation fraction due to extremely wet days

R99pFRAC

%

Rain

The percentage of days in a given period when the daily precipitation amount is higher than the 99th percentile of the daily precipitation amounts for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990.

Precipitation total due to moderate wet days 

R75pTOT

mm

Rain

The accumulated daily precipitation amount in a given period for the days when the daily precipitation amount is higher than the 75th percentile of the daily precipitation amounts for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990.

Precipitation total due to very wet days

R95pTOT

mm

Rain

The accumulated daily precipitation amount in a given period for the days when the daily precipitation amount is higher than the 95th percentile of the daily precipitation amounts for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990.

Precipitation total due to extremely wet days 

R99pTOT

mm

Rain

The accumulated daily precipitation amount in a given period for the days when the daily precipitation amount is higher than the 99th percentile of the daily precipitation amounts for the 5-day windows centred on each calendar day in the corresponding climatological period for 1961-1990.

Total precipitation on wet days 

PRCPTOT

mm

Rain

The accumulated daily precipitation amount on wet days (when the daily precipitation amount is equal to or larger than 1 mm) in a given period.

Mean of Reference EvapoTranspiration - Makkink

PET-MK

mm/day

Multi

The method used for Makkink Reference Evapotranspiration is a simplification of the more comprehensive Penman-Monteith parameterization and recognizes that evapotranspiration is determined primarily by the radiation and the ambient air temperature. A further simplification is that Global Radiation Q is used rather than the net radiation which is used in more comprehensive formulas. The Makkink formulation of reference evapotranspiration uses the slope of the vapour pressure-temperature relationship, the psychrometric constant and an empirical coefficient to calculate the reference evapotranspiration. Here the empirical coefficient has the value 0.65. See de Bruin (1987).


Note: The terms 'reference evapotranspiration' and 'potential evapotranspiration' are often used as synonyms. However, to be exact, the reference evapotranspiration is that from a grass surface that is well-watered and the potential evapotranspiration is that from a surface that has unlimited water (such as a lake). Despite the difference in terminology, the abbreviation PET is used for this product.

Mean of Reference EvapoTranspiration - Penman-Monteith

PET-PM

mm/day

Multi

The method of Penman-Monteith Reference Evapotranspiration is the comprehensive Penman-Monteith parameterization which includes not only the radiation and the ambient air temperature, but also recognizes that the humidity of the air and the wind speed are factors in the evaporation of moisture from the soil. The Penman-Monteith method is generally regarded as the most realistic and physically comprehensive parameterization for reference evapotranspiration and is widely used in hydrology and agriculture. This parameterization uses the slope of the vapour pressure-temperature relationship, the psychrometric constant, the density of air, the specific heat of air at constant pressure and bulk surface and aerodynamic resistances in its description. Further input is net radiation at the surface, calculated as the sum of net short wave radiation and net long wave radiation and the soil heat flux.

The reference is Allen et al. (1994).

Note: The terms 'reference evapotranspiration' and 'potential evapotranspiration' are often used as synonyms. However, to be exact, the reference evapotranspiration is that from a grass surface that is well-watered and the potential evapotranspiration is that from a surface that has unlimited water (such as a lake). Despite the difference in terminology, the abbreviation PET is used for this product.

self-calibrating Palmer Drought Severity Index

scPDSI

[-]

Multi

The Palmer Drought Severity Index (PDSI) is a measure of regional moisture availability that has been used extensively to study droughts and wet spells in the contiguous USA. The computation of the index involves a classification of relative moisture conditions within 11 categories as defined by Palmer (1965), ranging from extremely dry with PDSI ≤ −4, to extremely wet with PDSI values ≥ 4. The index is based on water supply and demand which is calculated using a water-budget system based on historic records of precipitation and temperature and the soil characteristics of the site being considered.

The self-calibrating PDSI (scPDSI) as put forward by Wells et al. (2004) is more appropriate for geographical comparison of climates of diverse regions. Wells et al. (2004) improve the performance of the PDSI by automating the calculations Palmer made when he derived the empirical constants used in the PDSI algorithm.

Universal Thermal Climate Index

UTCI

°C

Multi

The Universal Thermal Climate Index (UTCI), introduced by Jendritzky et al. (2012), is based on a multi-node model of human thermoregulation and can be viewed as an equivalent temperature. For any combination of air temperature, wind, radiation, and humidity (stress), UTCI is defined as the isothermal air temperature of the reference condition that would elicit the same dynamic response (strain) of the physiological model.


*For the calculation of spells (CSDI, WSDI, CDD, CWD), the spells are cut off at the end of the calendar year. This might interfere with applications where the continuation of a spell into the next calendar year is relevant. For these applications, additional datasets are provided with the names: altCSDI, altWSI, altCDD, altCWD.
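The effect of cutting spells at the end of the calendar year can be illustrated with a short Python sketch for consecutive dry days (daily precipitation below 1 mm). The helper names and the ten synthetic daily values are hypothetical. The uncut run is shown only for contrast, since how the alternative datasets assign a spell that crosses the year boundary is not specified here.

```python
import numpy as np


def longest_run(flags):
    """Length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def cdd_per_year(rr, years, threshold=1.0):
    """Longest dry spell (rr < threshold) per calendar year.

    Spells are cut at the end of each calendar year, as described in the footnote.
    """
    dry = np.asarray(rr, dtype=float) < threshold
    years = np.asarray(years)
    return {int(y): longest_run(dry[years == y]) for y in np.unique(years)}


# Synthetic example: five days at the end of one year, five at the start of the next.
years = np.array([2000] * 5 + [2001] * 5)
rr = np.array([5.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 3.0, 0.0, 0.0])

print(cdd_per_year(rr, years))          # {2000: 4, 2001: 2}
print(longest_run(rr < 1.0))            # 6, the same spell when it is not cut
```

In this example the six-day dry spell is split into a four-day and a two-day spell by the year boundary, which is the situation the alternative datasets are meant to address.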

Input Data

The input data for the indices described here is the E-OBS dataset. E-OBS is a daily gridded land-only observational dataset over Europe. The blended time series from the station network of the European Climate Assessment & Dataset (ECA&D) form the basis for the E-OBS gridded dataset. All station data are sourced directly from the European National Meteorological and Hydrological Services (NMHSs) or other data holding institutions. For a considerable number of countries, the stations used are those of the complete national networks, which are much denser than the station networks that are routinely shared among NMHSs (which are the basis of other gridded datasets). The density of stations increases gradually over time, from the lowest numbers in the 1950s to much higher numbers in more recent periods. The meteorological station dataset which is the basis of the E-OBS dataset is not static: through collaborations with NMHSs, new stations are sometimes added with new releases of E-OBS, for both the historical period and the most recent period.

Initially, in 2008, this gridded dataset was developed to provide validation for the suite of Europe-wide climate model simulations produced as part of the European Union ENSEMBLES project. While E-OBS remains an important dataset for model validation, it is also used more generally for monitoring the climate across Europe, particularly with regard to the assessment of the magnitude and frequency of daily extremes.

The position of E-OBS is unique in Europe because of the relatively fine spatial horizontal grid spacing, the daily resolution of the dataset, the provision of multiple variables and the length of the dataset. Finally, the station data on which E-OBS is based are available through the ECA&D webpages (where the owner of the data has given permission to do so). In these respects, it contrasts well with other datasets.

The temporal resolution of the dataset is daily, meaning the observations cover 24 hours per time step. The exact 24-hour period can differ per region, because some data providers measure from midnight to midnight while others measure from morning to morning. Since E-OBS is an observational dataset, no attempts have been made to adjust time series for this 24-hour offset. However, where the measurement period is known, care is taken that the largest part of the measured 24-hour period corresponds to the day attached to the time step in E-OBS (and ECA&D).

Methodology and uncertainty estimate

The best-estimate for the various climate impact indices, like the E-OBS-based temperature and precipitation indices, is given as the ensemble median of the dataset which is constructed by calculating the indices using all of the ensemble members of the E-OBS dataset. Where users require a single measure of the E-OBS-based indices, then this 'best-estimate' value should be used. Nevertheless, the general recommendation relating to uncertainty is that users ought to consult the uncertainty information as this can be considerable and will change in space and time. The uncertainty is ultimately determined by the station coverage which varies across the domain and in time.

To address uncertainty in the temperature and precipitation indices, all 20 E-OBS ensemble members are used to calculate an ensemble of indices, propagating the spread in the E-OBS ensemble to a spread in the indices. The size of this spread is captured by the values for the 2.5 and 97.5 percentiles of the indices ensemble. Although the full E-OBS ensemble will be made available, the full ensemble of the indices is not made available.
The spatial maps of the 2.5 and 97.5 percentiles of the indices ensemble and how these vary in time for the various aggregation levels (annual, half-yearly, seasonal, monthly) are made available as two variables in a netcdf file. The ensemble median is available in a separate file.
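The propagation of the ensemble spread described above can be sketched in Python as follows: the index is computed separately for every ensemble member, and the resulting index ensemble is then summarised by its median and its 2.5th and 97.5th percentiles. The array layout (member, day, lat, lon), the function names and the synthetic precipitation values are assumptions made for this illustration and do not describe the production code.

```python
import numpy as np


def r1mm_per_member(rr_ensemble):
    """Count wet days (>= 1 mm) separately for each ensemble member.

    rr_ensemble has shape (member, day, lat, lon); the result has shape
    (member, lat, lon).
    """
    return (np.asarray(rr_ensemble, dtype=float) >= 1.0).sum(axis=1)


def summarise_ensemble(index_ensemble):
    """Return the median, 2.5th and 97.5th percentile over the member axis."""
    index_ensemble = np.asarray(index_ensemble, dtype=float)
    median = np.median(index_ensemble, axis=0)
    p025, p975 = np.percentile(index_ensemble, [2.5, 97.5], axis=0)
    return median, p025, p975


# Synthetic 20-member ensemble of daily precipitation for a 30-day month
# on a 2 x 3 grid (illustrative values only).
rng = np.random.default_rng(42)
rr = rng.gamma(shape=0.5, scale=4.0, size=(20, 30, 2, 3))

median, p025, p975 = summarise_ensemble(r1mm_per_member(rr))
print(median.shape)  # (2, 3)
```

Computing the index per member before taking percentiles matters: the percentiles of the index ensemble are in general not the same as the index computed from percentiles of the daily input fields.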

For indices that use multiple elements as input (e.g. Potential EvapoTranspiration (PET), Universal Thermal Climate Index (UTCI) and self-calibrating Palmer Drought Severity Index (scPDSI)), only the best-estimate is available (daily values for PET and UTCI, monthly values for scPDSI). These indices are derived using the E-OBS ensemble mean rather than the individual ensemble members as propagating the ensemble uncertainty in derived indices based on multiple input streams - each with their own ensemble spread - quickly becomes computationally prohibitive.

An example of the use of the available uncertainty files is given in Figure 2, where a map of the 100-member ensemble-median of the number of rainy days for November 2011 is shown, with a large area in Central Europe showing no rainy days at all, and a time series of the number of rainy days, averaged over the Danube catchment, for the ensemble median (in red) and the ensemble spread quantified by the 2.5th and 97.5th percentiles (in grey). This latter plot clearly shows the low value for November 2011 - even when the spread in the ensemble is taken into account - but it also shows the considerable spread for more typical months or wetter months, like April 2010. Note that the median (the red line in Figure 2) is expected to be nearer the 2.5th percentile value as opposed to the 97.5th percentile value. This happens more for precipitation-based indices as the data are much more skewed than for temperature.

Figure 2: Map showing the ensemble median of the R1mm index (number of rainy days) for November 2011 (a) and a time series of the number of rainy days averaged over the Danube basin (b). The latter plot shows the ensemble median value (red line) and the spread in the ensemble (grey shading) as quantified by the 2.5th and 97.5th percentile values as provided in the netcdf files. These figures are based on the 100-ensemble member version of E-OBSv18.0e.



Why not use all E-OBS ensemble members?

The range of uncertainty in earlier versions of the E-OBS derived indices (version 19.0e and earlier) was described by a 100-member ensemble. However, experiments showed that the uncertainty in the derived indices saturates at a much lower number of ensemble members, so that fewer members are enough to reliably estimate the uncertainty in the indices of extremes. From these experiments, it was decided that a 20-member ensemble is sufficient to span the uncertainty and from then on, only 20 ensemble members were produced for the E-OBS indices. As these experiments were performed before the E-OBS indices were available through the CDS, the text and figures of this Product User Guide that explain why only 20 ensemble members were chosen are based on an older E-OBS indices version instead of the latest version. Note that from E-OBSv24.0e onwards, the E-OBS dataset itself has been created with only 20 ensemble members instead of the earlier 100 ensemble members.

In the following subsection, the motivation for moving to the use of a smaller 20-member ensemble for the indices is documented.
In order to assess how the spread in the E-OBS ensemble propagates to a spread in derived data, climate impact indicator data from 17 grid boxes across Europe are selected. For temperature, the six indices looked at are:

  • TNn (minimum value of the daily minimum temperature)
  • TXx (maximum value of the daily maximum temperature)
  • TN10p (cold nights, percentage of days with daily minimum temperature below the 10th percentile of daily minimum temperatures)
  • TX90p (warm day-times, percentage of days with daily maximum temperature above the 90th percentile of daily maximum temperatures)
  • WSDI (warm spell duration index)
  • CSDI (cold spell duration index)

For precipitation the three indices looked at are:

  • R95pTOT (precipitation fraction due to very wet days (exceeding the 95th percentile))
  • RX5day (maximum 5-day accumulated precipitation sum)
  • CDD (Consecutive Dry Days)

The motivation to look into these indices is that they are expected to show maximum spread, given a spread in the input temperature or precipitation data, because of their focus on the extreme end of possible events.

Spread in temperature indices

Figure 3 shows the 2.5th and the 97.5th percentiles for annual TN10p, for the grid boxes in the E-OBS dataset nearest to Munich (Germany) and Kiev (Ukraine). Together, these two percentiles capture 95% of the spread in TN10p and so provide a measure of that spread. It can be seen that this spread is about the same whichever set of 20 E-OBS ensemble members is used. This illustrates the rapid saturation of the spread with ensemble size. Figure 3 also shows that the spread in TN10p is modest (as we will see later with examples showing TX90p as well). The relatively narrow spread is related to the use of 10% of the values in the temperature distribution, which is apparently not sufficiently extreme to maximize the spread.

Figure 3: The 2.5th (a & b) and 97.5th (c & d) percentiles for the spread in annual TN10p (percentage of cold nights) for the grid boxes in the E-OBS dataset nearest to Munich (Germany) (a & c) and Kiev (Ukraine) (b & d). The colour coding relates to the use of five different subsets of 20 members from the 100-member E-OBS ensemble.


The spread in the ensemble is larger using the TNn index, showing the spread in estimates of the coldest night in the year (Figure 4). Such a relatively large spread in the ensemble is observed for TXx (warmest daily temperature of the year) for grid boxes close to Perugia (Italy) and Brno (Czech Republic) as well (Figure 5). There appears to be slightly more diversity between the 20-member sets than for those in Figure 3. Figure 6 shows the 2.5th (top) and 97.5th (bottom) percentiles for TX90p (percentage of warm day-times) for the grid boxes in the E-OBS dataset nearest to Perugia (Italy) (left) and Brno (Czech Republic) (right).

Similarly to the plots for the coldest nights (Figure 3), the spread is modest.

The Warm Spell Duration Index (and the Cold Spell Duration Index - not shown), as shown in Figure 7, has more spread. There are, as is to be expected, quite a few years having no warm or cold spells.

Figure 4: The 2.5th (a & b) and 97.5th (c & d) percentiles for the spread in annual TNn (minimum of the daily minimum temperature - coldest night) for the grid boxes in the E-OBS dataset nearest to Munich (Germany) (a & c) and Kiev (Ukraine) (b & d). The colour coding relates to the use of five different subsets of 20 members from the 100-member E-OBS ensemble.

Figure 5: The 2.5th (a & b) and 97.5th (c & d) percentiles for the spread in annual TXx (maximum of the daily maximum temperature - warmest day-time) for the grid boxes in the E-OBS dataset nearest to Perugia (Italy) (a & c) and Brno (Czech Republic) (b & d). The colour coding relates to the use of five different subsets of 20 members from the 100-member E-OBS ensemble.

Figure 6: The 2.5th (a & b) and 97.5th (c & d) percentiles for the spread in annual TX90p (percentage of warm day-times) for the grid boxes in the E-OBS dataset nearest to Perugia (Italy) (a & c) and Brno (Czech Republic) (b & d). The colour coding relates to the use of five different subsets of 20 members from the 100-member E-OBS ensemble.

Figure 7: The 2.5th (a & b) and 97.5th (c & d) percentiles for the spread in annual WSDI (Warm Spell Duration Index) for the grid boxes in the E-OBS dataset nearest to Munich (Germany) (a & c) and Brno (Czech Republic) (b & d). The colour coding relates to the use of five different subsets of 20 members from the 100-member E-OBS ensemble.

Spread in precipitation indices

Figure 8 shows the 2.5th percentile and the 97.5th percentile for annual R95pTOT (precipitation fraction due to very wet days (exceeding 95th percentile)) for the grid boxes in the E-OBS dataset nearest to the city centre of Linköping (Sweden) and Porto (Portugal). This figure shows that for precipitation, as for temperature, a rapid saturation of the spread with ensemble size is present.

 

Figure 8: The 2.5th (a & b) and 97.5th (c & d) percentiles for the spread in annual R95pTOT (precipitation amount related to very heavy precipitation days) for the grid boxes in the E-OBS dataset nearest to Linköping (Sweden) (a & c) and Porto (Portugal) (b & d). The colour coding relates to the use of five different subsets of 20 members from the 100-member E-OBS ensemble.


Figure 9, which shows RX5day (annual maximum of the 5-day accumulated precipitation amount), shows a similarly rapid saturation of the spread with the increase of the number of ensemble members. This figure, as well as Figure 8, shows a strong correlation between the 2.5th and 97.5th 'trends' for R95pTOT and RX5day.

Figure 9: The 2.5th (a & b) and 97.5th (c & d) percentiles for the spread in annual RX5day (5-day precipitation amount) for the grid boxes in the E-OBS dataset nearest to Linköping (Sweden) (a & c) and Porto (Portugal) (b & d). The colour coding relates to the use of five different subsets of 20 members from the 100-member E-OBS ensemble.

Figure 10 shows CDD (Consecutive Dry Days) for the grid boxes close to Norwich (UK) and Murcia (Spain). CDD has hardly any spread for the 2.5th percentile but this is to be expected. Assessments of the annual maximum length of a dry spell will vary strongly when for a day in this dry spell an ensemble member indicates rainfall. This will make estimates of the length of the longest dry spells vary more wildly, whereas the estimates of the annual maximum length at the low-end of the spectrum will consist of those days for which all members show an absence of rainfall.

 


Figure 10: The 2.5th (a & b) and 97.5th (c & d) percentiles for the spread in annual CDD (consecutive dry days) for the grid boxes in the E-OBS dataset nearest to Norwich (United Kingdom) (a & c) and Murcia (Spain) (b & d). The colour coding relates to the use of five different subsets of 20 members from the 100-member E-OBS ensemble.

In order to look a little deeper into the loss of information when a 100-member ensemble is replaced by a 20-member ensemble set, Figure 11a shows the median values of R95pTOT for the various sets of 20-member ensembles and (in grey) the median value based on the full 100-member ensemble, for the period 1990-2018. Clearly, the correlation between the medians is large, except for a few years when variations between the 20-member sets and the 100-member ensemble can be substantial.

The ratio between the median of the 20-member sets and the median of the full 100-member set (Figure 11b) shows that for some years a 20-member median value may underestimate the 100-member median by as much as 20%. A similar analysis is made for the uncertainty in the gridding. Figures 11c and 11d show the ratio between the 2.5th (c) and 97.5th (d) percentiles based on the various 20-member ensembles and those percentiles based on the full 100-member ensemble. For the 2.5th percentile, this ratio can be as large as 1.5. For the upper percentile, this ratio is smaller, as can be expected.


Figure 11: The R95pTOT medians of the various 20-member ensembles and the full 100-member ensemble set (in grey) (a & b). (a) shows the medians for the 1990-2018 period and (b) shows the ratios of the medians between the 20-member sets and the full 100-member set. (c & d) show similar ratios, but for the 2.5th percentile (c) and the 97.5th percentile (d). This figure relates to a grid box close to Munich.
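The comparison behind Figure 11 can be sketched as follows. This is a minimal illustration on synthetic data, not the code used to produce the figure: the array `index`, the gamma-distributed values and the choice of five disjoint subsets are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for one grid box: 29 years (1990-2018) x 100 members.
# Real values would come from an index computed on each E-OBS member.
n_years, n_members = 29, 100
index = rng.gamma(shape=4.0, scale=50.0, size=(n_years, n_members))


def ensemble_summary(values):
    """Return the 2.5th, 50th and 97.5th percentiles over the member axis."""
    return np.percentile(values, [2.5, 50.0, 97.5], axis=1)


full = ensemble_summary(index)  # shape (3, n_years)

# Assumption: five disjoint 20-member subsets drawn at random from the 100.
subsets = rng.permutation(n_members).reshape(5, 20)

# ratios[k, p, y]: subset k, percentile p, year y, relative to all 100 members.
ratios = np.stack([ensemble_summary(index[:, s]) / full for s in subsets])

print("largest median ratio:", ratios[:, 1, :].max())
print("smallest median ratio:", ratios[:, 1, :].min())
```

Ratios far from 1 in individual years correspond to the years in which a 20-member subset departs substantially from the full ensemble.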

Using indices for trend analysis

A strong argument for using climate indices is their ability to reveal changes in climate, which makes them a means of climate monitoring. Inspecting the trends in these indices is part of this monitoring. Calculating trends from the ensemble median of the indices (which is provided) is a good starting point. However, calculating trends in the 2.5th and 97.5th percentiles does not fully capture the uncertainty in the trend that arises from the uncertainty in the gridding of the underlying temperature or precipitation fields. Nevertheless, in the absence of an assessment based on calculating trends for each ensemble member of the climate indices, trends in these percentiles (which are provided) should give an estimate of the uncertainty of the trends. This consideration may be an argument for providing full ensembles of the indices in the future.

However, the practical consequences of having to analyse an ensemble of trends, each with its own uncertainty, are large, since these different types of uncertainty need to be combined. For this type of problem, a Bayesian approach may be useful, as the uncertain output of one model is the input to the next. Bayesian inference expresses uncertainty with posterior distributions (not p-values), and the many distributions from the complete ensemble can be combined into one. The keyword is 'Bayesian predictive distribution', sometimes simply 'predictive distribution', an elegant concept for translating uncertainty from one model into another (Hoff, 2009). The remaining challenge is then to conduct the trend analysis in a Bayesian framework; Hoff (2009, Chapter 5) shows how to do linear regression in this way.
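As a rough illustration of the percentile-based estimate described above, the sketch below fits ordinary least-squares trends to the median and to the 2.5th and 97.5th percentile series of an index at one grid box. The series are synthetic and the variable names are hypothetical; as noted in the text, the resulting range of slopes is only an indication of the trend uncertainty, not a full assessment.

```python
import numpy as np


def linear_trend(t, series):
    """Least-squares slope of the series (index units per year)."""
    slope, _intercept = np.polyfit(t, series, deg=1)
    return slope


years = np.arange(1950, 2021)
t = years - years[0]  # years since the start of the record

rng = np.random.default_rng(1)

# Synthetic stand-ins for the three provided series at one grid box.
median = 20.0 + 0.05 * t + rng.normal(0.0, 1.0, t.size)
# Assumption: the ensemble spread narrows over time (denser station network).
half_width = 3.0 - 0.02 * t
p025 = median - half_width
p975 = median + half_width

trends = {
    "p2.5": linear_trend(t, p025),
    "median": linear_trend(t, median),
    "p97.5": linear_trend(t, p975),
}

for name, slope in trends.items():
    print(f"{name:>6}: {slope:+.4f} per year")
```

Because the spread changes over time in this example, the three slopes differ; with a constant spread they would coincide, which illustrates why percentile trends alone cannot capture the full trend uncertainty.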

Concluding Remarks

The figures above show how the ensemble-based uncertainty estimates of the E-OBS dataset propagate into the climate impact indices. All of the extreme indices, for both temperature and precipitation, show a rapid saturation of the ensemble spread with increasing ensemble size. This means that, for these types of indices, using a 100-member ensemble of E-OBS to assess uncertainty in the indices is generally not required: an overall valid estimate of the uncertainty can be obtained by reducing the number of ensemble members to about 20. Note that the smaller the number of values used to calculate the extremes (as for TXx or RX1day, which show values for one day only), the greater the differences between ensemble members are likely to be. Also note that, for individual years, a smaller ensemble may differ substantially from the 100-member ensemble, both in the median and, more so, in the estimates of the ensemble spread. From E-OBSv24.0e onwards, only 20 members are created for the E-OBS dataset itself, instead of the earlier 100.
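The saturation of the spread with ensemble size can be explored with a small resampling experiment. The sketch below is illustrative only: the 100 member values are drawn from a normal distribution as a stand-in for an index at one grid box and year, and the subset sizes and the number of random draws are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic stand-in: one index value per member for one grid box and year.
members = rng.normal(loc=100.0, scale=10.0, size=100)


def spread_width(values):
    """Width of the 2.5th to 97.5th percentile range of the given members."""
    low, high = np.percentile(values, [2.5, 97.5])
    return high - low


sizes = [5, 10, 20, 50, 100]
n_draws = 200  # random subsets per ensemble size (arbitrary choice)

results = {}
for n in sizes:
    widths = [
        spread_width(rng.choice(members, size=n, replace=False))
        for _ in range(n_draws)
    ]
    # Mean width and its variability across the random subsets of size n.
    results[n] = (np.mean(widths), np.std(widths))
    print(f"{n:3d} members: mean width {results[n][0]:6.2f}, "
          f"std {results[n][1]:5.2f}")
```

The standard deviation across subsets indicates how much the estimated spread for a single year can depend on which members happen to be selected, which is the year-to-year caveat raised above.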

Earlier work (Cornes et al., 2018) showed that the uncertainty quantified by the ensemble spread appears to underestimate the real uncertainty, especially in data-sparse regions, although the uncertainty in the current E-OBS dataset is more closely related to station density than the uncertainty values in the original dataset (Haylock et al., 2008). The modest range of values spanned by the ensemble, and the need to use the more extreme indices to show a reasonable uncertainty, relate to this. This estimate may be improved by quantifying other sources of uncertainty, such as instrumentation error (Cornes et al., 2018; Yang et al., 1999).

The underestimation of uncertainty may be related to the use of a single variogram across the domain, which assumes that the correlation structure depends only on the spatial lag and not on the location. This assumption is often made in such applications (e.g., Haylock et al., 2008; Newman et al., 2015) but can be an oversimplification for continental-scale data, as the spectral characteristics of the true fields can vary considerably across the domain, particularly for precipitation. The assumption of stationarity may also be unrealistic in the present application on account of the varying characteristics of the spatial trend captured by the Generalized Additive Model, which result from the large variations in station coverage across the domain. One remedy is to use different variograms across the domain; this, however, introduces artefacts in the gridded data, particularly in the ensemble spread, and is no longer pursued (Cornes et al., 2018). A possible way forward is to use approaches from non-stationary covariance modelling (e.g., Nychka et al., 2018).
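For reference, the stationarity assumption discussed above is commonly written as below. This is the standard textbook definition of a stationary variogram, not necessarily the exact formulation used for E-OBS; the symbols are introduced here for illustration, with Z denoting the field being interpolated, s a location and h the spatial lag.

```latex
2\gamma(\mathbf{h}) = \operatorname{Var}\!\left[ Z(\mathbf{s} + \mathbf{h}) - Z(\mathbf{s}) \right]
\quad \text{for all locations } \mathbf{s}
```

Under this assumption the variogram is a function of the lag alone. A non-stationary model would instead allow it to depend on the location as well, that is, a function of both s and h.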

Nevertheless, the main recommendation relating to uncertainty is that users should use results from all 20 members, as the uncertainty varies in space and time, reflecting the changing station density. Although the spread in the uncertainty estimate of the E-OBS dataset is probably too small, the results shown in this report clearly indicate that, even for this modest ensemble of 20 members, the spread in the climate indices can be considerable.

Appendix I

Input Data Description

E-OBS dataset

Table 3: Overview of key characteristics of E-OBS as used to derive the E-OBS indices

Data Description

Main variables: Daily maximum temperature (°C), daily mean temperature (°C), daily minimum temperature (°C), daily precipitation amount (mm), daily mean relative humidity (%), daily surface shortwave downwelling radiation (W m-2)

Domain: Europe

Horizontal resolution: 0.1° x 0.1°

Temporal coverage: 1950-01-01 to 2020-12-31

Temporal resolution: Daily

Update frequency: Half-yearly (but yearly updates for the E-OBS indices)

Version used: E-OBSv23.1e (not necessarily the latest E-OBS version)

Provider: Royal Netherlands Meteorological Institute (KNMI)

References

https://www.ecad.eu/documents/ECAD_datapolicy.pdf

Allen, R. G., Smith, M., Perrier, A., & Pereira, L. S. 1994: An update for the definition of reference evapotranspiration. ICID bulletin, 43(2), 1-34.

Cornes, R., G. van der Schrier, E.J.M. van den Besselaar, and P.D. Jones. 2018: An Ensemble Version of the E-OBS Temperature and Precipitation Datasets, J. Geophys. Res. Atmos., 123. doi:10.1029/2017JD028200

De Bruin, H.A.R. 1987: From Penman to Makkink. In: Evaporation and Weather (C. Hooghardt, editor), Commission for Hydrological Research TNO, The Hague, the Netherlands, Proceedings and Informations 39, pages 5-31

Guttman, N.B. 1999: Accepting the standardized precipitation index: A calculation algorithm, J. Amer. Water Resources Assoc., 35 (2): 311-322.

Haylock, M. R., N. Hofstra, A. M. G. Klein Tank, E. J. Klok, P. D. Jones and M. New. 2008: A European daily high-resolution gridded data set of surface temperature and precipitation for 1950-2006. J. Geophys. Res. (Atmospheres), doi:10.1029/2008JD010201

Hoff, P. D. (2009) A First Course in Bayesian Statistical Methods, Springer-Verlag, New York, NY, doi:10.1007/978-0-387-92407-6

Jendritzky, G., de Dear, R., & Havenith, G. 2012: UTCI - why another thermal index? International journal of biometeorology, 56(3), 421-428

Newman, A. J., Clark, M. P., Craig, J., Nijssen, B., Wood, A., Gutmann, E., et al. (2015). Gridded ensemble precipitation and temperature estimates for the contiguous United States. Journal of Hydrometeorology, 16(6), 2481-2500. doi:10.1175/JHM-D-15-0026.1

Nychka, D., D. Hammerling, M. Krock, and A. Wiens, (2018). Modeling and emulation of nonstationary Gaussian fields. Spatial Statistics, 28, 21-38, doi:10.1016/j.spasta.2018.08.006.

Palmer, W. C., 1965: Meteorological drought, Weather Bureau Research Paper No. 45, US Department of Commerce, Washington DC.

Wells, N., Goddard, S., and Hayes, M. J. 2004: A self-calibrating Palmer drought severity index. Journal of Climate, 17(12), 2335-2351.

Yang, D., Elomaa, E., Tuominen, A., Aaltonen, A., Goodison, B., Gunther, T., et al. (1999). Wind-induced precipitation undercatch of the Hellmann gauges. Hydrology Research, 30(1), 57-80.

Zhang X, Hegerl GC, Zwiers FW. (2005). Avoiding inhomogeneity in percentile-based indices of temperature extremes. J Climate 18, 1641–1651.

This document has been produced in the context of the Copernicus Climate Change Service (C3S).

The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.

The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which is merely representing the author's view.
