EFAS Post-processing

The aim of the EFAS post-processing methodology is to adjust the EFAS medium-range ensemble forecasts at specific locations so that they become better predictors of future observed river discharge values. The methodology combines two post-processing techniques: the Model Conditional Processor (MCP; Todini, 2008) and the Ensemble Model Output Statistics (EMOS; Gneiting et al., 2005) method. The post-processed forecast is represented by a probability distribution that depends on observations, the simulation forced by meteorological observations, and forecasts. The output of this process is the 'EFAS Post-processed Hydrograph' (formerly called the 'Real-time Hydrograph'), which is available in the pop-out windows of static reporting points in the Reporting Point layer where near real-time and historic river discharge observations are available. Since EFAS version 4.5, the post-processing has been performed at 6-hourly time steps where possible.

In EFAS, the post-processing is composed of two parts: the calibration (offline) and the forecast update (online).

Calibration (offline)

The offline calibration of the post-processing is performed twice a year to include the most recent observations.

Data for offline calibration

The offline calibration requires at least 2 years of river discharge observations (although longer timeseries are preferable) and the simulation forced by meteorological observations for the same time period. Where possible, 6-hourly observations and simulations are used, as this allows the forecasts to be post-processed at a 6-hourly timestep in the forecast update part. Otherwise, daily observations are used and the simulation is aggregated to a daily timestep. For each station, the simulation comes from the most recent LISFLOOD historical run (available at https://cds.climate.copernicus.eu/cdsapp#!/dataset/efas-historical?tab=overview).
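As an illustration of the daily aggregation mentioned above, the following is a minimal sketch. It assumes that aggregation means averaging the four 6-hourly values of each day and that an incomplete final day is dropped; the text does not state which aggregation rule EFAS actually uses, and the function name is hypothetical.

```python
from statistics import mean


def aggregate_to_daily(six_hourly, steps_per_day=4):
    """Average consecutive 6-hourly values into daily values.

    Assumption: the daily value is the mean of the 6-hourly values.
    Any incomplete final day is dropped.
    """
    n_days = len(six_hourly) // steps_per_day
    return [
        mean(six_hourly[d * steps_per_day:(d + 1) * steps_per_day])
        for d in range(n_days)
    ]


# Two days of synthetic 6-hourly discharge values (m3/s)
daily = aggregate_to_daily([1, 2, 3, 4, 5, 6, 7, 8])
```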

All observations are provided by EFAS data providers and more information on how to provide hydrological data to EFAS is available on the EFAS website. New stations are added during the next scheduled offline calibration process.

The offline procedure has two main objectives:

1) Estimation of separate river discharge distributions for the observed and simulated river discharge values.

This estimation is performed by fitting a Generalised Pareto distribution to the extreme river discharge values and applying a kernel density estimation procedure for the remainder of the distribution (see Figure 1).

Figure 1: An example of the estimated river discharge distribution for a station from the offline calibration. Orange shows the part estimated by the Generalised Pareto distribution. Purple shows the main part of the distribution. Small black lines show the individual river discharge values. Modified from Matthews et al., 2022.
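The idea of combining a Generalised Pareto tail with a kernel density estimate for the rest of the distribution can be sketched as follows. This is an illustration only, not the operational EFAS code: the tail fraction of 10%, the method-of-moments fit, the Gaussian kernel with a rule-of-thumb bandwidth, and all function names are assumptions made for the example.

```python
import math
import random
from statistics import NormalDist, mean, pvariance, stdev

STANDARD_NORMAL = NormalDist()


def fit_gpd_moments(exceedances):
    """Method-of-moments estimate of GPD shape and scale (assumed method)."""
    m = mean(exceedances)
    ratio = m * m / pvariance(exceedances)
    shape = 0.5 * (1.0 - ratio)
    scale = 0.5 * m * (1.0 + ratio)
    return shape, scale


def gpd_cdf(y, shape, scale):
    """CDF of the Generalised Pareto distribution for an exceedance y."""
    if y <= 0.0:
        return 0.0
    if abs(shape) < 1e-9:
        return 1.0 - math.exp(-y / scale)
    base = 1.0 + shape * y / scale
    if base <= 0.0:  # beyond the upper end point when shape < 0
        return 1.0
    return 1.0 - base ** (-1.0 / shape)


def kde_cdf(x, sample, bandwidth):
    """CDF of a Gaussian kernel density estimate."""
    return mean(STANDARD_NORMAL.cdf((x - s) / bandwidth) for s in sample)


def build_discharge_cdf(discharge, tail_fraction=0.1):
    """Return (cdf, threshold): KDE below the threshold, GPD above it."""
    values = sorted(discharge)
    n_tail = max(2, int(len(values) * tail_fraction))
    bulk = values[:-n_tail]
    threshold = bulk[-1]
    exceedances = [v - threshold for v in values[-n_tail:]]
    shape, scale = fit_gpd_moments(exceedances)
    bandwidth = 1.06 * stdev(bulk) * len(bulk) ** -0.2  # rule of thumb
    p_bulk = len(bulk) / len(values)
    kde_at_threshold = kde_cdf(threshold, bulk, bandwidth)

    def cdf(x):
        if x <= threshold:
            # Rescale the KDE so the two parts join at the threshold
            return p_bulk * kde_cdf(x, bulk, bandwidth) / kde_at_threshold
        return p_bulk + (1.0 - p_bulk) * gpd_cdf(x - threshold, shape, scale)

    return cdf, threshold


random.seed(0)
synthetic_discharge = [random.lognormvariate(3.0, 0.6) for _ in range(500)]
cdf, threshold = build_discharge_cdf(synthetic_discharge)
```

The synthetic log-normal sample stands in for a station record. With real data, the choice of threshold and fitting method would need to be checked against the record length and quality.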

2) Estimation of a joint probability distribution of observations and simulations across multiple timesteps.

The joint probability distribution describes the relationship between observed and simulated values at different times over a 55-day period. Figure 2 shows an example of a joint distribution between two variables: a simulated variable (model) and an observed variable (reality). The joint distribution defined in the offline calibration is between 440 variables for the 6-hourly stations and 110 variables for the daily stations. The joint distribution allows a first estimate to be made of future river discharge observations given the observations and simulation from the past 40 days.

Figure 2: Representation of the joint probability distribution of observations and simulations. From Biondi, D. and Todini, E. (2018). Comparing Hydrological Postprocessors Including Ensemble Predictions Into Full Predictive Probability Distribution of Streamflow. Water Resources Research. https://doi.org/10.1029/2017WR022432
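The variable counts quoted above are consistent with two series (observed and simulated) over 55 days: 55 x 4 x 2 = 440 for 6-hourly stations and 55 x 1 x 2 = 110 for daily stations. The sketch below reproduces this arithmetic and shows, for the two-variable case of Figure 2, how a joint distribution yields an estimate of one variable given the other. It assumes both variables have already been transformed to standard Normal space with a known correlation, which is an assumption of this example rather than something stated in the text; the function names are hypothetical.

```python
from statistics import NormalDist


def joint_variable_count(days=55, steps_per_day=4, series=2):
    """Number of variables in the joint distribution."""
    return days * steps_per_day * series


def condition_on_simulation(sim_z, correlation):
    """Distribution of the observed variable given the simulated one.

    Assumption: both variables are standard Normal with the given
    correlation, so the conditional distribution is also Normal.
    """
    cond_mean = correlation * sim_z
    cond_sigma = (1.0 - correlation ** 2) ** 0.5
    return NormalDist(mu=cond_mean, sigma=cond_sigma)


six_hourly_count = joint_variable_count(steps_per_day=4)  # 440
daily_count = joint_variable_count(steps_per_day=1)       # 110
conditional = condition_on_simulation(sim_z=1.0, correlation=0.8)
```

The stronger the correlation between simulation and observation, the narrower the conditional distribution becomes. In the full method the same principle applies across all variables at once rather than a single pair.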

The distributions defined in the offline calibration are used in the forecast update part of the post-processing method. The length of the observation record and the quality of the observations can impact the accuracy of the distributions. Since EFAS version 5.0, the calibration period and the maximum and minimum values observed during that period are provided in a table within the EFAS Reporting Point pop-out window.

Forecast Update (online)

The online part of the post-processing method is performed for each station where the offline calibration was successful and near real-time river discharge data are available. Currently, over 1600 stations are post-processed in EFAS.

Data for forecast update step

The forecast update step requires the observations, simulation forced with observations, and EFAS ensemble forecasts (ECMWF-ENS, ECMWF-HRES, DWD-HRES, COSMO-LEPS) for the past 40 days (although some leniency is given for missing values). It also requires the current EFAS ensemble forecast (i.e., the forecast that is being post-processed). The distributions defined in the offline calibration are also required. Where possible 6-hourly observations are used and daily observations are used otherwise. However, the offline calibration and the real-time post-processing must use the same timestep.

Each day, observations are extracted from the EFAS hydrological database (maintained by the CEMS Hydrological Data Collection Centre, HYDRO) at approximately 07:00 UTC for the 00 EFAS cycle and at approximately 21:00 UTC for the 12 EFAS cycle. Therefore, any near real-time observations received after these times will only be included in the following EFAS post-processed forecasts.
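The effect of the extraction times can be illustrated with a small sketch that returns the first forecast cycle able to use an observation received at a given time. The extraction times are described above as approximate, so the exact cut-offs used here, and the function name, are assumptions for illustration.

```python
from datetime import datetime, time, timedelta, timezone

CUTOFF_00_CYCLE = time(7, 0)   # approximate extraction time, 00 cycle
CUTOFF_12_CYCLE = time(21, 0)  # approximate extraction time, 12 cycle


def first_cycle_including(received):
    """Return (date, cycle) of the first cycle that can use the observation."""
    received_utc = received.astimezone(timezone.utc)
    day = received_utc.date()
    clock = received_utc.time()
    if clock <= CUTOFF_00_CYCLE:
        return day, "00"
    if clock <= CUTOFF_12_CYCLE:
        return day, "12"
    # Received after the evening extraction: wait for the next day's 00 cycle
    return day + timedelta(days=1), "00"


late_arrival = datetime(2024, 1, 1, 22, 0, tzinfo=timezone.utc)
cycle_date, cycle = first_cycle_including(late_arrival)
```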

The forecast update is split into the following steps:

1) The joint probability distribution defined in the offline calibration is used to condition the river discharge distributions defined in the offline calibration on recent river discharge observations (see Fig. 3a). Using the recent river discharge values in this way restricts the forecast probability distribution to the values that are likely given the recent state of the river. This is the MCP portion of the method and it is used to correct errors and uncertainty due to the hydrological model.
2) The current EFAS ensemble forecast is spread corrected (see Fig. 3b). This is done by calculating the average spread correction parameter needed to match the spread of the EFAS ensemble forecasts with the root mean square error of the ensemble mean for the past 40 days. This is the EMOS portion of the method and it is used to correct uncertainty due to meteorological forcings.
3) Steps 1 and 2 result in two probability distributions, which are combined using a Kalman filter (see Fig. 3). The Kalman filter weights the two distributions from steps 1 and 2 depending on their uncertainty (or spread). This creates a probability distribution that is consistent with recent observations and influenced by the predicted meteorological forcings.
4) The EFAS Post-processed Hydrograph product is created (see below).

Figure 3: The forecast update part of the post-processing method uses a) the MCP method and b) the EMOS method. The output from these two methods is combined using the Kalman filter to produce the 'EFAS Post-processed Hydrograph'.
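The spread correction and the Kalman filter combination described in the forecast update steps can be sketched for a single timestep as follows. This is a simplified illustration under stated assumptions: the spread correction parameter is taken as the ratio of the RMSE of the ensemble mean to the average ensemble standard deviation, both distributions are treated as Normal, and all function names are hypothetical. The operational method may differ in these details.

```python
import math
from statistics import mean, pstdev


def spread_correction_factor(past_ensembles, past_observations):
    """Ratio of ensemble-mean RMSE to average ensemble spread."""
    squared_errors = [
        (mean(ens) - obs) ** 2
        for ens, obs in zip(past_ensembles, past_observations)
    ]
    rmse = math.sqrt(mean(squared_errors))
    average_spread = mean(pstdev(ens) for ens in past_ensembles)
    return rmse / average_spread


def correct_spread(ensemble, factor):
    """Scale member deviations from the ensemble mean by the factor."""
    centre = mean(ensemble)
    return [centre + factor * (member - centre) for member in ensemble]


def kalman_combine(mean_a, var_a, mean_b, var_b):
    """Combine two Normal estimates, weighting by their uncertainty."""
    gain = var_a / (var_a + var_b)
    combined_mean = mean_a + gain * (mean_b - mean_a)
    combined_var = (1.0 - gain) * var_a
    return combined_mean, combined_var


# Synthetic example: an under-dispersive ensemble is inflated by a factor of 2
factor = spread_correction_factor([[1, 3], [2, 4]], [4, 5])
inflated = correct_spread([1, 2, 3], factor)
combined = kalman_combine(0.0, 1.0, 2.0, 1.0)
```

When both inputs are equally uncertain, the combined mean lies halfway between them and the combined variance is halved. A more uncertain input receives less weight.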

EFAS Post-processed Hydrograph

EFAS post-processed forecasts are available for stations that have at least 2 years of river discharge data and that provide near real-time river discharge observations to the CEMS Hydrological Data Collection Centre (HYDRO). The post-processed forecast is presented as the 'EFAS Post-processed Hydrograph' product in the pop-up window of the 'Reporting Points' layer. Stations for which the EFAS Post-processed Hydrograph is available are represented by light blue points in the Reporting Points layer.

The main panel shows the probability distribution (blue shading) for each timestep of the forecast and the recent observations (black dots). Darker blues show values closer to the forecast median. In addition to the hydrograph, the probabilities of exceeding up to six thresholds are shown as boxplots. The two panels on the right of the main panel show the probability of exceeding the mean annual maximum flow (MHQ; top) and the mean flow (MQ; bottom) thresholds, respectively. These thresholds are calculated from observed river discharge values from the calibration period. The four lower panels show the probability of exceeding up to four thresholds provided to HYDRO by the EFAS hydrological data providers.
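The MQ and MHQ thresholds and an exceedance probability can be illustrated with a short sketch. It assumes the observations are grouped by calendar year and that the forecast distribution at one timestep is represented by a set of samples; both are assumptions made for the example, and the function names are hypothetical.

```python
from statistics import mean


def mean_flow(obs_by_year):
    """MQ: mean of all observed discharge values in the calibration period."""
    return mean(v for values in obs_by_year.values() for v in values)


def mean_annual_maximum(obs_by_year):
    """MHQ: mean of the annual maximum observed discharge values."""
    return mean(max(values) for values in obs_by_year.values())


def exceedance_probability(samples, threshold):
    """Fraction of forecast samples above the threshold."""
    return sum(1 for s in samples if s > threshold) / len(samples)


# Synthetic observations (m3/s) grouped by year, and forecast samples
observations = {2020: [1, 2, 9], 2021: [2, 3, 7]}
mq = mean_flow(observations)
mhq = mean_annual_maximum(observations)
p_exceed_mhq = exceedance_probability([3, 5, 9, 10], mhq)
```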

Thresholds

Since EFAS version 5, thresholds from data providers have been included in the EFAS Post-processed Hydrograph, following a suggestion from EFAS partners during a workshop at the 17th EFAS Annual Meeting. Up to four river discharge thresholds can be provided and are named in increasing order as TL1(D) to TL4(D). To provide thresholds for a station, please contact HYDRO. More details on how to provide hydrological data (including thresholds) to EFAS are available on the EFAS-IS.

As thresholds are not provided for all stations, the MQ and MHQ thresholds are still calculated and shown in the EFAS Post-processed Hydrograph. However, all thresholds are listed in a table in the pop-up window of the 'Reporting Points' layer to give greater context for the MQ and MHQ, which may differ from those calculated locally due to different calibration periods. All thresholds are updated during the offline calibration process.

Two examples are shown below, for stations Gaulfoss, Gaula in Norway (ID 1099) at 6-hourly timesteps, and Sevlievo, Rositsa in Bulgaria (ID 582) at daily timesteps.

Figure 4: EFAS Post-processed Hydrographs for stations Gaulfoss (left) and Sevlievo (right).

Note on the forecast skill: The post-processed forecasts tend to slightly underestimate peaks, particularly in catchments with quick hydrological response times. We are investigating improvements to the method.

Known Issues

The EFAS post-processing method is highly dependent on both the past and near real-time observed and simulated river discharge values. Issues can arise for a forecast cycle if either the observed or simulated river discharge values are much higher than those previously recorded or if an insufficient number of near real-time observations are available at the time the forecast is created.

If the ensemble forecast predicts river discharge values higher than those recorded in the EFAS historical river discharge simulation, the EFAS Post-processed Hydrograph will show as:

If the post-processed forecast predicts river discharge values higher than those recorded in the observed record made available to EFAS, the EFAS Post-processed Hydrograph will show as:

If an insufficient number of near real-time river discharge observations are made available to EFAS, the EFAS Post-processed Hydrograph will show as:

Additionally, erroneous river discharge observations in the offline calibration can cause severe errors in the EFAS post-processed forecasts. We aim to remove all erroneous observations, but it is possible that some are missed. Whilst we are continuously trying to improve our quality control procedures, users are encouraged to provide feedback should they identify a station that shows large errors, and we will investigate the issue.
