As part of the EFAS v5 upgrade, the hydrological post-processing calibration procedure was performed in June 2023 using the EFAS v5 hydrological reanalysis (water balance) and observations up to 6 June 2023. A total of 1433 stations were calibrated at 6-hourly timesteps and a further 147 at daily timesteps. The calibration included the newly implemented calibration evaluation script, which creates an evaluation report for each successfully calibrated station. This page summarises the evaluation results for all stations.

Calibration Methodology

The station models calibrated in the hydrological post-processing (HPP) calibration process contain the information necessary for the MCP component of the post-processing methodology. The MCP component is responsible for correcting errors due to the initial conditions and the hydrological model. A full description of the post-processing methodology can be found on the CEMS-Wiki here but for the purpose of keeping this page self-contained the following two key points should be noted:

1) The MCP creates a naïve "first guess" forecast for the next 15 days by conditioning the climatological distribution of the observed and simulated discharge on the recent observation and simulation values. This naïve forecast therefore only considers the autocorrelation of the timeseries and the recent state of the river, and does not include any information on the upcoming meteorological situation. The naïve forecast of the observations is referred to below as the Obs MCP.
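The behaviour of such a conditioned "first guess" can be sketched in a few lines. This is an illustrative approximation only, not the operational EFAS configuration: the naïve forecast is modelled here as an AR(1)-style relaxation from the latest observation back towards the climatological mean in a Gaussian (e.g. normal-quantile-transformed) space, and the function name and lag-1 autocorrelation value are hypothetical.

```python
import numpy as np

def naive_first_guess(last_obs_z, clim_mean_z, rho1, n_steps):
    """Conditional mean and variance of a naive forecast at each lead time.

    last_obs_z : latest observation in transformed (Gaussian) space
    clim_mean_z: climatological mean in transformed space
    rho1       : lag-1 autocorrelation of the transformed series
    n_steps    : number of lead-time steps (e.g. 60 six-hourly steps = 15 days)
    """
    leads = np.arange(1, n_steps + 1)
    rho_k = rho1 ** leads  # autocorrelation decays with lead time
    # Conditional mean relaxes towards climatology; conditional variance
    # grows back towards the (unit) climatological variance.
    mean = clim_mean_z + rho_k * (last_obs_z - clim_mean_z)
    var = 1.0 - rho_k ** 2
    return mean, var
```

The sketch makes the key property of the Obs MCP explicit: with no meteorological input, skill comes entirely from persistence, so the forecast distribution tends to climatology as the lead time grows.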

2) The EFAS ensemble forecasts (after being spread-corrected) are assimilated into these naïve forecasts to incorporate the meteorological information. This is done using a Kalman filter, so the resulting distribution depends on the distribution of the ensemble forecast, the distribution of the naïve forecast for the simulation (water balance), and the cross-covariance matrix between the water balance and the observations. The resulting forecast is referred to below as the Obs Full forecast.
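A minimal scalar sketch of this update step, under the assumptions that the forecast distributions are Gaussian and that the ensemble forecast is treated as a (noisy) forecast of the simulation: the naïve forecast of the observations is shifted by the Kalman gain times the innovation in simulation space. The function and variable names are illustrative, not the operational implementation.

```python
def assimilate(mu_obs, mu_sim, var_obs, var_sim, cov_obs_sim, ens_mean, ens_var):
    """Update the naive observation forecast with an ensemble forecast.

    mu_obs, var_obs   : naive forecast mean/variance for the observations
    mu_sim, var_sim   : naive forecast mean/variance for the simulation
    cov_obs_sim       : cross-covariance between observations and simulation
    ens_mean, ens_var : mean/variance of the (spread-corrected) ensemble,
                        interpreted as a forecast of the simulation
    """
    gain = cov_obs_sim / (var_sim + ens_var)     # Kalman gain
    mu_post = mu_obs + gain * (ens_mean - mu_sim)
    var_post = var_obs - gain * cov_obs_sim      # variance always shrinks
    return mu_post, var_post
```

With `ens_var = 0` (a deterministic, perfectly confident ensemble, i.e. the water balance itself) the update pulls the observation forecast as far as the cross-covariance allows, which is exactly the "perfect forecast" limit used to construct the Obs Full below.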

Given these two points of the methodology, we evaluate the Obs MCP forecast and the Obs Full forecast. The Obs MCP can be thought of as a lower bound on the potential skill of the post-processed forecasts. Although there are situations where the skill of a specific forecast may be lower (e.g., the ensemble forecast introduces a fictitious flood event, an event is missed, or erroneous observations enter the process), the post-processed forecasts will in general be at least as skilful as the Obs MCP. The Obs Full is calculated here using the water balance as the ensemble forecast. Within the post-processing method the ensemble forecast is considered a forecast of the simulation, not the observations (i.e., it forecasts the water balance); the perfect ensemble forecast would therefore be the water balance itself (deterministic and accurate). As we are assimilating a "perfect forecast", the Obs Full is the post-processing equivalent of the hydrological model skill layer. Obs Full can therefore be considered an upper skill bound, although there may be occasions when the post-processed forecast performs better due to the interaction between the uncertainty of the ensemble forecast and the Obs MCP.

Calibration Results

In the following results, the extremely poorly performing stations were removed; they are analysed separately.

Assessment of the CRPS skill score

The CRPSS is shown in Fig. 1 for the Obs MCP (upper panel, blue boxes) and Obs Full (lower panel, red boxes) forecasts. The CRPS evaluates the full forecast distribution. The benchmark is the water balance and the "truth" values are the observations. Over 75% of stations show an improvement compared to the water balance even when no meteorological information is included (Obs MCP); this is mainly due to bias correction. When the meteorological information is assimilated (Obs Full), all stations show an improvement, with 50% of stations showing a reduction in error (in terms of the CRPS) of 50% even at lead-times of 15 days.
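For reference, the CRPSS can be computed as below. This is a sketch using the standard ensemble CRPS estimator; for the deterministic water-balance benchmark the CRPS reduces to the mean absolute error. Function names are illustrative, not those of the EFAS evaluation script.

```python
import numpy as np

def crps_ensemble(members, obs):
    """CRPS of one ensemble forecast (array of members) against one observation."""
    members = np.asarray(members, dtype=float)
    # Mean absolute error of members vs observation ...
    term1 = np.mean(np.abs(members - obs))
    # ... minus half the mean absolute pairwise member spread.
    term2 = 0.5 * np.mean(np.abs(members[:, None] - members[None, :]))
    return term1 - term2

def crpss(crps_forecast, crps_benchmark):
    """Skill score: 1 = perfect, 0 = equal to benchmark, < 0 = worse."""
    return 1.0 - crps_forecast / crps_benchmark
```

So the "50% of stations show a 50% reduction in error" statement above corresponds to a median CRPSS of at least 0.5.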

Figure 1: CRPSS (benchmark is the water balance) for the Obs MCP (upper panel; blue boxes) and the Obs Full (lower panel; red boxes) forecasts

Assessment of the KGE skill score

The modified KGESS is shown in Fig. 2 for the forecast median of the Obs MCP (left/upper panel, blue boxes) and Obs Full (right/lower panel, red boxes) forecasts. Again, the benchmark is the water balance and the "truth" values are the observations. For the Obs MCP, improvement over the water balance is shown by only 50% of stations at a lead-time of 3.5 days. This is unsurprising, as no meteorological information is included and the information provided by the recent observations is limited at longer lead-times. When the meteorological information is assimilated (Obs Full), over 50% of stations show an improvement in terms of the modified KGE at a lead-time of 15 days. The operational post-processed forecasts are likely to fall somewhere between the skill of the Obs MCP and the Obs Full forecasts, depending on the accuracy and confidence of the ensemble forecast. At some stations, however, the modified KGE degrades severely.
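The modified KGE combines the three components discussed in the next subsection (correlation, bias ratio, and variability ratio expressed as a ratio of coefficients of variation), and the skill score measures it against the benchmark. A sketch, assuming the common Kling-Gupta 2012 formulation (function names are illustrative):

```python
import numpy as np

def modified_kge(sim, obs):
    """Modified KGE: 1 is perfect; built from three components."""
    sim, obs = np.asarray(sim, dtype=float), np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]                               # correlation
    beta = sim.mean() / obs.mean()                                # bias ratio
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())   # variability (CV) ratio
    return 1.0 - np.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

def kgess(kge_forecast, kge_benchmark):
    """KGE skill score: 1 = perfect, 0 = equal to benchmark, < 0 = worse."""
    return (kge_forecast - kge_benchmark) / (1.0 - kge_benchmark)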

Figure 2: KGESS (benchmark is the water balance) for the forecast median of the Obs MCP (left/upper panel; blue boxes) and the Obs Full (right/lower panel; red boxes) forecasts.

Assessment of the KGE components

The following figures show the correlation, bias ratio, and variability ratio for the Obs MCP, Obs Full, and the simulation (water balance) to identify where the loss in KGE is occurring.

Figure 4 shows that the Obs Full has the best correlation at all lead-times whereas the Obs MCP has the worst (unsurprising as the Obs MCP will tend to climatology at longer lead-times). For the first two days, when recent observations are informative, the Obs MCP has a higher correlation than the water balance. 

Figure 4. Correlation between the observation and the Obs MCP (blue), Obs Full (orange) and the simulation/water balance (green).

In general, the bias ratios of the water balance vary the most across stations (Figure 5). However, the Obs MCP has the worst bias ratio at all lead-times beyond 7 days. The Obs Full has a much smaller range of bias ratios but underestimates the flow at almost all stations. In general the Obs Full has the best bias ratios, although at longer lead-times this may not be the case for all stations.

Figure 5. Bias ratio between the observation and the Obs MCP (blue), Obs Full (orange) and the simulation/water balance (green).

The water balance has much better variability ratios at all lead-times beyond 3 days than either the Obs MCP or the Obs Full (Figure 6), because the Obs MCP and Obs Full tend more strongly towards climatology. It should be noted that the uncertainty in the ensemble forecasts may bring the variability ratio of the operational post-processed forecasts closer to the Obs MCP values than to the Obs Full values, but the variability ratios of the ensemble forecasts will also likely be lower than that of the water balance.

Figure 6. Variability ratio between the observation and the Obs MCP (blue), Obs Full (orange) and the simulation/water balance (green).


Conclusions

Based on the results above, the hydrological post-processing improves the behaviour of the EFAS simulations compared with the raw model outputs, but care should be taken with extreme values that exceed those seen in the calibration period.