EFAS v5.2 - updates

Introduction

The EFAS v5.2 upgrade features a revision of the flood notification criteria for riverine flooding, i.e. Formal and Informal Notifications.

The EFAS modelling and forecasting chain has undergone several upgrades in recent years, but the notification criteria have remained unchanged over the last decade. Therefore, a thorough statistical analysis was performed to identify the notification criteria that optimise the trade-off between correctly detected events, false alarms, and misses. This analysis was based on EFASv4 simulations.

The main results are:

A new methodology to combine the EFAS forecasts forced by the 4 different Numerical Weather Prediction (NWP) models into a grand ensemble. Combining the NWPs yields a total probability of exceeding a given return period that is more robust than the exceedance probabilities of each NWP analysed individually.
A revision of the notification criteria based on the above grand ensemble that involves a new total probability threshold, the removal of the persistence criterion, and a new minimum catchment area.

This page provides:

The definition of the EFASv5.2 criteria for EFAS Formal and Informal notifications, as well as the definition of the area of validity of EFAS Formal and Informal notifications.
An overview of the updates to the EFAS Map Viewer.
A summary of the scientific analysis that led to the definition of EFASv5.2 criteria for EFAS Formal and Informal notifications.

New EFAS Formal and Informal Notification Criteria

EFAS v5.2 formal flood notification criteria are:

Catchment part of Conditions of Access (CoA), i.e., within the EFAS partner region.
A catchment area of at least 1000 km².
A total probability of exceeding the EFAS 5-year return period of at least 50%.
The event must start at least 48 hours after the forecast time. More specifically, the onset of the event must occur between 2 and 7 days from the forecast time. The onset of the event is the first time step at which the probability criterion above is fulfilled. E.g., for the forecast 2024-08-13 00UTC, the event must start between 2024-08-15 00UTC and 2024-08-20 00UTC.

Note: persistence is no longer used.

EFAS v5.2 informal flood notification criteria are:

Catchment part of Conditions of Access (CoA), i.e., within the EFAS partner region.
Complete set of criteria for formal notifications is not met, but the forecaster deems that the authorities should be informed.
Minimum requirements for catchment area and total probability value:
1. A catchment area of at least 500 km².
2. A total probability of exceeding the EFAS 5-year return period of at least 40%.
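
The formal and informal criteria above can be sketched as a simple decision function. This is only an illustration under stated assumptions: the `Candidate` class, its field names and the `classify_notification` function are hypothetical and not part of any EFAS software, the onset lead time is assumed to be computed beforehand, and an informal notification still depends on the forecaster's judgement, so the function only reports whether the minimum requirements are met.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical container for one reporting point and forecast."""
    in_partner_region: bool    # catchment is part of the Conditions of Access
    catchment_area_km2: float  # upstream catchment area
    total_probability: float   # probability of exceeding the 5-year return period (0 to 1)
    onset_lead_hours: float    # hours between forecast time and event onset


def classify_notification(c: Candidate) -> str:
    """Return 'formal', 'informal_candidate' or 'none' for a candidate."""
    if not c.in_partner_region:
        return "none"
    formal = (
        c.catchment_area_km2 >= 1000
        and c.total_probability >= 0.50
        and 48 <= c.onset_lead_hours <= 7 * 24  # onset between 2 and 7 days ahead
    )
    if formal:
        return "formal"
    # Minimum requirements only; the forecaster decides whether to notify.
    if c.catchment_area_km2 >= 500 and c.total_probability >= 0.40:
        return "informal_candidate"
    return "none"


print(classify_notification(Candidate(True, 1500.0, 0.55, 72.0)))
print(classify_notification(Candidate(True, 700.0, 0.45, 72.0)))
```

The first call meets all formal criteria, while the second fails the formal area and probability limits but meets the informal minimum requirements.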

Definition of the area of validity of an EFAS formal/informal notification: An EFAS notification is valid for the upstream and downstream river stretch of the same name that exceeds the 50% total probability threshold (40% for informal notifications) shown in the flood probability layer in the same country.

EFAS v5.2 - Updates to visualisations

As a consequence of the new combination method and the new notification criteria, the following EFAS products have been changed:

The layers previously called Flood Probability < 48 h and Flood Probability > 48 h have been renamed as 5-year exceedence < 48 h and 5-year exceedence > 48 h, respectively. These layers show the total probability of exceeding the 5-year return period according to the latest available forecast. The layer corresponding to lead times larger than 48 h is used to issue formal flood notifications.
The Flood probability persistence layer has been renamed as Flood probability. The new layer shows the total probability of the latest available forecast of exceeding three return period thresholds (2, 5 and 20 years) computed as a skill-based combination of all available NWPs at each time step.
The Reporting Points layer is now based on the total probability. The label shown at each point indicates only one value: the total probability of exceeding the 5-year return period. The arrow indicates the tendency (increasing, constant, decreasing) of the total probability between the previous and the current forecasts for the point. Some minor changes affect the pop-up window:
- A new grand ensemble hydrograph that combines the forecasts of all 4 NWPs.
- The total probability shown in the forecast overview table follows the new definition of total probability.
- A new group of forecast persistency tables, which show the total probability computed for each return period threshold for the last 6 issued forecasts.

EFASv4 skill assessment

Introduction

The definition of the new notification criteria is based on an analysis of EFASv4 discharge simulations in the period from October 2020 to June 2023 at the 1979 EFAS fixed reporting points with a minimum catchment area of 500 km². For this period, we considered EFASv4 reanalysis (water balance) as the ground truth, and we used the EFASv4 forecasts to search for the optimal notification criteria. In particular, we looked for answers to the following questions:

What's the best approach to blend the EFAS forecasts forced by the 4 NWPs into a single grand ensemble?
In the grand ensemble context:
- What is the optimal exceedance probability threshold?
- Is the persistence criterion still meaningful?
- Can we issue notifications at smaller catchments?

Methods

The analysis consisted of computing the events that would have been notified under different combinations of the notification criteria, and evaluating the skill of each combination by comparison against the flood events in the reanalysis (exceedances of the EFAS 5-year return period).
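
The comparison between notified and reanalysis events can be illustrated with a contingency count. This is a minimal sketch, assuming that notifications and reanalysis events have already been matched one to one; the actual event matching used in the analysis is not described here, and the function name `contingency_counts` is hypothetical.

```python
def contingency_counts(notified, observed):
    """Count hits, misses and false alarms from two equally long boolean lists.

    notified[i] is True if the criteria would have triggered a notification,
    observed[i] is True if the reanalysis exceeded the 5-year return period.
    """
    if len(notified) != len(observed):
        raise ValueError("notified and observed must have the same length")
    hits = sum(n and o for n, o in zip(notified, observed))
    misses = sum((not n) and o for n, o in zip(notified, observed))
    false_alarms = sum(n and (not o) for n, o in zip(notified, observed))
    return hits, misses, false_alarms


# Made-up example values, for illustration only.
notified = [True, True, False, False, True]
observed = [True, False, True, False, True]
print(contingency_counts(notified, observed))
```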

All possible combinations of the following criteria were tested:

Method to combine the NWPs into a grand ensemble:
- The procedure used in EFASv4 that evaluates deterministic and probabilistic NWPs independently (hereafter referred to as "1 deterministic + 1 probabilistic").
- The simple model average (hereafter referred to as "model mean"). In this approach every NWP gets the same weight, regardless of whether it is a deterministic or a probabilistic model.
- A model average weighted by the number of members of the model (hereafter referred to as "member weighted"). In this approach probabilistic NWPs prevail over their deterministic counterparts.
- A model average weighted by the probabilistic skill (Brier score) of each model at each lead time (hereafter referred to as "brier weighted"). In this approach models with higher skill predominate in the grand ensemble.
Exceedance probability thresholds ranging from 5% to 95% at 2.5% steps.
Persistence values: no persistence (1/1), 2 forecasts out of the last 4 (2/4), 2/2, 3/4 and 3/3. The persistence criterion was used in EFASv4 to avoid notifications caused by erratic NWP behaviour, when one forecast might be very different from the previous one.
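
The search space described above can be enumerated as follows. This is a sketch only: the variable names and the `is_persistent` helper are hypothetical, and persistence is interpreted here as "at least k of the last n forecasts exceed the probability threshold", which is an assumption about how the k/n notation is applied.

```python
from itertools import product

METHODS = [
    "1 deterministic + 1 probabilistic",
    "model mean",
    "member weighted",
    "brier weighted",
]
# Probability thresholds from 5% to 95% in 2.5% steps.
THRESHOLDS = [5 + 2.5 * i for i in range(37)]
# Persistence as (k, n): at least k of the last n forecasts exceed the threshold.
PERSISTENCE = [(1, 1), (2, 4), (2, 2), (3, 4), (3, 3)]

# Every combination of method, threshold and persistence to be evaluated.
grid = list(product(METHODS, THRESHOLDS, PERSISTENCE))


def is_persistent(exceeds, k, n):
    """Check the k/n persistence criterion.

    exceeds is a list of booleans, oldest forecast first, telling whether each
    forecast exceeded the probability threshold.
    """
    recent = exceeds[-n:]
    return len(recent) == n and sum(recent) >= k


print(len(grid))
print(is_persistent([False, True, False, True], k=2, n=4))
```

With 4 methods, 37 thresholds and 5 persistence settings, the grid contains 740 combinations.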

As a measure of notification skill we used the f-score, a metric widely used for imbalanced classification tasks like the one at hand. The f-score combines recall and precision, where recall is the fraction of actual events that are correctly identified, and precision is the fraction of notifications that are correct.

\[ \text{recall} = \frac{\text{hits}}{\text{hits} + \text{misses}} \]

\[ \text{precision} = \frac{\text{hits}}{\text{hits} + \text{false alarms}} \]

\[ f_{\beta} = \left( 1 + \beta^2 \right) \frac{\text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}} \]

The beta coefficient in the f-score makes it possible to assign higher importance to either precision (beta smaller than 1) or recall (beta larger than 1). In our analysis, we used a beta of 0.8, which favours precision and therefore penalises false alarms more heavily than misses.
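
The effect of beta can be checked with a direct implementation of the formulas above. This is a minimal sketch: the function name `f_beta` and the counts in the example are made up for illustration, and returning 0 when there are no hits is a convention chosen here to avoid a division by zero.

```python
def f_beta(hits, misses, false_alarms, beta=0.8):
    """f-score computed from contingency counts, following the formulas above."""
    if hits == 0:
        return 0.0  # convention chosen here: no hits means no skill
    recall = hits / (hits + misses)
    precision = hits / (hits + false_alarms)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Two hypothetical criteria with mirrored errors.
many_false_alarms = f_beta(hits=8, misses=2, false_alarms=8)  # recall 0.8, precision 0.5
many_misses = f_beta(hits=8, misses=8, false_alarms=2)        # recall 0.5, precision 0.8
print(many_false_alarms, many_misses)
```

With beta = 0.8 the second case scores higher than the first, whereas with beta = 1 both would score the same.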

Results

Figure 1 summarizes the results of the skill assessment. For each NWP combination method, it shows how the f-score varies with the probability threshold and the persistence criterion. As a reference, the black cross indicates the skill of the EFASv4 flood notification criteria.

Figure 1. Evolution of the notification skill with probability threshold and persistence for the different combinations of NWP.

Each plot represents a different combination of NWP. As a benchmark, the black cross represents the skill of the EFASv4 notification criteria.

This figure answers the first three questions posed in the introduction to this skill assessment:

Two NWP combination methods stand out as the best performing: member weighted and Brier weighted. Member weighted allocates weights to each NWP based on its number of members, whereas Brier weighted allocates them based on skill. ECMWF-ENS is both the model with the largest number of members and the one with the best skill, which is why the performance of these two approaches is very similar. However, the Brier weighted approach is scientifically more sound, as it assigns weights based on quality instead of quantity. For this reason, the Brier weighted approach was selected as the method to generate the grand ensemble.
In all NWP combinations, performance is highest when persistence is not included as a notification criterion. The intuition is that the grand ensemble smooths out the erratic behaviour of deterministic NWPs, which was the reason for including this criterion in previous versions. Therefore, persistence is not a flood notification criterion in this new release.
The optimal probability threshold is 50%. The 30% probability threshold proved to be a correct choice when using a persistence of 3 forecasts. However, removing persistence affects the optimal probability threshold, which needs to increase to reduce the number of false positives.

One of the original questions remains to be answered: whether the minimum catchment area limit can be reduced. Figure 2 shows that the notification skill increases with catchment area (as expected), but it does not worsen dramatically when moving from the previous 2000 km² limit to 1000 km². In fact, the skill of the new notification criteria at 1000 km² is better than that of the previous criteria at 2000 km². We expect the skill of EFASv5 at smaller catchments to be better than that of EFASv4, given the increase in spatial resolution, but this remains to be assessed once a sufficiently long record of EFASv5 forecasts is available.

Figure 2. Evolution of skill with catchment area limit.

The black line represents the notification criteria in EFASv4, while the coloured lines depict the skill of the optimized criteria for the 4 combination methods.

The vertical, dotted line indicates the 2000 km² catchment area limit in EFASv4.

One last change has been introduced in the new notification criteria: a maximum lead time of 7 days. We analysed the evolution of skill with increasing lead time (Figure 3) and saw that skill degrades with lead time, as expected. Given the poor skill at lead times close to 10 days, we decided to set an upper limit on lead time of 7 days, the forecast horizon of DWD-ICON.

Figure 3. Evolution of skill with lead time.

The black line represents the notification criteria in EFASv4, while the coloured lines depict the skill of the different probability thresholds for the Brier weighted method.

Total probability

Figure 1 shows that the skill-weighted (or Brier weighted) method is the best way to combine the NWPs into a grand ensemble and estimate the total exceedance probability. What exactly does skill-weighted mean?

We estimated the probabilistic skill of NWPs using the Brier score (BS), a squared error metric that compares the observed and predicted probabilities of exceeding a particular magnitude (in our case the EFAS 5-year return period):

\[ \text{BS} = \frac{1}{T}\sum_{t=1}^{T} \left( P_{obs,t} - P_{pred,t} \right)^2 \]

where \( T \) is the number of time steps, \( P_{obs,t} \) is the observed probability of exceedance, and \( P_{pred,t} \) is the predicted probability of exceedance at a specific time step \( t \). Brier scores were computed for every NWP and lead time using the historical archive of EFAS v4 forecasts and reanalysis. The resulting Brier scores were converted into weights by inverse distance weighting:

\[ w_{nwp,lt} = \frac{BS_{nwp,lt}^{-7}}{\sum_{i=1}^{4} BS_{i,lt}^{-7}} \]

where \( w_{nwp,lt} \) is the weight assigned to a specific NWP and lead time, and \( BS_{nwp,lt} \) is the Brier score of that NWP and lead time; \( i \) takes values up to 4 as this is the number of NWPs used in EFAS.
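
The two formulas above can be sketched for a single lead time as follows. The function names and all numeric values are hypothetical and chosen only for illustration; they are not the Brier scores or probabilities obtained in the analysis. The final step assumes that the total probability is the weighted average of the exceedance probabilities of the individual NWPs, in line with the description of the Brier weighted method as a model average.

```python
def brier_score(p_obs, p_pred):
    """Mean squared difference between observed and predicted probabilities."""
    return sum((o - p) ** 2 for o, p in zip(p_obs, p_pred)) / len(p_obs)


def brier_weights(scores, exponent=-7):
    """Convert Brier scores (lower is better) into weights that sum to 1.

    A Brier score of exactly 0 would raise ZeroDivisionError and would need
    special handling.
    """
    powered = {nwp: bs ** exponent for nwp, bs in scores.items()}
    total = sum(powered.values())
    return {nwp: value / total for nwp, value in powered.items()}


def total_probability(probabilities, weights):
    """Weighted average of the exceedance probabilities of each NWP (assumption)."""
    return sum(weights[nwp] * probabilities[nwp] for nwp in probabilities)


# Made-up Brier scores for one lead time.
scores = {"DWD": 0.012, "HRES": 0.011, "COS": 0.010, "ENS": 0.008}
weights = brier_weights(scores)

# Made-up exceedance probabilities of each NWP at one time step.
probabilities = {"DWD": 1.0, "HRES": 0.0, "COS": 0.45, "ENS": 0.60}
print(weights)
print(total_probability(probabilities, weights))
```

Because of the exponent of -7, small differences in Brier score translate into large differences in weight, so the most skilful model dominates the total probability.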

Figure 4 shows the distribution of weights among NWP models based on the Brier score that is used to compute the total probability in the new notification criteria. ECMWF-ENS proved to be the most skilful model, which is why the total exceedance probability relies mostly on this model, particularly as the lead time increases.

Figure 4. Distribution of weights over NWP models and lead time for the Brier weighted combination.

DWD stands for DWD-ICON, HRES for ECMWF-HRES, COS for COSMO-LEPS and ENS for ECMWF-ENS.
