The EFAS v5.2 upgrade features a revision of the flood notification criteria for riverine flooding, i.e. Formal and Informal Notifications.
The EFAS modelling and forecasting chain has undergone several upgrades in recent years, but the notification criteria have remained unchanged over the last decade. Therefore, a thorough statistical analysis was performed to identify the notification criteria that optimise the trade-off between correctly detected events, false alarms, and misses. This analysis was based on EFASv4 simulations.
The main results are:
This page provides:
EFAS v5.2 formal flood notification criteria are:
Note: persistency is no longer used.
EFAS v5.2 informal flood notification criteria are:
Definition of the area of validity of an EFAS formal/informal notification: An EFAS notification is valid for the upstream and downstream river stretch of the same name that exceeds the 50% total probability threshold (40% for informal notifications) shown in the flood probability layer in the same country.
As a consequence of the new combination method and the new notification criteria, the following EFAS products have been changed:
The definition of the new notification criteria is based on an analysis of EFASv4 discharge simulations in the period from October 2020 to June 2023 at the 1979 EFAS fixed reporting points with a minimum catchment area of 500 km². For this period, we considered EFASv4 reanalysis (water balance) as the ground truth, and we used the EFASv4 forecasts to search for the optimal notification criteria. In particular, we looked for answers to the following questions:
The analysis consisted of computing the events that would have been notified under different combinations of the notification criteria, and evaluating the skill of each combination by comparison against the flood events in the reanalysis (exceedances over the EFAS 5-year return period).
All possible combinations of the following criteria were tested:
As a measure of notification skill we used the f-score, a metric widely used for imbalanced classification tasks like the one at hand. The f-score is a combination of recall and precision, where recall is the fraction of actual events that are correctly identified, and precision is the fraction of notifications that are correct.
\text{recall} = \frac{\text{hits}}{\text{hits} + \text{misses}}
\text{precision} = \frac{\text{hits}}{\text{hits} + \text{false alarms}}
f_{\beta} = \left( 1 + \beta^2 \right) \frac{\text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}}
The beta coefficient in the f-score makes it possible to assign higher importance to either precision (beta smaller than 1) or recall (beta larger than 1). In our analysis, we used a beta of 0.8, which weights precision more heavily than recall and therefore favours criteria that produce fewer false alarms.
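The three formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the code used in the analysis: the function names, the event pairing (one boolean per candidate event, already matched between forecast and reanalysis) and the counts are all assumptions made for the example.

```python
def contingency(notified, observed):
    """Count hits, misses and false alarms from paired boolean flags.

    Assumes each position refers to the same candidate event in both lists.
    """
    hits = sum(n and o for n, o in zip(notified, observed))
    misses = sum(o and not n for n, o in zip(notified, observed))
    false_alarms = sum(n and not o for n, o in zip(notified, observed))
    return hits, misses, false_alarms


def f_beta(hits, misses, false_alarms, beta=0.8):
    """F-beta score from contingency counts; returns 0.0 when there are no hits."""
    if hits == 0:
        return 0.0
    recall = hits / (hits + misses)
    precision = hits / (hits + false_alarms)
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# Hypothetical flags, for illustration only (not EFAS results)
notified = [True, True, False, True, False]
observed = [True, False, True, True, False]
hits, misses, false_alarms = contingency(notified, observed)
score = f_beta(hits, misses, false_alarms)
```

Events that are neither notified nor observed do not enter the score, which is what makes the f-score suitable when floods are rare compared with non-events.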
Figure 1 summarizes the results of the skill assessment in one image. For each combination of NWPs, it shows how the f-score evolves with the probability threshold and the persistence criterion. As a reference, the black cross indicates the skill of the EFASv4 flood notification criteria.
Figure 1. Evolution of the notification skill with probability threshold and persistence for the different combinations of NWP.
Each plot represents a different combination of NWP. As a benchmark, the black cross represents the skill of the EFASv4 notification criteria.
This figure answers the first three questions posed at the beginning of this post:
One of the original questions remains to be answered: whether the minimum catchment area limit can be reduced. Figure 2 shows that the notification skill increases with catchment area (as expected), but it does not worsen dramatically when moving from the current 2000 km² limit to 1000 km². In fact, the skill of the new notification criteria at 1000 km² is better than that of the previous criteria at 2000 km². We expect the skill at smaller catchments to be better in EFASv5 than in EFASv4, given the increase in spatial resolution, but this can only be assessed once a sufficiently long archive of EFASv5 forecasts is available.
Figure 2. Evolution of skill with catchment area limit.
The black line represents the notification criteria in EFASv4, while the coloured lines depict the skill of the optimized criteria for the 4 combination methods.
The vertical, dotted line indicates the 2000 km² catchment area limit in EFASv4.
One last change has been introduced in the new notification criteria: a maximum lead time of 7 days. We analysed the evolution of skill with increasing lead time (Figure 3) and saw that skill degrades with lead time, as expected. Given the poor skill at lead times close to 10 days, we decided to set an upper limit on lead time of 7 days, the forecast horizon of DWD-ICON.
Figure 3. Evolution of skill with lead time.
The black line represents the notification criteria in EFASv4, while the coloured lines depict the skill of the different probability thresholds for the Brier weighted method.
Figure 1 shows that skill-weighted (or Brier weighted) is the best method to combine the NWPs into a grand ensemble and estimate the total exceedance probability. What exactly does skill-weighted mean?
We estimated the probabilistic skill of NWPs using the Brier score (BS), a squared error metric that compares the observed and predicted probabilities of exceeding a particular magnitude (in our case the EFAS 5-year return period):
\text{BS} = \frac{1}{T}\sum_{t=1}^{T} \left( P_{obs,t} - P_{pred,t} \right)^2
where $T$ is the number of time steps, $P_{obs,t}$ is the observed probability of exceedance, and $P_{pred,t}$ is the predicted probability of exceedance at a specific time step $t$.
Brier scores were computed for every NWP and lead time using the historical archive of EFAS v4 forecasts and reanalysis. The resulting Brier scores were converted into weights by inverse distance weighting:
w_{nwp,lt} = \frac{BS_{nwp,lt}^{-7}}{\sum_{i=1}^{4} BS_{i,lt}^{-7}}
where $w_{nwp,lt}$ is the weight assigned to a specific NWP and lead time, and $BS_{nwp,lt}$ is the Brier score of that NWP and lead time; the index $i$ runs from 1 to 4 because this is the number of NWPs used in EFAS.
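As a rough illustration of the two formulas above, the following Python sketch computes a Brier score and converts a set of scores into weights for a single lead time. The function names and all numbers are hypothetical, and the final step (a weighted sum of each NWP's exceedance probability) is an assumption about how the weights enter the total probability, not a description of the operational code.

```python
def brier_score(p_obs, p_pred):
    """Mean squared difference between observed and predicted exceedance probabilities."""
    if len(p_obs) != len(p_pred) or len(p_obs) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(p_obs, p_pred)) / len(p_obs)


def brier_weights(scores, power=7):
    """Weights proportional to BS**(-power), normalised to sum to 1 for one lead time.

    Assumes every score is strictly positive (a perfect score of 0 would divide by zero).
    """
    inverse = {nwp: bs ** (-power) for nwp, bs in scores.items()}
    total = sum(inverse.values())
    return {nwp: value / total for nwp, value in inverse.items()}


# Hypothetical Brier scores for one lead time (not EFAS results)
scores = {"DWD": 0.020, "HRES": 0.020, "COS": 0.025, "ENS": 0.015}
weights = brier_weights(scores)

# Assumed combination step: weighted sum of each NWP's exceedance probability
p_exceed = {"DWD": 0.2, "HRES": 1.0, "COS": 0.5, "ENS": 0.6}
total_probability = sum(weights[nwp] * p_exceed[nwp] for nwp in scores)
```

Because of the exponent of 7, small differences in Brier score translate into large differences in weight: in this made-up example, a score of 0.015 against 0.020 yields a weight roughly 7.5 times larger.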
Figure 4 shows the distribution of the Brier-score-based weights among the NWP models; these weights are used to compute the total probability in the new notification criteria. ECMWF-ENS proved to be the most skilful model, which is why the total exceedance probability relies mostly on this model, particularly as the lead time increases.
Figure 4. Distribution of weights over NWP models and lead time for the Brier weighted combination.
DWD stands for DWD-ICON, HRES for ECMWF-HRES, COS for COSMO-LEPS and ENS for ECMWF-ENS.
An assessment of the EFAS notification criteria. Presentation during the 18th EFAS Annual Meeting. September 2023, Offenbach (Germany).