The EFAS v5.2 upgrade revises the flood notification criteria, based on an analysis of EFASv4 simulations. The main result of that analysis is a new methodology for combining the EFAS forecasts forced by the four Numerical Weather Prediction (NWP) models into a grand ensemble.
Detailed description of the new notification criteria.
The new formal flood notification criteria are:
An informal flood notification will be issued when any of the criteria above is not met, but the forecaster deems that the authorities should be informed. To issue an informal notification, the exceedance probability must be at least 40% and the catchment area at least 500 km².
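The two numerical conditions for an informal notification can be written as a simple check. This is only an illustrative sketch: the function name and argument names are hypothetical, and the forecaster's judgement, which is the deciding factor, is deliberately left outside the code.

```python
def informal_notification_allowed(exceedance_probability: float,
                                  catchment_area_km2: float) -> bool:
    """Return True if the minimum requirements for an informal notification hold.

    exceedance_probability is expressed as a fraction (0.40 means 40 %).
    The forecaster's decision to inform the authorities is NOT modelled here.
    """
    return exceedance_probability >= 0.40 and catchment_area_km2 >= 500.0


# Hypothetical cases
print(informal_notification_allowed(0.45, 800.0))
print(informal_notification_allowed(0.45, 300.0))
```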
As a consequence of the new combination method and the new notification criteria, the following EFAS products have been changed:
The definition of the new notification criteria is based on an analysis of EFASv4 discharge simulations in the period from October 2020 to June 2023 at the 1979 EFAS fixed reporting points with a minimum catchment area of 500 km². For this period, we considered EFASv4 reanalysis (water balance) as the ground truth, and we used the EFASv4 forecasts to search for the optimal notification criteria. In particular, we looked for answers to the following questions:
The analysis consisted of computing which events would have been notified under different combinations of the notification criteria, and evaluating the skill of each combination by comparison against the flood events in the reanalysis (exceedances over the EFAS 5-year return period).
All possible combinations of the following criteria were tested:
As a measure of notification skill we used the f-score, a metric widely used in machine learning for unbalanced classification tasks like the one at hand. The f-score combines recall and precision, where recall is the fraction of actual events that were correctly notified, and precision is the fraction of notifications that were correct.
$$\text{recall} = \frac{\text{hits}}{\text{hits} + \text{misses}}$$

$$\text{precision} = \frac{\text{hits}}{\text{hits} + \text{false alarms}}$$

$$f_{\beta} = \left( 1 + \beta^2 \right) \frac{\text{precision} \cdot \text{recall}}{\beta^2 \cdot \text{precision} + \text{recall}}$$
The beta coefficient in the f-score makes it possible to give more weight to either precision (beta smaller than 1) or recall (beta larger than 1). In our analysis we used a value of 0.8, which favours precision and therefore penalises false alarms more heavily than misses.
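As an illustration of the definitions above, the following minimal sketch counts hits, misses and false alarms from two boolean series and computes the f-score with beta = 0.8. The function names and the toy data are hypothetical and are not taken from the EFAS analysis code.

```python
import numpy as np


def contingency(notified, observed):
    """Count hits, misses and false alarms from two boolean series."""
    notified = np.asarray(notified, dtype=bool)
    observed = np.asarray(observed, dtype=bool)
    hits = int(np.sum(notified & observed))
    misses = int(np.sum(~notified & observed))
    false_alarms = int(np.sum(notified & ~observed))
    return hits, misses, false_alarms


def f_beta(hits, misses, false_alarms, beta=0.8):
    """f-score from the contingency counts; 0.0 when there are no hits."""
    if hits == 0:
        return 0.0
    recall = hits / (hits + misses)
    precision = hits / (hits + false_alarms)
    return (1 + beta**2) * precision * recall / (beta**2 * precision + recall)


# Hypothetical toy data: one entry per candidate event
notified = [True, True, False, True, False, False]
observed = [True, False, True, True, False, False]

hits, misses, false_alarms = contingency(notified, observed)
score = f_beta(hits, misses, false_alarms)
print(hits, misses, false_alarms, round(score, 3))
```

With beta below 1, a set of criteria that trades a few misses for fewer false alarms scores higher than one that does the opposite.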
Figure 1 summarizes the results of the skill assessment in a single image. It shows, for each combination of NWPs, how the f-score evolves with the probability threshold and the persistence criterion. As a reference, the black cross indicates the skill of the EFASv4 flood notification criteria.
Figure 1. Evolution of the notification skill with probability threshold and persistence for the different combinations of NWP.
Each plot represents a different combination of NWP. As a benchmark, the black cross represents the skill of the EFASv4 notification criteria.
This figure answers the first three questions posed at the beginning of this post:
One of the original questions remains to be answered: whether the minimum catchment area limit can be reduced. Figure 2 shows that the notification skill increases with catchment area (as expected), but it does not worsen dramatically when moving from the current 2000 km² limit to 1000 km². In fact, the skill of the new notification criteria at 1000 km² is better than that of the previous criteria at 2000 km². We expect the skill of EFASv5 in smaller catchments to be better than that of EFASv4, given the increase in spatial resolution, but this can only be assessed once a sufficiently long archive of EFASv5 forecasts is available.
Figure 2. Evolution of skill with catchment area limit.
The black line represents the notification criteria in EFASv4, while the coloured lines depict the skill of the optimized criteria for the four combination methods.
The vertical, dotted line indicates the 2000 km² catchment area limit in EFASv4.
One last change has been introduced in the new notification criteria: a maximum lead time of 7 days. We analysed the evolution of skill with increasing lead time (Figure 3) and saw that skill degrades with lead time, as expected. Given the poor skill at lead times close to 10 days, we decided to set the upper limit on lead time at 7 days, the forecast horizon of DWD-ICON.
Figure 3. Evolution of skill with lead time.
The black line represents the notification criteria in EFASv4, while the coloured lines depict the skill of the different probability thresholds and the Brier weighted method.
Figure 1 shows that the skill-weighted (or Brier weighted) method is the best way to combine the NWPs into a grand ensemble and estimate the total exceedance probability. What exactly does skill-weighted mean?
We estimated the probabilistic skill of the NWPs using the Brier score (BS), a squared-error metric that compares the observed and predicted probabilities of exceeding a particular magnitude (in our case the EFAS 5-year return period):
$$\text{BS} = \frac{1}{T}\sum_{t=1}^{T} \left( P_{obs,t} - P_{pred,t} \right)^2$$

where $T$ is the number of time steps, $P_{obs,t}$ is the observed probability of exceedance, and $P_{pred,t}$ is the predicted probability of exceedance at a specific time step $t$. Brier scores were computed for every NWP and lead time using the historical archive of EFAS v4 forecasts and reanalysis. The resulting Brier scores were converted into weights by inverse distance weighting:
$$w_{nwp,lt} = \frac{BS_{nwp,lt}^{-p}}{\sum_{i=1}^{4} BS_{i,lt}^{-p}}$$

where $w_{nwp,lt}$ is the weight assigned to a specific NWP and lead time, $BS_{nwp,lt}$ is the Brier score of that NWP at that lead time, and $p$ is the exponent of the inverse weighting.
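The two steps described above (Brier score per NWP and lead time, then inverse weighting) can be sketched as follows. This is a minimal illustration, not the EFAS implementation: the function names and the Brier score values are hypothetical, and the sketch assumes the same exponent `p` in the numerator and the denominator so that the weights at each lead time add up to 1. The value `p=1.0` is a placeholder, since the exponent actually used is not stated here.

```python
import numpy as np


def brier_score(p_obs, p_pred):
    """Mean squared difference between observed and predicted probabilities."""
    p_obs = np.asarray(p_obs, dtype=float)
    p_pred = np.asarray(p_pred, dtype=float)
    return float(np.mean((p_obs - p_pred) ** 2))


def brier_weights(bs, p=1.0):
    """Convert Brier scores into weights by inverse weighting.

    bs has shape (n_nwp, n_lead_times). Weights are normalised over the
    NWP axis, so each lead time gets its own set of weights.
    A Brier score of exactly 0 would cause a division by zero.
    """
    inverse = np.asarray(bs, dtype=float) ** (-p)
    return inverse / inverse.sum(axis=0, keepdims=True)


# Hypothetical Brier scores: 4 NWPs (rows) x 2 lead times (columns)
bs = np.array([[0.10, 0.20],
               [0.20, 0.20],
               [0.20, 0.40],
               [0.05, 0.10]])

weights = brier_weights(bs, p=1.0)
print(weights)
```

A lower Brier score means better skill, so the model in the last row receives the largest weight at both lead times.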
Figure 4 shows the distribution of the Brier-score-based weights among the NWP models; these weights are used to compute the total probability in the new notification criteria. ECMWF-ENS proved to be the most skilful model, which is why the total exceedance probability relies mostly on this model, particularly as the lead time increases.
Figure 4. Distribution of weights over NWP models and lead time for the Brier weighted combination.
DWD stands for DWD-ICON, HRES for ECMWF-HRES, COS for COSMO-LEPS and ENS for ECMWF-ENS.
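To make the role of the weights concrete, the sketch below combines per-model exceedance probabilities into a total probability for a single lead time. It rests on two assumptions that are not spelled out in this page: that the total probability is the weighted sum of the per-model probabilities, and that each model's probability is the fraction of its members exceeding the threshold (0 or 1 for a deterministic model). The function name and all numbers are hypothetical and are not read from Figure 4.

```python
import numpy as np


def total_exceedance_probability(prob_by_nwp, weights):
    """Weighted sum of per-NWP exceedance probabilities at one lead time.

    Assumes the weights are non-negative and add up to 1.
    """
    prob = np.asarray(prob_by_nwp, dtype=float)
    w = np.asarray(weights, dtype=float)
    if not np.isclose(w.sum(), 1.0):
        raise ValueError("weights must add up to 1")
    return float(np.sum(w * prob))


# Order: DWD, HRES, COS, ENS (illustrative values only)
prob_by_nwp = [1.0, 0.0, 0.5, 0.6]
weights = [0.1, 0.1, 0.2, 0.6]

total = total_exceedance_probability(prob_by_nwp, weights)
print(round(total, 2))
```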
An assessment of the EFAS notification criteria. Presentation during the 18th EFAS Annual Meeting. September 2023, Offenbach (Germany).