
Using Verification Metrics alongside the Forecast

It is useful to have some measures of the current performance and biases of the IFS.  From Reliability and ROC diagrams, users can assess whether the forecast model is:

...

ECMWF provides a number of verification metrics to use in this way, such as anomaly correlation coefficients, reliability diagrams and ROC curves, which have all been computed using the re-forecasts.


Brier Score

The Brier Score (BS) is a measure, over a large sample, of the correspondence between each forecast probability and the frequency of occurrence of the verifying observations (e.g. on average, when rain is forecast with probability p, it should occur with the same frequency p).  Observation frequency is plotted against forecast probability as a graph.  A perfect correspondence means the graph will lie upon the diagonal; the further the graph departs from the diagonal, the poorer the score.  Values lie between 0 (perfect) and 1 (consistently wrong).
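Numerically, the score is the mean squared difference between each forecast probability and the corresponding outcome (1 if the event occurred, 0 if not).  A minimal sketch with made-up sample values, not ECMWF code:

```python
import numpy as np

# Made-up sample: forecast probabilities and verifying observations
# (1 where the event occurred, 0 where it did not).
probs = np.array([0.1, 0.9, 0.8, 0.3, 0.0, 0.7])
obs = np.array([0, 1, 1, 0, 0, 1])

# Brier Score: mean squared difference between probability and outcome.
# 0 is a perfect score; 1 means the forecasts were consistently wrong.
brier = np.mean((probs - obs) ** 2)
print(round(brier, 4))  # 0.04
```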

Distribution of forecast probabilities

The distribution of forecast probabilities gives an indication of the tendency of the forecast towards uncertainty.  The probabilities are plotted as a histogram to give an indication of confidence in model performance:

...

Note that where there are only a few entries for a given probability on the histogram, confidence in the Reliability diagram is reduced for that probability.  Thus in Fig8.4-1 the predominance of probabilities below 0.2 and above 0.9 suggests there can be some confidence that, when predicting lower-tercile climatological temperatures at 2m, the IFS tends to be over-confident that the event will occur and under-confident that it won't.  However, there are few probabilities on the histogram between 0.2 and 0.9, which suggests it would be unsafe to draw similar deductions from the Reliability diagram within this probability range.  Conversely, in Fig8.4-2 the majority of probabilities lie between 0.2 and 0.5, and reliability within this range appears fairly good, while there is much less confidence in model performance for over- or under-forecasting an event.  This is as expected as the forecast range becomes longer.
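In an ensemble system the forecast probability for each case is the fraction of members predicting the event, and the histogram is simply the distribution of those fractions over many cases.  A sketch with synthetic data (the case and member counts are arbitrary assumptions, not IFS output):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for 200 cases of a 51-member ensemble: True where a
# member forecasts the event (e.g. 2m temperature in the lower tercile).
members_predict = rng.random((200, 51)) < 0.3

# Forecast probability per case = fraction of members predicting the event
probs = members_predict.mean(axis=1)

# Distribution of forecast probabilities in ten equal bins from 0 to 1
counts, edges = np.histogram(probs, bins=10, range=(0.0, 1.0))
```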

The Reliability diagram

The reliability diagram gives a measure of the tendency of the model to over- or under-forecast an event.

...
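The diagram can be constructed by binning the forecast probabilities and, for each bin, plotting the observed frequency of the event against the mean forecast probability.  A hedged sketch (the binning scheme and toy data are illustrative, not ECMWF's implementation):

```python
import numpy as np

def reliability_curve(probs, obs, n_bins=10):
    """Observed frequency versus mean forecast probability per bin:
    the points of a reliability diagram."""
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    bin_idx = np.clip(np.digitize(probs, edges) - 1, 0, n_bins - 1)
    mean_prob, obs_freq = [], []
    for b in range(n_bins):
        in_bin = bin_idx == b
        if in_bin.any():
            mean_prob.append(probs[in_bin].mean())
            obs_freq.append(obs[in_bin].mean())
    return np.array(mean_prob), np.array(obs_freq)

# Perfectly reliable toy sample: events occur with the forecast frequency,
# so the plotted points lie on the diagonal.
probs = np.array([0.1] * 10 + [0.9] * 10)
obs = np.array([1] + [0] * 9 + [1] * 9 + [0])
mean_prob, obs_freq = reliability_curve(probs, obs)
```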

Current Reliability Diagrams (which include the distribution of forecast probabilities) are available on Opencharts (days 4, 6, and 10 only).

The ROC diagram

The ROC diagram gives a measure of the capacity to discriminate when events are more likely to happen.  It shows the effectiveness of the IFS in forecasting an event that actually happens (Probability of Detection or Hit Rate) while balancing this against the undesirable cases of predicting an event that fails to occur (False Alarm Rate).  The effectiveness is also known as the 'resolution' of the forecasting system (not to be confused with spatial and temporal resolution).

...

  • left for temperatures in the upper tercile - the Hit Rate is slightly better than the False Alarm Rate indicating that the forecast system has some limited ability to discriminate occasions when warm events are likely from occasions when they are not. 
  • right for precipitation in the upper tercile - the Hit Rate and False Alarm Rate are similar throughout indicating that the seasonal forecast system has no ability to distinguish occasions when it will be wet from occasions when it will not.
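The curve can be traced by issuing a warning whenever the forecast probability exceeds a threshold, then computing the Hit Rate and False Alarm Rate at each threshold.  A sketch under those assumptions (toy data, not IFS output):

```python
import numpy as np

def roc_points(probs, obs, thresholds):
    """Hit Rate and False Alarm Rate at each probability threshold;
    plotting Hit Rate against False Alarm Rate traces the ROC curve."""
    hit_rates, false_alarm_rates = [], []
    for t in thresholds:
        warned = probs >= t                        # event forecast at this threshold
        hits = np.sum(warned & (obs == 1))         # forecast and occurred
        misses = np.sum(~warned & (obs == 1))      # occurred but not forecast
        false_alarms = np.sum(warned & (obs == 0)) # forecast but did not occur
        correct_negs = np.sum(~warned & (obs == 0))
        hit_rates.append(hits / (hits + misses))
        false_alarm_rates.append(false_alarms / (false_alarms + correct_negs))
    return np.array(hit_rates), np.array(false_alarm_rates)

# Toy example: a skilful forecast keeps the Hit Rate above the False Alarm Rate
probs = np.array([0.9, 0.8, 0.6, 0.4, 0.2, 0.1])
obs = np.array([1, 1, 1, 0, 0, 0])
hr, far = roc_points(probs, obs, thresholds=[0.5])
```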

Anomaly Correlation

Anomaly Correlation Coefficient (ACC) charts give an assessment of the skill of the forecast.  They show, in map form, the correlation between forecast and analysed anomalies at all geographical locations.
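The coefficient is the correlation between forecast and analysed anomalies, both taken relative to climatology.  A minimal sketch (the function name and toy values are illustrative, not ECMWF's implementation):

```python
import numpy as np

def anomaly_correlation(forecast, analysis, climatology):
    """ACC between forecast and verifying analysis, with anomalies
    taken relative to climatology."""
    f_anom = forecast - climatology
    a_anom = analysis - climatology
    return np.sum(f_anom * a_anom) / np.sqrt(
        np.sum(f_anom ** 2) * np.sum(a_anom ** 2)
    )

# A forecast whose anomalies match the analysed anomalies scores ACC = 1
clim = np.array([10.0, 12.0, 14.0])
anom = np.array([1.5, -0.5, 2.0])
acc = anomaly_correlation(clim + anom, clim + anom, clim)
```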

...