Using Verification Metrics alongside the Forecast

It is useful to have some measures of the current performance and biases of the IFS. Users can assess from Reliability and ROC diagrams whether the forecast model is:

...

ECMWF provides a number of verification metrics to use in this way, such as anomaly correlation coefficients, reliability diagrams and ROC curves, which have all been computed using the re-forecasts.

Brier Score

The Brier Score (BS) measures, over a large sample, the correspondence between each forecast probability and the frequency of occurrence of the verifying observations. On average, when rain is forecast with probability p, it should occur with the same frequency p. Observed frequency is plotted against forecast probability as a graph. Perfect correspondence means the graph lies on the diagonal; the area between the graph and the diagonal measures the Brier Score. Values lie between 0 (perfect) and 1 (consistently wrong).
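
Numerically, the Brier Score is usually computed as the mean squared difference between the forecast probability and the binary outcome (1 if the event occurred, 0 if not), which gives the 0 to 1 range quoted above. The following is a minimal sketch of that standard formula, not the ECMWF verification code; the function name and the sample values are hypothetical.

```python
def brier_score(probabilities, outcomes):
    """Mean squared difference between forecast probabilities and binary outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = ((p - o) ** 2 for p, o in zip(probabilities, outcomes))
    return sum(squared_errors) / len(probabilities)


# Hypothetical sample: forecast probability of rain, and whether rain
# was observed (1) or not (0) on each occasion.
forecast_probs = [0.9, 0.1, 0.7, 0.3]
observed = [1, 0, 1, 0]
bs = brier_score(forecast_probs, observed)  # small value: forecasts were close to outcomes
```

A forecast that always gives probability 1 to events that occur and 0 to those that do not scores 0; the opposite scores 1.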

Distribution of forecast probabilities

The distribution of forecast probabilities gives an indication of the tendency of the forecast towards uncertainty. These are plotted as a histogram to give an indication of confidence in model performance:

...

However, there are few probabilities on the histogram between 0.2 and 0.9, which suggests that it would be unsafe to draw similar deductions from the Reliability diagram within this probability range. Conversely, in Fig8.3.5-2 the majority of probabilities lie between 0.2 and 0.5, and reliability within this range appears fairly good, while there is much less confidence in model performance for over- or under-forecasting an event. This is to be expected as the forecast range becomes longer.
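
The histogram of forecast probabilities can be sketched as a simple count per probability bin. This is an illustration only: the choice of ten equal-width bins, the function name and the sample values are assumptions, not taken from the ECMWF charts.

```python
def probability_histogram(probabilities, n_bins=10):
    """Count the forecasts falling in each of n_bins equal-width probability bins."""
    counts = [0] * n_bins
    for p in probabilities:
        # A probability of exactly 1.0 is placed in the last bin.
        index = min(int(p * n_bins), n_bins - 1)
        counts[index] += 1
    return counts


# Hypothetical short-range sample: most forecasts are near 0 or near 1.
sample = [0.05, 0.05, 0.12, 0.95, 0.97, 1.0, 0.45]
counts = probability_histogram(sample)
```

Bins with very few forecasts correspond to the parts of the Reliability diagram where deductions are least safe.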

The Reliability diagram

The reliability diagram gives a measure of whether the model is over-forecasting or under-forecasting an event.

...

Current Reliability Diagrams (which include the distribution of forecast probabilities) are available on Opencharts (days 4, 6, and 10 only).
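
As an illustration of how the points on a reliability diagram can be derived, the sketch below groups forecasts into probability bins and compares the mean forecast probability in each bin with the observed frequency of the event. The binning, the function name and the data are hypothetical and are not the method used to produce the Opencharts products.

```python
def reliability_curve(probabilities, outcomes, n_bins=10):
    """Return (mean forecast probability, observed frequency, count) per non-empty bin."""
    prob_sums = [0.0] * n_bins
    event_counts = [0] * n_bins
    counts = [0] * n_bins
    for p, o in zip(probabilities, outcomes):
        index = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        prob_sums[index] += p
        event_counts[index] += o
        counts[index] += 1
    return [
        (prob_sums[i] / counts[i], event_counts[i] / counts[i], counts[i])
        for i in range(n_bins)
        if counts[i] > 0
    ]


# Hypothetical sample: outcomes are 1 when the event was observed, else 0.
probs = [0.85, 0.85, 0.85, 0.85, 0.15, 0.15, 0.15, 0.15]
events = [1, 1, 0, 0, 0, 0, 0, 1]
curve = reliability_curve(probs, events)
```

In this made-up sample the 0.85 forecasts verify only half of the time, so that point would lie below the diagonal (over-forecasting). The count returned for each bin indicates how much weight the point deserves.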

The ROC diagram

The ROC diagram gives a measure of the capacity to discriminate when events are more likely to happen.

...

Current ROC Diagrams are available on Opencharts (for day5 onwards).
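
The points on a ROC diagram can be sketched by converting the probability forecasts into yes/no forecasts at a series of probability thresholds and computing the Hit Rate and False Alarm Rate at each. The function names, thresholds and data below are hypothetical; this is a sketch of the general technique, not the ECMWF implementation.

```python
def roc_points(probabilities, outcomes, thresholds):
    """Return (false alarm rate, hit rate) for each probability threshold."""
    n_events = sum(outcomes)
    n_non_events = len(outcomes) - n_events
    if n_events == 0 or n_non_events == 0:
        raise ValueError("sample must contain both events and non-events")
    points = []
    for t in thresholds:
        pairs = list(zip(probabilities, outcomes))
        hits = sum(1 for p, o in pairs if p >= t and o == 1)
        false_alarms = sum(1 for p, o in pairs if p >= t and o == 0)
        points.append((false_alarms / n_non_events, hits / n_events))
    return points


def roc_area(points):
    """Trapezoidal area under the curve, after adding the (0, 0) and (1, 1) end points."""
    curve = sorted(set(points) | {(0.0, 0.0), (1.0, 1.0)})
    return sum((x1 - x0) * (y0 + y1) / 2 for (x0, y0), (x1, y1) in zip(curve, curve[1:]))


# Hypothetical sample of six forecasts and verifying outcomes.
probs = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
events = [1, 1, 0, 1, 0, 0]
points = roc_points(probs, events, thresholds=[0.75, 0.5, 0.25])
area = roc_area(points)
```

A curve lying along the diagonal (Hit Rate equal to False Alarm Rate) gives an area of 0.5 and indicates no ability to discriminate; larger areas indicate better discrimination.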

Fig8.3.5-1: Reliability Diagram (left) and ROC diagram (right) regarding lower tercile for T2m in Europe area for week1 (day5-11), DT:20 Jun 2019.

Fig8.3.5-2: Reliability Diagram (left) and ROC diagram (right) regarding lower tercile for T2m in Europe area for week5 (day19-32), DT:20 Jun 2019.

...

BrSc = Brier Score (BS), LCBrSkSc = Brier Skill Score (BSS).
BS_REL = Forecast reliability and BS_RSL = Forecast resolution with respect to observations.
BSS_RSL = Forecast resolution and BSS_REL = Forecast reliability with respect to climatology.
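
The exact definitions behind the labels above are not given here. As a hedged illustration, the sketch below assumes the standard decomposition of the Brier Score into reliability, resolution and uncertainty terms (BS = reliability - resolution + uncertainty), and a Brier Skill Score measured against a constant forecast of the sample climatological frequency. The function names and data are hypothetical.

```python
from collections import defaultdict


def brier_decomposition(probabilities, outcomes):
    """Return (reliability, resolution, uncertainty), grouping by distinct forecast probability."""
    n = len(probabilities)
    base_rate = sum(outcomes) / n  # sample climatological frequency of the event
    groups = defaultdict(list)
    for p, o in zip(probabilities, outcomes):
        groups[p].append(o)
    reliability = sum(
        len(obs) * (p - sum(obs) / len(obs)) ** 2 for p, obs in groups.items()
    ) / n
    resolution = sum(
        len(obs) * (sum(obs) / len(obs) - base_rate) ** 2 for obs in groups.values()
    ) / n
    uncertainty = base_rate * (1 - base_rate)
    return reliability, resolution, uncertainty


def brier_skill_score(probabilities, outcomes):
    """Skill relative to always forecasting the sample climatological frequency."""
    reliability, resolution, uncertainty = brier_decomposition(probabilities, outcomes)
    if uncertainty == 0:
        raise ValueError("event always or never occurred; skill score is undefined")
    return (resolution - reliability) / uncertainty


# Hypothetical sample: five forecasts of 0.8 and five forecasts of 0.2.
probs = [0.8] * 5 + [0.2] * 5
events = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
rel, res, unc = brier_decomposition(probs, events)
bss = brier_skill_score(probs, events)
```

Under this assumption a small reliability term and a large resolution term are both desirable, and a positive skill score means the forecast beats climatology for the sample.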

Fig8.3.5-3: Example of Reliability Diagrams from Opencharts. Total 24h precipitation at Day6, assessed from ensemble probability forecasts during a three-month period and compared with climatology from the same period. The traces show the comparison of forecast probabilities against observed occurrences for 24h precipitation totals of >1mm, >5mm, >10mm, >20mm. Ideally the traces should lie along the dashed blue line (i.e. the ensemble probability forecast should agree with the observed frequency). The diagram shows:

reasonably good forecasting at low ensemble probabilities:
- e.g. ensemble 20% probability occurred 20% of the time for each group.
over-forecasting at higher ensemble probabilities:
- e.g. ensemble 90% probability of >1mm/24h actually occurred only 60% of the time - the wide distribution of forecast probabilities suggests some confidence in the Reliability trace.
- e.g. ensemble 90% probability of >20mm/24h actually occurred 80% of the time - but the very few forecasts of high probabilities suggest very low confidence in the corresponding implied reliabilities.

Fig8.3.5-4: Example of Reliability Diagrams from Opencharts. Temperature anomaly at Day4, assessed from ensemble probability forecasts during a three-month period and compared with climatology from the same period. The traces show the comparison of forecast probabilities of anomalies against observed occurrences of anomalies for 2m temperature of >8°C below, >4°C below, >4°C above, >8°C above climatology. Ideally the traces should lie along the dashed blue line (i.e. the ensemble forecast probability should agree with the observed frequency). The diagram shows:

...

- for >4°C above climatology ensemble 90% probability actually occurred 85% of the time - the wide distribution of forecast probabilities suggests moderate confidence in the implied reliability.
- for >8°C below climatology ensemble 90% probability actually occurred 85% of the time - but the very few forecasts of high probabilities suggest very low confidence in the implied reliability.

Fig8.3.5-5: Example reliability diagrams for 2m temperature based on July starts of the seasonal forecasts for months 4-6.

left for the tropics - a slight tendency towards over-confidence, especially when forecasting that the event (warm anomalies) will happen.
right for Europe - a tendency towards over-confidence, though the sample size for high-confidence forecasts is small, making the plot noisy.

Fig8.3.5-6: Example reliability diagrams for rain based on July starts of the seasonal forecasts for months 4-6:

left for the tropics - a tendency towards over-confidence.
right for Europe - forecast not reliable at all. Thus it should not be used, unless there are exceptional circumstances that warrant an expectation of skill that is ordinarily not there.

Fig8.3.5-7: Example ROC diagrams for Europe based on July starts of the seasonal forecasts for months 4-6:

the left diagram is for 2m temperatures in the upper tercile. The Hit Rate is slightly higher than the False Alarm Rate, indicating that the forecast system has some limited ability to discriminate occasions when warm events are likely from occasions when they are not.
the right diagram is for precipitation in the upper tercile. The Hit Rate and False Alarm Rate are similar throughout, indicating that the seasonal forecast system has no ability to distinguish occasions when it will be wet from occasions when it will not.

Anomaly Correlation

Anomaly Correlation Coefficient (ACC) charts give an assessment of the skill of the forecast. They show the correlation at all geographical locations in map form.

...

Locations with correlation significantly (95% confidence level) different from zero are highlighted by dots.
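
For a single location, the anomaly correlation can be sketched as the correlation between forecast and observed departures from their respective climatologies. In this illustration each climatology is simply assumed to be the mean of the sample (for example, the same calendar period over a set of re-forecast years); the function name and the values are hypothetical, and the significance test mentioned above is not reproduced.

```python
import math


def anomaly_correlation(forecasts, observations):
    """Correlation between forecast and observed anomalies at one location."""
    if len(forecasts) != len(observations) or len(forecasts) < 2:
        raise ValueError("need two sequences of equal length with at least two values")
    # Assumption: each climatology is the mean of its own sample.
    forecast_climate = sum(forecasts) / len(forecasts)
    observed_climate = sum(observations) / len(observations)
    f_anom = [f - forecast_climate for f in forecasts]
    o_anom = [o - observed_climate for o in observations]
    numerator = sum(f * o for f, o in zip(f_anom, o_anom))
    denominator = math.sqrt(sum(f * f for f in f_anom) * sum(o * o for o in o_anom))
    if denominator == 0:
        raise ValueError("anomalies have zero variance")
    return numerator / denominator


# Hypothetical seasonal-mean 2m temperatures (°C) for four years at one grid point.
forecast_t2m = [14.0, 15.0, 16.0, 17.0]
observed_t2m = [13.5, 15.5, 15.5, 17.5]
acc = anomaly_correlation(forecast_t2m, observed_t2m)
```

Repeating the calculation at every grid point would give a field of correlations that can be drawn as a map. Values near 1 indicate that forecast anomalies follow the observed anomalies closely, and values near 0 indicate little or no skill.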

Fig8.3.5-8: Anomaly Correlation Coefficient for 2m temperature for months 2-4 based on November runs of the seasonal model. On the chart:

...
