Forecast Error Growth

Relationship between Forecast Range and Forecast Error

On average, forecast error grows fastest at the beginning of the forecast. At longer forecast ranges the error levels off asymptotically towards the error level of persistence forecasts, pure guesses or the difference between two randomly chosen atmospheric states (see Fig4.1.1). This saturation level is significantly higher than the average error of a simple climatological average used as a forecast. Forecast verification is discussed in the annexe.

Fig4.1.1: A schematic illustration of the forecast error development of a state-of-the-art NWP (full curve), persistence and guesses (dotted curve), whose errors converge to a higher error saturation level than modified forecasts, which converge at a lower RMSE level (dashed curve).
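
The gap between the two saturation levels can be illustrated numerically. The sketch below is not taken from any operational verification code. It assumes, purely for illustration, that atmospheric anomalies are independent draws from a normal distribution with standard deviation `sigma` (a hypothetical value), and the helper `rmse` is likewise introduced only for this example. Under that assumption the RMSE between two randomly chosen states approaches the square root of 2 times `sigma`, whereas a forecast of the climatological mean has an RMSE close to `sigma`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumption for illustration: anomalies about the climatological mean are
# independent normal draws with standard deviation sigma.
sigma = 1.0
n = 200_000
truth = rng.normal(0.0, sigma, n)          # verifying states
random_state = rng.normal(0.0, sigma, n)   # unrelated states used as "forecasts"


def rmse(forecast, observed):
    """Root mean square error between two equally sized arrays."""
    return float(np.sqrt(np.mean((forecast - observed) ** 2)))


rmse_random = rmse(random_state, truth)    # error of a randomly chosen state
rmse_climate = rmse(np.zeros(n), truth)    # error of the climatological mean (zero anomaly)
ratio = rmse_random / rmse_climate

print(f"random-state RMSE:  {rmse_random:.3f}")
print(f"climatology RMSE:   {rmse_climate:.3f}")
print(f"ratio:              {ratio:.3f}")
```

With these idealised inputs the ratio comes out close to 1.41, which is why a forecast that drifts towards an arbitrary atmospheric state saturates at a higher error level than one that relaxes towards climatology.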

Relationship between Scale and Predictive Skill

It is known from theory and synoptic experience that the larger the scale of an atmospheric system, the longer its timescale and the more predictable it normally is (see Fig4.1.2).

Fig4.1.2: A schematic illustration of the relationship between atmospheric scale and timescale. The typical predictability is currently approximately twice the timescale, but might ultimately be three times the timescale.

Small baroclinic systems or fronts are currently well forecast to around Day2, cyclonic systems to around Day4 and the long planetary waves defining weather regimes to around Day8. As models improve over time, these limits are expected to advance further ahead of the data time. Features that are coupled to the orography (e.g. lee-troughs), or to the underlying surface (e.g. heat lows), are rather less consistently well forecast. The predictable scales also show the largest consistency from one run to the next. Fig4.1.3 shows 1000hPa forecasts from six sequential runs of HRES verifying at the same time.

Fig4.1.3: A sequence of 1000hPa forecast charts ranging from T+156h to T+96h, all verifying at 00UTC 20 August 2010. The forecast details differ between the forecasts but large-scale systems (a low near Ireland, a high over central Europe, a trough over the Baltic States) are common features. The T+144h forecast from 14 August predicted a southwesterly gale over the British Isles six days later. Given the typical skill at that range, it would have been unwise to interpret the forecast in such detail; only a statement of windy, unsettled and cyclonic conditions would have been justified. Such a cautious interpretation would have avoided any embarrassing forecast “jump” when the subsequent T+132h and T+120h runs showed a weaker circulation. The same cautious approach would have minimized the forecast “jump” with the arrival of the T+108h forecast.

Small-scale details can be filtered out to highlight the predictable scale. This does not necessarily have to be done subjectively. Retaining only the first 20 spectral components filters out all scales smaller than 1000km and brings out the more predictable large-scale pattern (see Fig4.1.4).

Fig4.1.4: Same as Fig4.1.3 but based on just the 20 largest spectral components. Five of the six forecasts now show much larger coherence, with a cyclonic feature approaching the British Isles and a stationary high pressure system over central Europe.
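
The idea of spectral truncation can be sketched in a few lines. Operational spectral models represent fields with spherical harmonics. The example below instead uses a one-dimensional Fourier series around a periodic circle as a simplified stand-in, so it illustrates the principle rather than the processing used for Fig4.1.4. The function name `truncate_spectrum` and the synthetic field are assumptions made for this example.

```python
import numpy as np


def truncate_spectrum(field, n_keep):
    """Keep wavenumbers 0..n_keep of a periodic 1-D field and zero the rest."""
    coeffs = np.fft.rfft(field)
    coeffs[n_keep + 1:] = 0.0              # discard all smaller scales
    return np.fft.irfft(coeffs, n=field.size)


# Synthetic field on a periodic circle: one large-scale wave plus small-scale detail.
n_points = 360
x = np.linspace(0.0, 2.0 * np.pi, n_points, endpoint=False)
large_scale = 50.0 * np.sin(3 * x)         # wavenumber 3, retained by the filter
small_scale = 10.0 * np.sin(45 * x)        # wavenumber 45, removed by the filter
field = large_scale + small_scale

filtered = truncate_spectrum(field, n_keep=20)
print(float(np.max(np.abs(filtered - large_scale))))
```

In this idealised case the filtered field matches the large-scale wave to within rounding error, because the small-scale wave lies entirely above the truncation limit.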

A synoptic example of combining EM and probabilities

However, spectral filtering does not take into account how predictability varies due to its flow dependency; a small-scale feature near Portugal might be less predictable than an equally sized feature over Finland. The ensemble mean (EM) accounts for this flow dependency, but to avoid over-interpreting the EM, in particular underestimating the risk of extreme weather events, it should preferably be presented together with a measure of the ensemble spread or with event probabilities; these convey an impression complementary to the EM and relate naturally to it. So, for example, the EM of the MSLP (or 1000hPa) presented together with gale probabilities will put the latter into a synoptic context that will facilitate interpretation (Fig4.1.5).

Fig4.1.5: 1000 hPa forecast from 13 August 12 UTC +156 h to 16 August 00 UTC + 96 h, all valid at 20 August 00 UTC. Full lines are the 1000 hPa geopotential EM overlaid by the probabilities of wind speeds > 10 m/s. Probabilities are coloured in 20% intervals starting from 20%. Compare with Fig4.1.3 and Fig4.1.4.
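
As a minimal sketch of how the EM, the spread and an event probability relate to each other, the example below computes all three from a small array of ensemble members. The five members, the three grid points and the helper name `ensemble_summary` are hypothetical; only the 10 m/s threshold is taken from the figure.

```python
import numpy as np


def ensemble_summary(members, threshold):
    """Return ensemble mean, spread and exceedance probability.

    members: array of shape (n_members, n_points), wind speed in m/s.
    """
    mean = members.mean(axis=0)
    spread = members.std(axis=0, ddof=1)               # sample standard deviation
    probability = (members > threshold).mean(axis=0)   # fraction of members above threshold
    return mean, spread, probability


# Hypothetical 5-member ensemble at three grid points (wind speed in m/s).
members = np.array([
    [4.0,  9.0, 12.0],
    [6.0, 11.0, 14.0],
    [5.0,  8.0, 11.0],
    [7.0, 12.0, 16.0],
    [3.0, 10.0,  9.0],
])

mean, spread, prob = ensemble_summary(members, threshold=10.0)
print("EM:         ", mean)
print("spread:     ", spread)
print("P(>10 m/s): ", prob)
```

At the second grid point the EM is exactly 10 m/s and so does not exceed the threshold, yet two of the five members do, giving a probability of 40%. This is the kind of risk that the EM alone would hide.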

Fig4.1.3 and Fig4.1.4 showed an example of how filtering out the less predictable synoptic scales can increase the accuracy and reduce the jumpiness of forecasts. This filtering is much better accomplished through the ensemble in Fig4.1.5. The ENS treats the same synoptic situation in a more consistent and optimal way, because its flow dependency serves as a superior dynamic filter.

The T+12 h ensemble forecast is used here as an analysis proxy for the verification of the above forecasts (see Fig4.1.6).

Fig4.1.6: 1000 hPa EM from 19 August 12 UTC +12 h, valid at 20 August 00 UTC. It may serve as a proxy analysis for verification because of the short forecast range and because the EM, thanks to the initially anti-symmetric nature of the perturbations, is almost identical to the CTRL. The probabilities essentially show where the verifying wind speed was > 10 m/s.

It can be seen from the above that some of the HRES medium-range forecasts in Fig4.1.3 (T+96h, T+108h and perhaps T+144h) were quite good with respect to strong winds over Britain and Ireland, but at the time the ENS indicated that gale-force winds were not certain.

Model drift

To estimate and compensate for any model drift, the model output is compared with the corresponding model climate (M-climate for ENS, ER-M-climate for Extended Range ENS, S-M-climate for Seasonal forecasting) for the current forecast date. Each model climate is derived, using the same model construction as the ENS, from a number of perturbed forecasts based on calendar dates surrounding the date of the current ENS run, using historical data from several years. Systematic errors are then corrected during post-processing after the forecast is run.
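
One simple way to picture such a correction is to express the forecast as an anomaly relative to a model climate that depends on lead time. The sketch below is an assumption about the general principle, not a description of the operational post-processing: the function name `anomaly_from_model_climate`, the two re-forecast samples and the temperature values are all hypothetical.

```python
import numpy as np


def anomaly_from_model_climate(forecast, reforecasts):
    """Subtract the lead-time-dependent model climate from a forecast.

    forecast:    array of shape (n_lead,), one value per lead time.
    reforecasts: array of shape (n_samples, n_lead), historical forecasts
                 made with the same model for surrounding calendar dates.
    """
    model_climate = reforecasts.mean(axis=0)   # mean model state at each lead time
    return forecast - model_climate


# Hypothetical temperatures (K) at three lead times. Every run warms by 0.5 K
# per step, mimicking a systematic model drift.
reforecasts = np.array([
    [280.0, 280.5, 281.0],
    [282.0, 282.5, 283.0],
])
forecast = np.array([283.0, 283.5, 284.0])

anomaly = anomaly_from_model_climate(forecast, reforecasts)
print(anomaly)
```

The raw forecast warms by 1 K over the three lead times, but the anomaly stays at 2 K throughout: because the drift is present in both the forecast and the model climate, it cancels in the difference.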