...

I have read that "ERA5 includes an uncertainty estimate that provides guidance on where products are more/less accurate."[1]  What does this mean? What exactly are uncertainties when using ERA5?

ERA5 uses weather observations wherever they are available. On top of these observations, ERA5 uses a weather forecasting model to produce spatially and temporally continuous data. As with a weather forecast, the resulting data contain some uncertainty.

The ERA5 uncertainty estimate helps users understand the relative accuracy of the ERA5 system, i.e. to identify areas and periods where the products are thought to be more or less reliable, although the uncertainty values provided by the Ensemble of Data Assimilations (EDA) system should not be taken at face value. The EDA system addresses uncertainties related to the observing system, sea surface temperature and the model (through its physical parametrizations).

...

No, do not take the uncertainty values at face value. The EDA-based uncertainties are nevertheless valuable as a relative estimate of how the uncertainty is distributed in space and time. In other words, the EDA can be used to get an idea of the areas and periods in which ERA5 is more, or less, reliable.

How is the uncertainty estimate obtained? Which sources of uncertainty does it account for and which does it omit?

The uncertainty estimate for ERA5 is obtained from the Ensemble of Data Assimilations (EDA) system. The EDA addresses some of the uncertainties of the model and data assimilation system, but not all of them. It accounts for uncertainties in observations, sea surface temperature (SST) and model physical parametrizations. Other uncertainties are not accounted for, such as uncertainties in radiative forcing due to greenhouse gases, or systematic errors in the model or in the way observations are used.

How reliable is the ERA5 uncertainty estimate?

The reliability of an ensemble system can be measured using spread-skill (reliability) diagnostics, which describe how well the spread of the ensemble matches the skill of the system. In the ideal case the ensemble spread fully matches the model skill, so the reliability diagram is a diagonal line. The reliability of the EDA system differs between variables, levels and reanalysis periods. Generally speaking, the EDA system is rather reliable (though generally under-dispersive, i.e. the spread is lower than the skill) and carries information about the uncertainty of ERA5. A typical example of reliability diagnostics for surface pressure in the spring season, for various reanalysis periods, can be seen here:

Image Added
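The spread-skill idea described above can be illustrated with a minimal Python sketch. It uses purely synthetic numbers, not ERA5 or EDA data: the function names, the inflation factor and the way cases are binned are all assumptions made for this illustration, and skill is represented here by the root-mean-square error of the ensemble mean.

```python
import numpy as np

def spread_skill(members, truth, n_bins=5):
    """Return (rms_spread, rmse) per bin, with cases binned by ensemble spread.

    members: array of shape (n_cases, n_members); truth: array of shape (n_cases,).
    """
    ens_mean = members.mean(axis=1)
    ens_var = members.var(axis=1, ddof=1)   # unbiased ensemble variance per case
    sq_err = (ens_mean - truth) ** 2        # squared error of the ensemble mean
    order = np.argsort(ens_var)             # sort cases from small to large spread
    bins = np.array_split(order, n_bins)
    rms_spread = np.array([np.sqrt(ens_var[b].mean()) for b in bins])
    rmse = np.array([np.sqrt(sq_err[b].mean()) for b in bins])
    return rms_spread, rmse

def synthetic_cases(inflation, n_cases=20000, n_members=10, seed=0):
    """Hypothetical test data: inflation=1 is reliable, inflation<1 is under-dispersive."""
    rng = np.random.default_rng(seed)
    sigma = rng.uniform(0.5, 2.0, size=n_cases)   # assumed true error std per case
    truth = sigma * rng.standard_normal(n_cases)
    members = inflation * sigma[:, None] * rng.standard_normal((n_cases, n_members))
    return members, truth

for label, inflation in [("reliable", 1.0), ("under-dispersive", 0.7)]:
    members, truth = synthetic_cases(inflation)
    spread_all, rmse_all = spread_skill(members, truth, n_bins=1)
    spread_bin, rmse_bin = spread_skill(members, truth, n_bins=5)
    print(label, "overall spread/error:", np.round(spread_all[0] / rmse_all[0], 2))
    print(label, "per-bin spread/error:", np.round(spread_bin / rmse_bin, 2))
```

In this toy setup the overall spread-to-error ratio should come out slightly below 1 in the reliable case (a finite-ensemble effect) and clearly lower in the under-dispersive case. The per-bin values are additionally affected by sampling noise in the 10-member spread, so they should be read qualitatively.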

Does the uncertainty also account for systematic errors in ERA5, or only for random errors?

The uncertainty estimates mostly account for random errors and not for systematic ones. The exception is the set of perturbations applied to sea surface temperature (SST), which do incorporate estimates of systematic error. For other observations and for the physical parametrisations of the model, only random errors are accounted for. Therefore, one limitation of the uncertainty estimate is that systematic errors are not well addressed.
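A small synthetic example may help to show why an ensemble spread says little about systematic errors. In the Python sketch below, all members share the same bias and differ only by random perturbations. The bias and noise amplitudes are arbitrary assumptions chosen for illustration, not values taken from ERA5 or the EDA.

```python
import numpy as np

def spread_and_rmse(bias, noise_std, n_cases=10000, n_members=10, seed=1):
    """Return (rms spread, RMSE of ensemble mean) for a toy ensemble.

    Every member equals truth + bias + random noise, so the bias is
    common to all members (a systematic error) and only the noise differs.
    """
    rng = np.random.default_rng(seed)
    truth = np.zeros(n_cases)
    noise = noise_std * rng.standard_normal((n_cases, n_members))
    members = truth[:, None] + bias + noise
    ens_mean = members.mean(axis=1)
    ens_var = members.var(axis=1, ddof=1)
    rms_spread = np.sqrt(ens_var.mean())
    rmse = np.sqrt(np.mean((ens_mean - truth) ** 2))
    return rms_spread, rmse

# Hypothetical amplitudes: a shared bias of 1.5 and random noise of 0.5.
for bias in (0.0, 1.5):
    rms_spread, rmse = spread_and_rmse(bias=bias, noise_std=0.5)
    print(f"bias={bias}: spread={rms_spread:.2f}, rmse={rmse:.2f}")
```

The spread depends only on the random part and is the same with or without the bias, whereas the error of the ensemble mean grows with the bias. The same reasoning applies to the cyclone case discussed further down, where all members overestimate the central pressure.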

Could you outline, in a nutshell, the strengths and weaknesses of the uncertainty estimate?

The main importance of the uncertainty estimate for ERA5 is that it provides added value to the ERA5 reanalysis product. It is based on physical considerations using an Ensemble of Data Assimilations (EDA) system. The EDA system addresses uncertainties in the ERA5 assimilation and modelling system, which are quantified by a 10-member ensemble. The EDA is able to indicate where ERA5 is more and where it is less accurate (for instance due to changes in observation coverage). The weakness is that the EDA does not account for all sources of uncertainty (such as systematic errors or correlated errors) and that the EDA has lower spatial and temporal resolution than ERA5 itself. The latter means that it is not always easy to find a direct correspondence between the ERA5 reanalysis variables and the EDA uncertainty characteristics.

Where can I find monthly-mean values for uncertainty?

...

Seasonal spread charts are computed for the beginning of each ERA5 decade and for the latest period (they are available on request from the ERA5 team). These charts give an idea of the level of uncertainty for different seasons, regions, periods, levels and variables. For instance, for the summer season of 1980:

For 200 hPa zonal wind, the largest uncertainties are in the tropical regions.

For 850 hPa temperature, the uncertainties are generally larger in the Southern Hemisphere (this corresponds well with the fact that there are fewer observations in the Southern Hemisphere).

For MSLP the Antarctic region has the largest spread/uncertainty.

Image Added

Image Added

Image Added

For all variables it is clear that the uncertainties are decreasing with time, i.e. the spread values are smaller for recent periods than for older ones.

...

One of the most important factors determining the ERA5 uncertainties is the amount and quality of the available observations. The Global Observing System (GOS) has evolved during the ERA5 period, which means that the number of observations generally increases with time and, as a result, the uncertainties decrease. However, there are some short periods when fewer observations are available. Typically, in the 1980s, when the number of satellite observations was still quite low, there are some short periods when missing observations cause an increase in the uncertainty, i.e. an increase of the ensemble spread. The evolution of the mean spread for vorticity and temperature, for 3 different model levels, demonstrates this:

vorticity

temperature

Image Added

Image Added


It can be seen that the spread (uncertainty) generally decreases steadily over time, except for some jumps in the early periods. These jumps correspond to blips in the number of observations. For instance, at the end of 1979 there were some shorter periods when the MSU and SSU instruments onboard the TIROS-N and NOAA-9 satellites were providing significantly fewer observations than normal. Note that the vast observing system of the present day has a degree of resilience, which means that the assimilation system is much less sensitive to the failure of one instrument or satellite.

...

Indeed, the instantaneous spread fields can be noisy at particular locations, especially in the early reanalysis periods. For instance, at 00 UTC on 19800301 the MSLP spread is noisy over the Antarctic, the 850 hPa temperature spread is particularly noisy in the Southern Hemisphere and the 200 hPa zonal wind spread is noisy in the tropical region:

MSLP spread

850 hPa temperature spread

200 hPa zonal wind spread
Image Added


Image Added

Image Added

The main reason for this is the limited ensemble size of 10 members, which introduces considerable sampling noise. On the other hand, if we consider the mean seasonal (JJA 1980 in this case) spread for the three variables (MSLP, 850 hPa temperature and 200 hPa zonal wind), the fields are much smoother and easier to interpret. For seasonal mean fields, the sampling noise is averaged out, which results in smoother spread fields:

MSLP (mean seasonal, JJA)

850 hPa temperature (mean seasonal, JJA)

200 hPa zonal wind (mean seasonal, JJA)

Image Added

Image Added

Image Added
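The effect of the small ensemble size on instantaneous spread fields, and the smoothing obtained by seasonal averaging, can be sketched with synthetic data. The Python example below assumes a uniform true spread of 1 everywhere, 10 members, 92 daily fields (roughly one JJA season) and independence between days. These are simplifying assumptions for illustration only; real fields are correlated in space and time.

```python
import numpy as np

rng = np.random.default_rng(2)

n_times, n_members, n_points = 92, 10, 500   # assumed sizes, not ERA5 values
true_std = 1.0                               # assumed uniform true spread

# Synthetic members: independent draws at every time, member and grid point.
members = true_std * rng.standard_normal((n_times, n_members, n_points))

# Instantaneous spread: standard deviation across the 10 members.
inst_spread = members.std(axis=1, ddof=1)    # shape (n_times, n_points)

# Seasonal-mean spread: average of the instantaneous spread over all times.
seasonal_spread = inst_spread.mean(axis=0)   # shape (n_points,)

# Point-to-point variability is pure sampling noise here,
# because the true spread is identical at every point.
noise_inst = inst_spread[0].std()
noise_seasonal = seasonal_spread.std()

print(f"noise in one instantaneous field: {noise_inst:.3f}")
print(f"noise in the seasonal mean field: {noise_seasonal:.3f}")
```

Because the true spread is uniform in this sketch, any spatial structure in the instantaneous field is sampling noise from the 10-member estimate, and averaging over the season reduces it substantially.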

When I look at active systems such as extra-tropical cyclones or tropical cyclones, I expect a larger uncertainty, yet I do not see that clearly in the ensemble spread.

...

The main problem with extra-tropical and tropical cyclones in terms of uncertainty is that, due to the lower resolution of the EDA system, the EDA members systematically overestimate the central pressure of the cyclone (i.e. the pressure is not sufficiently low). This means that the spread among the members remains small and consequently the EDA shows lower uncertainties than in reality. On the other hand, the spatial pattern of the uncertainties corresponds rather well with the actual cyclones. This is demonstrated for some extra-tropical cyclones such as cyclone Desmond (2015120500), cyclone Xaver (2013120500) or the Great Storm of 1987 in the UK. In all of these cases the maximum spread values do not exceed 1 hPa, which is quite small. The pattern of large spread values is scattered throughout the domain, though the primary cyclones are reasonably well marked in the uncertainty field (particularly for the 1987 storm).

For tropical cyclones the spread values can be larger, as is the case for a cyclone near Japan in 1987 or for Typhoon Haiyan near the Philippines. The case of Hurricane Sandy is particularly interesting: the region with the largest uncertainties does not fully coincide with the location of the hurricane's eye, but has peaks to the east and west of it (the values are larger to the east). This indicates the uncertainties related to the position of the hurricane. So overall the EDA spread can give a qualitative idea of the uncertainties relating to active systems such as cyclones, but it is unable to provide the right uncertainty amplitude due to the lower resolution of the EDA system.

...

In most cases, the resolution of ERA5 is not sufficient to properly describe tropical cyclones. Additionally, the EDA system has a lower resolution than ERA5, which further limits its ability to describe such small-scale phenomena. For the above-mentioned tropical cyclone case, the lowest pressure of the cyclone is 969.7 hPa (see figure below), which is higher than the observed pressure, but the cyclone itself is reasonably well described. The area of larger spread corresponds well with the shape of the cyclone and the largest value is 2.7 hPa. This gives an indication of the relative uncertainty of the event, though the spread is presumably smaller than the real analysis error.

Image Added

Relevant user questions (translated from French) from CUS-4133.

...