This Copernicus Knowledge Base article describes the differences between the final ERA5 back extension and the preliminary version. Although it is a reliable dataset for many applications (Bell et al., 2021), the preliminary version suffers from unphysically intense tropical cyclones and has been made available as a separate dataset. This dataset includes catalogue entries for fast access to hourly and monthly fields on single and pressure levels and separate access to the ERA5-complete dataset on tape (for details see the ERA5 family page).
The revised and final release of the ERA5 back extension represents tropical cyclones more in line with the dataset from 1979 onwards. Note that (similarly to the preliminary product) this often means an underestimation of the strength of tropical cyclones in the final release, but (unlike the preliminary version) without frequent excessively strong cases. In addition, an inconsistency in the usage of surface-pressure bias correction has been resolved. Production streams were initialised from well spun-up parts of the preliminary release, which resulted in a more continuous transition between streams. This benefited, in particular, soil moisture and lower-stratospheric humidity. Also, an already known solution to the occurrence of occasional excessive surface winds was implemented, so that, as for dates from February 2020, such winds do not appear in the final ERA5 product before 1979.
The final back extension release was prepended to the ERA5 dataset from 1979 onwards, using the same catalogue entries for fast access on disk and the ERA5-complete mechanism for the full dataset in native resolution on tape.
So far, the final version has been released from 1959 onwards. A further extension from 1940 to 1958 is currently in production.
After the release of the latter, access to the preliminary dataset will be discontinued at some point. Users will be notified well in advance.
Consistent representation of tropical cyclones
In the preliminary back extension, too much weight was given to 6-hourly tropical cyclone best track pressure reports from the International Best Track Archive for Climate Stewardship (IBTrACS; Knapp et al., 2010). In addition, quality control had been switched off for such observations, so they were given full weight regardless of how much the ERA5 short forecasts (i.e. the ERA5 'first guess') deviated from them prior to the assimilation. This is in contrast with the segment from 1979 onwards, where the weight given to outliers is reduced significantly (Huber norm, Tavolato and Isaksen, 2014, 2015). For the final back extension, this check was reinstated. In addition, the observation error of IBTrACS pressure observations was inflated from 0.78 to 2.0 hPa, which is more in line with what has been used in other reanalyses in the past. Furthermore, a lower limit of 910 hPa was introduced as an additional safeguard (i.e., IBTrACS observations with a lower pressure were replaced by this limit).
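The three changes described above (outlier down-weighting, a larger observation error and the 910 hPa floor) can be illustrated with a minimal Python sketch. It is not the ECMWF 4D-Var implementation: the function names, the Huber transition point `c` and the example pressures are assumptions made for illustration. Only the values 0.78 hPa, 2.0 hPa and 910 hPa are taken from the text.

```python
FLOOR_HPA = 910.0        # lower limit quoted in the text
SIGMA_PRELIM_HPA = 0.78  # observation error, preliminary back extension
SIGMA_FINAL_HPA = 2.0    # inflated observation error, final back extension


def apply_floor(obs_hpa, floor_hpa=FLOOR_HPA):
    """Replace reported pressures below the floor by the floor itself."""
    return max(obs_hpa, floor_hpa)


def huber_weight(obs_hpa, first_guess_hpa, sigma_hpa, c=1.0):
    """Weight of an observation under a generic Huber norm, relative to a
    pure least-squares (Gaussian) treatment.

    Departures within c standard errors keep full weight (1.0); larger
    departures are down-weighted as c / |normalised departure|.
    The transition point c is a hypothetical value, not the ERA5 setting.
    """
    d = abs(obs_hpa - first_guess_hpa) / sigma_hpa
    return 1.0 if d <= c else c / d


# Hypothetical example: a 900 hPa report against a 940 hPa first guess.
obs = apply_floor(900.0)  # the report is raised to the 910 hPa floor
w_final = huber_weight(obs, 940.0, SIGMA_FINAL_HPA)
```

In this toy setting the remaining 30 hPa departure equals 15 standard errors, so the observation keeps only a small fraction of its nominal weight instead of the full weight it had when quality control was switched off.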
Figure 1: Density plots of IBTrACS observations versus the minimum reanalysis mean sea level pressure in the vicinity of these observations for (left column) ERA5 and (right column) ERA5 preliminary, over the period (top row) 1959-1978, (bottom right) 1950-1978 and (bottom left) 1979-2010.
The effect on the representation of tropical cyclones can be seen in Figure 1, which shows the IBTrACS mean sea level pressure observations versus the ERA5 analysis minimum pressure at IBTrACS observation time in the vicinity of these observations. The final back extension (top left) shows essentially no cases where the analyzed pressure is obviously too low, unlike the preliminary back extension (right panels). In addition, the results are much more in line with the ERA5 product from 1979-2010 (bottom left panel). Although the latter segment of ERA5 did not assimilate IBTrACS observations (Hersbach et al., 2020), the presence of other in situ and abundant satellite observations should compensate for this and ensure that the representation of tropical cyclones is realistic.
The most extreme case for the final back extension appears on 26 September 1959 for Super Typhoon VERA, with minimum ERA5 pressures of 910 and 915 hPa at 06 and 00 UTC, respectively. This was an exceptionally intense tropical cyclone that struck Japan, becoming the strongest and deadliest typhoon on record to make landfall in the country. These two points are clearly visible in the density plot in the top left panel of Figure 1. These are the only two synoptic times from 1959 onwards where the ERA5 wave height analysis exceeds 20 m.
Figure 2: Maximum wave height for ERA5 (left) and ERA5 preliminary (right), over the period (top) 1959-1978 for 00, 06, 12 and 18 UTC, (bottom left) 1980-1999 and (bottom right) 1950-1978.
In the preliminary back extension, the unphysically extreme tropical cyclones generated unrealistically extreme winds and ocean waves. Again, this is much improved in the final back extension as displayed in the top-left panel of Figure 2 that shows the point-wise maximum wave height for all 0, 6, 12 and 18 UTC analysis fields from 1959-1978. Extreme waves are much lower than in the preliminary back extension (top right for 1959-1978, bottom right for 1950-1978), and are much more in line with ERA5 post 1978 (bottom left panel for the 20-year period from 1980-1999).
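The point-wise maximum shown in Figure 2 can be sketched as follows. This is a minimal plain-Python illustration, not the code used to produce the figure: the function name and the tiny 2x2 example fields are hypothetical, and with real ERA5 data one would normally use an array library instead of nested lists.

```python
def pointwise_maximum(fields):
    """Return the grid-point-wise maximum over a sequence of 2-D fields.

    Each field is a list of rows (lists of floats) on the same grid.
    """
    fields = list(fields)
    if not fields:
        raise ValueError("need at least one field")
    n_rows, n_cols = len(fields[0]), len(fields[0][0])
    return [
        [max(f[i][j] for f in fields) for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical 2x2 wave-height fields (m) at three analysis times.
swh = [
    [[1.0, 2.5], [3.0, 0.5]],
    [[1.5, 2.0], [2.0, 4.0]],
    [[0.5, 6.0], [2.5, 1.0]],
]
max_swh = pointwise_maximum(swh)
```

Each grid point keeps its own largest value, so a single extreme analysis time is enough to leave a mark on the resulting map.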
Considering global averages, sea level and waves have not changed much between the preliminary and the final version. Averaged over all 0/6/12/18-UTC fields from 1959 to 1978, global mean ocean winds are 0.02 m/s slower (from 7.64 to 7.62 m/s), significant wave height is 0.02 m lower (from 2.48 to 2.46 m) and the wave peak period is 0.04 s shorter (from 8.36 to 8.32 s).
Reduced discontinuities between production streams
The final ERA5 back extension from 1959-1978 was conducted in 4 parallel streams of 5 years each. Initial conditions were taken from well spun-up streams that had been used to compose the preliminary back extension. Streams were started from the 15th of December of the year preceding the start of their consolidation period, e.g., stream 1 (see table 1) was started from a preliminary back extension stream, on 15 December 1973. The two-week period prior to 1 January allowed for the readjustment of the synoptic situation according to the new configuration.
Table 1: details of the 4 production streams for the final back extension, the start date of the streams they had been initialized from and the effective spin up.
| Stream | Start of initial stream | Effective spin-up | Period |
|---|---|---|---|
| 1 | Jan 1964 | 10 years | 1974-1978 |
| 2 | Jan 1964 | 5 years | 1969-1973 |
| 3 | Jan 1957 | 7 years | 1964-1968 |
| 4 | Jan 1949 | 10 years | 1959-1963 |
As a result of this strategy, effective spin-ups vary from 5 to 10 years, which is much longer than the one-year spin-up period for the preliminary back extension. This has a beneficial effect on the continuity between streams for slowly varying components, such as lower-stratospheric humidity and soil moisture.
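The effective spin-up values in Table 1 follow from simple date arithmetic, as the short Python sketch below shows. The variable and function names are hypothetical, and the calculation assumes whole years, ignoring the two-week overlap from 15 December.

```python
# (stream, start year of initial stream, first year of consolidation period),
# taken from Table 1.
STREAMS = [
    (1, 1964, 1974),
    (2, 1964, 1969),
    (3, 1957, 1964),
    (4, 1949, 1959),
]


def effective_spin_up_years(init_start_year, period_start_year):
    """Whole years between the January start of the initial stream and
    1 January of the consolidation period."""
    return period_start_year - init_start_year


spin_ups = {s: effective_spin_up_years(a, b) for s, a, b in STREAMS}
```

This reproduces the 10, 5, 7 and 10 years listed for streams 1 to 4.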
Figure 3: Monthly anomalies in upper-air specific humidity for (top) the preliminary back extension from 1950-1978 and ERA5 from 1979 onwards and (bottom) ERA5 throughout. The ERA5 segment prior to 1959 is currently in production and has not yet been released. Anomalies are in percentages normalized per pressure level rather than in absolute values, which facilitates the visualization of the drier stratosphere.
Compared to the preliminary back extension, the final ERA5 product exhibits smoother transitions of the mean state of stratospheric humidity (Figure 3) between streams. The discontinuity in 1979, however, remains, which is the result of an insufficient spin-up period of the stream that was consolidated from 1979.
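The per-level normalisation used in Figure 3 can be illustrated with a minimal Python sketch. It rests on assumptions: the reference mean here is simply the mean of the supplied series (the reference period actually used for the figure is not stated, and a real calculation would typically use a monthly climatology), and the function name and humidity values are hypothetical.

```python
def percent_anomalies(series):
    """Anomalies of one pressure level's time series, in % of its own mean."""
    mean = sum(series) / len(series)
    return [100.0 * (v - mean) / mean for v in series]


# Hypothetical specific humidity values (kg/kg) at two pressure levels (hPa).
q_by_level = {
    850: [8.0e-3, 1.0e-2, 1.2e-2],  # moist lower troposphere
    50: [2.4e-6, 3.0e-6, 3.6e-6],   # dry stratosphere
}
anoms = {level: percent_anomalies(q) for level, q in q_by_level.items()}
```

Although the absolute values differ by more than three orders of magnitude, both levels give anomalies of about -20 %, 0 % and +20 %, which is why percentages make stratospheric variations visible alongside tropospheric ones.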
Figure 4: Monthly-mean, volumetric soil moisture anomalies (%) relative to 1981–2020, for the second layer of the ERA5 land-surface model (7-28 cm), averaged over the continental areas defined in the caption of Figure 21 of Bell et al (2021), from 1950 to 2020, for ERA5 (thin lines) and the preliminary dataset (thick lines).
Figure 5: As Figure 4, but for the third layer (28-100 cm).
Figure 6: As Figure 4, but for the fourth layer (100-289 cm).
The effect on soil moisture is displayed in Figures 4 to 6, which show the monthly-mean, volumetric soil moisture anomalies (%) relative to 1981–2020, for the ERA5 land-surface model layers 2, 3 and 4, respectively. The jumps at the interfaces between production streams are smaller for the ERA5 final back extension (thin lines) than for the preliminary product (thick lines). This is especially the case for layers 2 and 3, but less so for layer 4 over Australia.
Consistent bias correction of surface pressure observations
As described in Bell et al. (2021), the preliminary back extension suffered from a bug in the variational bias correction scheme for surface pressure. As a result, the scheme only worked properly for part of the preliminary back extension. It was not active at all prior to April 28, 1953, while for July 1, 1959 to May 13, 1961, March 1, 1965 to March 18, 1968 and January 1, 1974 to June 19, 1974, only part of the observational data had been bias corrected. This can be seen from the blue curve in the top panel of Figure 7, which shows the standard deviation (STDV) of the bias corrections applied to all actively used surface and mean sea level pressure observations. This was resolved for the final back extension (red curve). In addition, it has been ensured that each stream of the final version was started (details in Table 1 above) from a date of the initializing preliminary back extension stream at which the scheme had been working properly for a sufficient amount of time.
Figure 7: daily- (light colours) and monthly-averaged (dark colours) STDV for all used surface and mean sea level pressure observations over the globe for ERA5 preliminary (blue) and ERA5 (red) back extension for (top) the applied bias correction and the (middle) first-guess and (bottom) analysis, minus bias-corrected observation, respectively.
The figure clearly shows a more continuous bias correction over the entire period and the departure statistics (middle and lower panels) have improved significantly for the periods where the bug was active for the preliminary product. In addition, for these periods the corrected bias scheme leads to slightly higher numbers of active data, which is also a good sign.
Figure 8: statistics for 1960 per 1x1-degree grid box over active surface and mean sea level pressure observations for ERA5 (left) and ERA5 preliminary back extension (right) for the average data count per 12h analysis cycle (top), and mean bias estimate (bottom). Numbers in the bottom line of the headers are based on area-averaged, grid-point-wise values, i.e., they are not weighted with respect to grid-point-wise data counts (top panel).
Figure 8 shows statistics for 1960, a year for which the bug in the bias-correction scheme had been present in the preliminary product. The top panels show the geographical distribution of observations, with the highest densities over the US, Europe and East Asia. The final back extension (left) used about 1.8% more observations than the preliminary product. The likely reason for this is that a more adaptive bias correction gives rise to slightly fewer rejections. The location-wise mean bias is displayed in the lower panels of Figure 8. The final product exhibits stronger magnitudes of the bias. In particular, the mean positive bias over the central Eurasian continent is noteworthy, as is the relatively constant positive bias for ships over the Atlantic and Indian Oceans. In principle, one would expect a bias map to show more white noise, since each station has its own (random and uncorrelated) systematic bias. However, this is not what is seen (nor is it seen in bias estimate maps for more recent times; not displayed here). The systematic patterns can indicate systematic biases in the observing system, but also in the ERA5 model. In the latter case, such model bias is inadvertently aliased into observation bias.
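The caption of Figure 8 notes that the header numbers are area averages of grid-point-wise values, not weighted by data counts. The difference between the two kinds of average can be sketched as follows. This is an illustration under stated assumptions: cos(latitude) is assumed as the area weight, and the function name and the two example grid boxes are hypothetical.

```python
import math


def weighted_mean(values, weights):
    """Weighted average of values; weights need not be normalised."""
    total = sum(weights)
    return sum(v * w for v, w in zip(values, weights)) / total


# Hypothetical grid boxes:
# (latitude in degrees, mean bias estimate in hPa, data count per cycle).
boxes = [
    (0.0, 0.2, 1.0),    # sparsely observed tropical box
    (60.0, -0.4, 9.0),  # densely observed mid-latitude box
]
bias = [b for _, b, _ in boxes]
area = [math.cos(math.radians(lat)) for lat, _, _ in boxes]
count = [n for _, _, n in boxes]

area_mean = weighted_mean(bias, area)    # header-style, ignores data counts
count_mean = weighted_mean(bias, count)  # dominated by data-dense boxes
```

In this toy case the area average is close to zero while the count-weighted mean is about -0.34 hPa, showing how strongly the two statistics can differ when observations are clustered.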
Changes in 2 metre temperature and precipitation for specific regions
In Simmons et al., 2021 a detailed study was made of the low frequency variability and trends in surface air temperature and humidity from the preliminary back extension of ERA5 and from other reanalyses and more direct observational datasets. Although good comparison was found over a number of regions (and averaged over the globe), some systematic unexplained differences were observed as well. Most concerning were too-high temperatures over Australia prior to 1970 and spells of high temperature over parts of Africa and South America that were associated with abnormally low rainfall. The latter was evident in time series comparing continental averages of rainfall from ERA5 with corresponding values from the Global Precipitation Climatology Centre, as reported by Bell et al., 2021.
Figure 9 shows the 12-month running mean 2m temperature anomaly for several products over South America (top), Africa (middle) and Australia (bottom panel). Over Africa, the discrepancy with the more direct observational product GISTEMP that was seen in the preliminary back extension is reduced significantly in 1965 and 1966 in the final version. Some improvement is also seen over South America, but there is only marginal improvement over Australia. The same is found for precipitation (Figure 10). These improvements may be related to a more balanced soil-moisture product as displayed in Figures 4-6.
The ERA5 global mean 2m temperature (not displayed) has changed little in the final version compared to the preliminary one. The comparison with GISTEMP has improved slightly for 1959-1961 and 1965-1966, but is marginally worse for 1974-1976.
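The twelve-month running means used in Figures 9 and 10 can be sketched in plain Python as below. The function name and the example series are hypothetical, and how the figures align each window with a date (centred or trailing) is not stated in the text, so the sketch simply returns one mean per complete window.

```python
def running_mean(values, window=12):
    """Means over each complete run of `window` consecutive values."""
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    total = sum(values[:window])
    out = [total / window]
    for i in range(window, len(values)):
        # Slide the window: add the newest value, drop the oldest.
        total += values[i] - values[i - window]
        out.append(total / window)
    return out


# Hypothetical monthly series covering two years.
monthly = [float(m) for m in range(24)]
smoothed = running_mean(monthly)
```

A 24-month input yields 13 smoothed values, the first covering months 0 to 11 and the last covering months 12 to 23.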
Figure 9: Twelve-month running-mean surface temperature anomaly (K) relative to 1981–2010, from 1950 to 2020 for ERA5 (red), ERA5 preliminary (orange) and GISTEMP (blue, Hansen et al., 2010, Lenssen et al., 2019) for (top) South America, (middle) Africa and (bottom) Australia. For the latter, a comparison is also made with the ACORN-SAT (dashed blue, Trewin, 2013) dataset.
Figure 10: Twelve-month running-mean precipitation (mm/day) for ERA5 (red), ERA5 preliminary (orange) and GPCC (blue, Becker et al., 2013) for (top) South America, (middle) Africa and (bottom) Australia.
Becker, A., Finger, P., Meyer-Christoffer, A., Rudolf, B., Schamm, K., Schneider, U. and Ziese, M. (2013) A description of the global land-surface precipitation data products of the Global Precipitation Climatology Centre with sample applications including centennial (trend) analysis from 1901–present. Earth System Science Data, 5, 71–99. https://essd.copernicus.org/articles/5/71/2013/essd-5-71-2013.html
Bell, B., Hersbach, H., Simmons, A., Berrisford, P., Dahlgren, P., Horányi, A., Muñoz‐Sabater, J., Nicolas, J., Radu, R., Schepers, D. and Soci, C., 2021. The ERA5 global reanalysis: Preliminary extension to 1950. Quarterly Journal of the Royal Meteorological Society, 147(741), pp.4186-4227, https://doi.org/10.1002/qj.4174.
Hansen J., R. Ruedy, M. Sato and K. Lo, 2010: Global surface temperature change. Rev. Geophys., 48, RG4004, doi:10.1029/2010RG000345.
Hersbach, Hans, Bill Bell, Paul Berrisford, Shoji Hirahara, András Horányi, Joaquín Muñoz‐Sabater, Julien Nicolas et al., "The ERA5 global reanalysis." Quarterly Journal of the Royal Meteorological Society 146, no. 730 (2020): 1999-2049, https://doi.org/10.1002/qj.3803.
Lenssen, N., G. Schmidt, J. Hansen, M. Menne, A. Persin, R. Ruedy, and D. Zyss, 2019: Improvements in the GISTEMP uncertainty model. J. Geophys. Res. Atmos., 124, 6307-6326, doi:10.1029/2018JD029522.
Knapp, K. R., M. C. Kruk, D. H. Levinson, H. J. Diamond, and C. J. Neumann, 2010: The International Best Track Archive for Climate Stewardship (IBTrACS). Bull. Am. Meteorol. Soc., 91, 363–376, https://doi.org/10.1175/2009BAMS2755.1.
Simmons et al., "Low frequency variability and trends in surface air temperature and humidity from ERA5 and other datasets". ECMWF Technical Memorandum, Number 881, 2021.
Tavolato C, Isaksen L: "On the use of a Huber norm for observation quality control in the ECMWF 4D-Var". ECMWF Technical Memorandum, Number 744, 2014.
Tavolato C, Isaksen L: "On the use of a Huber norm for observation quality control in the ECMWF 4D‐Var". Quarterly Journal of the Royal Meteorological Society. 2015 Jul;141(690):1514-27, https://doi.org/10.1002/qj.2440.
Trewin, B., 2013: A daily homogenized temperature data set for Australia. International Journal of Climatology, 33, 1510–1529, doi:10.1002/joc.3530. See also the technical report on version 2, available at http://www.bom.gov.au/climate/data/acorn-sat/.