Contributors: Marco Cucchi (B-Open), Alessandro Amici (B-Open), Graham Weedon (Met Office), Nicolas Bellouin (University of Reading), Stefan Lange (PIK), Hannes Müller Schmied (SBIK-F), Hans Hersbach (ECMWF), Carlo Buontempo (ECMWF), Chiara Cagnazzo (ECMWF)
Issued by: B-Open / Marco Cucchi
Issued Date: 15/04/2021
Ref: C3S_322_Lot1.4.1.3_CERRA_data_user_guide – version 1
Official reference number service contract: 2017/C3S_322_Lot1_SMHI/SC2
Executive summary
The present dataset, also known as WATCH Forcing Data methodology applied to ERA5 (WFDE5), is a meteorological forcing dataset for land surface and hydrological models. It consists of eleven variables (see Table 1) with an hourly temporal resolution on a regular longitude-latitude half-degree grid, with global spatial coverage and values defined only for land and lake points. The dataset was derived by applying the sequential elevation and monthly bias correction methods described in [1] and [2], and briefly outlined in Section 2, to half-degree aggregated ERA5 reanalysis products [3].
The monthly observational datasets used for bias correction are CRU TS [4] for all variables and the GPCC full data monthly product [5] [6] for precipitation variables only, with specific versions detailed in Table 2. As a result, two different datasets for each precipitation variable (Rainf and Snowf) are obtained: one corrected using only the CRU TS dataset and one corrected using both the CRU TS and GPCC datasets. Other input data used for the generation of the WFDE5 dataset are: a) catch corrections for rain- and snow-precipitation gauges and b) shortwave downward radiation corrections due to atmospheric aerosol loading effects. See Table 3 for a summary of all the input datasets used.
The WFDE5 dataset is distributed through the C3S Climate Data Store as monthly files in netCDF format. It uses a full half-degree grid (720 × 360 grid boxes) with sea and large-lake grid-points flagged as missing data, comprising a total of 92889 land points (Antarctica included). General dataset attributes are described in Table 4.
A detailed description of the dataset can be found in [7].
Table 1: WFDE5 variables summary
Variable name | Description | Units | Time coverage (v1.0, v1.1) | Time coverage (v2.0) |
Wind | Near-surface wind speed | m s-1 | 1979-01-01 00:00:00 to 2018-12-31 23:00:00 | 1979-01-01 00:00:00 to 2019-12-31 23:00:00 |
Tair | Near-surface air temperature | K | 1979-01-01 00:00:00 to 2018-12-31 23:00:00 | 1979-01-01 00:00:00 to 2019-12-31 23:00:00 |
PSurf | Surface air pressure | Pa | 1979-01-01 00:00:00 to 2018-12-31 23:00:00 | 1979-01-01 00:00:00 to 2019-12-31 23:00:00 |
Qair | Near-surface specific humidity | kg kg-1 | 1979-01-01 00:00:00 to 2018-12-31 23:00:00 | 1979-01-01 00:00:00 to 2019-12-31 23:00:00 |
LWdown | Surface downwelling longwave radiation | W m-2 | 1979-01-01 07:00:00 to 2018-12-31 23:00:00 | 1979-01-01 07:00:00 to 2019-12-31 23:00:00 |
SWdown | Surface downwelling shortwave radiation | W m-2 | 1979-01-01 07:00:00 to 2018-12-31 23:00:00 | 1979-01-01 07:00:00 to 2019-12-31 23:00:00 |
Rainf (CRU) | Rainfall flux (corrected using CRU TS dataset) | kg m-2 s-1 | 1979-01-01 07:00:00 to 2018-12-31 23:00:00 | 1979-01-01 07:00:00 to 2019-12-31 23:00:00 |
Snowf (CRU) | Snowfall flux (corrected using CRU TS dataset) | kg m-2 s-1 | 1979-01-01 07:00:00 to 2018-12-31 23:00:00 | 1979-01-01 07:00:00 to 2019-12-31 23:00:00 |
Rainf (CRU+GPCC) | Rainfall flux (corrected using CRU TS and GPCC datasets) | kg m-2 s-1 | 1979-01-01 07:00:00 to 2016-12-31 23:00:00 | 1979-01-01 07:00:00 to 2019-12-31 23:00:00 |
Snowf (CRU+GPCC) | Snowfall flux (corrected using CRU TS and GPCC datasets) | kg m-2 s-1 | 1979-01-01 07:00:00 to 2016-12-31 23:00:00 | 1979-01-01 07:00:00 to 2019-12-31 23:00:00 |
ASurf | Grid-point altitude | m | - | - |
Table 2: WFDE5 versions history
Version | Observational datasets used | Changes from previous version |
v1.0 | · CRU TS 4.03 (all variables) · GPCCv2018 (Rainf and Snowf) | - |
v1.1 | · CRU TS 4.03 (all variables) · GPCCv2018 (Rainf and Snowf) | · Fix SWdown values, affected in v1.0 by a bug compromising correct computation |
v2.0 | · CRU TS 4.04 (all variables) · GPCCv2020 (Rainf and Snowf) | · Use latest versions of the observational datasets · Extend time coverage to 2019 · Update diurnal temperature range correction algorithm |
v2.1 | · CRU TS 4.04 (all variables) · GPCCv2020 (Rainf and Snowf) | · Fix issue affecting Qair and LWdown values (see "Known issues" section) |
Table 3: Input datasets
Dataset | Summary | Variables used |
ERA5 | ECMWF reanalysis product | · 10 m u-component of wind · 10 m v-component of wind · 2 m temperature · Surface pressure · 2 m dewpoint temperature · Surface thermal radiation downwards · Surface solar radiation downwards · Cloud cover · Total precipitation · Snowfall · Land-sea mask · Lake cover mask · Orography |
CRU TS 4.03, CRU TS 4.04 | Climate Research Unit gridded station observations (multiple variables) | · Temperature · Diurnal temperature range · Cloud cover · Wet days · Precipitation · Grid-points altitude |
GPCCv2018, GPCCv2020 | Global Precipitation Climatology Centre gridded station precipitation observations | · Precipitation |
Other | - | · Rainfall and snowfall gauge corrections · Aerosol loading corrections to shortwave downward radiation |
Algorithms Description
The algorithms described here correspond to those of [1] and [2] developed for the WATCH forcing data. The WFDE5 dataset is computed using a series of CDS Toolbox workflows which take the input variables shown in Table 3 and automatically apply the following key processing steps:
- ERA5 reanalysis aggregation from the quarter- to the CRU half-degree grid;
- Sequential “elevation correction” of Tair, PSurf, Qair, and LWdown to account for differences in surface heights between quarter- and half-degree grids and to ensure consistency between the different corrected variables. Tair is bias-adjusted to observed monthly averaged near-surface air temperature and observed monthly averaged diurnal temperature range before the processing of PSurf;
- Adjustment of SWdown, Rainf and Snowf at the monthly scale via CRU TS and, for precipitation variables only, GPCC observational datasets.
Table 4: WFDE5 dataset general attributes
Dataset attribute | Details |
Horizontal coverage | Global |
Horizontal resolution | 0.5° x 0.5° |
Vertical coverage | Surface |
Temporal coverage | Depends on variable (see Table 1) |
Temporal resolution | Hourly |
File format | netCDF |
Data type | Grid |
Versions | v1.0, v1.1, v2.0, v2.1 |
Aggregation to the half-degree grid is applied on sea-level adjusted values of the different ERA5 variables, which are obtained as part of the elevation-correction procedure.
Elevation-correction procedures are physically based sequential adjustments applied to account for differences in surface heights between the input and output grids and to ensure consistency among the different variables. The variables to which elevation correction is applied are i) Tair, ii) PSurf, iii) Qair and iv) LWdown, which have to be processed in this order, because each corrected variable is needed to process the following one.
Further adjustments based on the monthly observational datasets listed in Table 2 and Table 3 are applied to variables Tair, SWdown, Rainf and Snowf.
Two corrections are applied to Tair values, both driven by differences between pre-processed ERA5 values and CRU monthly observations: first, a monthly scale correction based on the differences between temperature values is applied; then, an hourly scale correction taking into account differences in diurnal temperature ranges (DTR) leads to the final values. Note that, before applying these processing steps, CRU DTR values are corrected for known anomalies over a few specific grid boxes. Also, to avoid the occurrence of anomalously high Tair values noticed in WFDE5v1.0 and v1.1, the hourly scale correction algorithm based on DTR has been updated in WFDE5v2.0 with respect to the one detailed in [1] and [2]; the updated algorithm is briefly described in Appendix A.
A further modification to the original algorithm detailed in [1] and [2] has been applied to generate WFDE5v2.1, in order to fix the issue affecting Qair (and, with a minor impact, LWdown) values described in the "Known issues" section. The implemented change is described in Appendix B.
SWdown values are first adjusted to be consistent with CRU observed cloud-cover fractions: this is done using a local linear relationship between anomalies in monthly short-wave radiation and cloud cover in aggregated ERA5 data, along with CRU cloud cover anomalies, to reconstruct the associated short-wave radiation anomalies. Then, corrections due to changes in direct and indirect effects of atmospheric aerosol loading on surface short-wave radiation fluxes are applied.
Finally, the following corrections are applied to Rainf and Snowf values: a) adjustment of precipitation flux values to ensure matching of the monthly number of wet days between interpolated ERA5 and CRU data; b) correction to monthly precipitation totals to match CRU/GPCC values; c) correction to rainfall and snowfall fluxes to account for catch corrections of precipitation gauges [8]; d) precipitation phase (snowfall/rainfall) correction when adjusted Tair lies beyond thresholds derived from ERA5 temperature extremes at fixed precipitation phase within each grid box and calendar month.
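Step b) can be illustrated with a minimal sketch. It assumes that the monthly-total correction is a single multiplicative ratio applied to all hourly values of one grid box and month, and that a completely dry model month is left unchanged; both choices, like the function names, are illustrative assumptions rather than the actual WFDE5 workflow code. The sketch also shows the conversion from hourly fluxes in kg m-2 s-1 to a monthly total in kg m-2 (numerically equal to mm of water).

```python
SECONDS_PER_HOUR = 3600.0


def monthly_total_mm(hourly_flux):
    """Accumulate hourly fluxes (kg m-2 s-1) into a monthly total (kg m-2, i.e. mm)."""
    return sum(flux * SECONDS_PER_HOUR for flux in hourly_flux)


def scale_to_monthly_total(hourly_flux, observed_total_mm):
    """Rescale hourly fluxes so that their monthly total matches an observed total.

    Assumption: a purely multiplicative correction. If the model month is
    completely dry there is nothing to rescale, so values are returned as-is.
    """
    model_total = monthly_total_mm(hourly_flux)
    if model_total == 0.0:
        return list(hourly_flux)
    ratio = observed_total_mm / model_total
    return [flux * ratio for flux in hourly_flux]


# Hypothetical three-hour "month" for illustration only.
example_flux = [0.0, 1.0e-4, 2.0e-4]
corrected_flux = scale_to_monthly_total(example_flux, observed_total_mm=2.16)
```

A multiplicative factor keeps dry hours dry and preserves the relative timing of precipitation events within the month.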
For all meteorological variables, values over Antarctica are obtained simply via aggregation and elevation-correction, given the absence of observations over that area.
ASurf values, corresponding to the altitude of the WFDE5 grid-points, are obtained by applying the ERA5 land-sea and lake cover masks to CRU grid-point heights: in this way, only those points identified as land or lake by both the ERA5 and CRU datasets are retained, resulting in a total of 92889 grid-points.
The CDS Toolbox workflows used for the generation of the WFDE5 dataset are available for download in the "Documentation" tab of the dataset's CDS catalogue entry. It is worth mentioning that these workflows rely on intermediate products, already present on a CDS virtual machine, which have been computed once through different CDS Toolbox workflows: these, for the sake of simplicity, have not been included in the list of downloadable scripts, but the computations they perform are described in [7], [1] and [2].
Data Description
File naming convention
File names adhere to the following convention:
<var>_WFDE5_<reference_dataset>_<YYYYMM>_v<version>.nc,
where
- <var>: variable name, as in Table 1
- <reference_dataset>: either CRU (all variables) or CRU+GPCC (Rainf and Snowf only)
- <YYYYMM>: year and month
- <version>: WFDE5 dataset version
As an example, possible file names are:
- Tair_WFDE5_CRU_201801_v1.0.nc
- Rainf_WFDE5_CRU_201801_v1.0.nc
- Rainf_WFDE5_CRU+GPCC_201801_v1.0.nc
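The convention above can be captured in a small helper for composing and parsing file names. This is a minimal sketch: the names `build_name`, `parse_name` and `WFDE5Name` are hypothetical, and the regular expression assumes that versions always have the form `<major>.<minor>`, as in the versions listed in Table 2.

```python
import re
from typing import NamedTuple

# Only the precipitation variables are available with the CRU+GPCC reference.
GPCC_VARIABLES = {"Rainf", "Snowf"}

_NAME_RE = re.compile(
    r"^(?P<var>[A-Za-z]+)_WFDE5_(?P<ref>CRU(?:\+GPCC)?)"
    r"_(?P<year>\d{4})(?P<month>\d{2})_v(?P<version>\d+\.\d+)\.nc$"
)


class WFDE5Name(NamedTuple):
    var: str
    reference_dataset: str
    year: int
    month: int
    version: str


def build_name(var, reference_dataset, year, month, version):
    """Compose a file name following the convention above."""
    if reference_dataset not in ("CRU", "CRU+GPCC"):
        raise ValueError(f"unknown reference dataset: {reference_dataset!r}")
    if reference_dataset == "CRU+GPCC" and var not in GPCC_VARIABLES:
        raise ValueError(f"{var} is not available with CRU+GPCC")
    if not 1 <= month <= 12:
        raise ValueError(f"invalid month: {month}")
    return f"{var}_WFDE5_{reference_dataset}_{year:04d}{month:02d}_v{version}.nc"


def parse_name(filename):
    """Split a file name into its components, or raise ValueError."""
    match = _NAME_RE.match(filename)
    if match is None:
        raise ValueError(f"not a WFDE5 file name: {filename!r}")
    return WFDE5Name(
        var=match["var"],
        reference_dataset=match["ref"],
        year=int(match["year"]),
        month=int(match["month"]),
        version=match["version"],
    )
```

For example, `build_name("Rainf", "CRU+GPCC", 2018, 1, "1.0")` yields the third file name in the list above, while requesting `Tair` with `CRU+GPCC` raises a `ValueError`.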
File content
The actual file content and dimension sizes vary from month to month and from variable to variable, but the general file structure is constant and analogous to the following example:
netcdf Tair_WFDE5_CRU_201801_v1.0 {
dimensions:
    lon = 720 ;
    lat = 360 ;
    time = 744 ;
variables:
    double lon(lon) ;
        lon:_FillValue = NaN ;
        lon:standard_name = "longitude" ;
        lon:units = "degrees_east" ;
        lon:axis = "X" ;
        lon:long_name = "Longitude" ;
    double lat(lat) ;
        lat:_FillValue = NaN ;
        lat:standard_name = "latitude" ;
        lat:units = "degrees_north" ;
        lat:axis = "Y" ;
        lat:long_name = "Latitude" ;
    double time(time) ;
        time:_FillValue = NaN ;
        time:standard_name = "time" ;
        time:long_name = "Time" ;
        time:axis = "T" ;
        time:units = "hours since 1900-01-01" ;
        time:calendar = "gregorian" ;
    float Tair(time, lat, lon) ;
        Tair:_FillValue = 1.e+20f ;
        Tair:long_name = "Near-Surface Air Temperature" ;
        Tair:standard_name = "air_temperature" ;
        Tair:units = "K" ;

// global attributes:
        :title = "WATCH Forcing Data methodology applied to ERA5 data" ;
        :institution = "Copernicus Climate Change Service" ;
        :contact = "http://copernicus-support.ecmwf.int" ;
        :summary = "ERA5 data regridded to half degree regular lat-lon; Genuine land points from CRU grid and ERA5 land-sea mask only; Tair elevation & bias-corrected using CRU TS4.03 mean monthly temperature and mean diurnal temperature range" ;
        :Conventions = "CF-1.7" ;
        :comment = "Methodology implementation for ERA5 and dataset production by B-Open Solutions for the Copernicus Climate Change Service in the context of contract C3S_25c" ;
        :reference = "Weedon et al. 2014 Water Resources Res. 50, 7505-7514, doi:10.1002/2014WR015638; Harris et al. 2014 Int. J. Climatol. 34, 623-642, doi:10.1002/joc.3711; Cucchi et al. 2020 Earth Syst. Sci. Data Discuss., doi:10.5194/essd-2020-28, 2020" ;
        :licence = "The dataset is distributed under the Licence to Use Copernicus Products. The corrections applied are based upon CRU TS4.03, distributed under the Open Database License (ODbL)" ;
}
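The time coordinate in the example is stored as hours since 1900-01-01, and the length of the time dimension (744 for January) is simply the number of hours in the month. The following minimal sketch shows how such values could be converted to and from timestamps using only the Python standard library; the function names are illustrative, and the sketch assumes that timestamps carry no time zone information.

```python
import calendar
from datetime import datetime, timedelta

# Reference epoch taken from time:units = "hours since 1900-01-01".
EPOCH = datetime(1900, 1, 1)


def decode_hours(hours):
    """Convert a value of the time coordinate into a datetime."""
    return EPOCH + timedelta(hours=hours)


def encode_hours(timestamp):
    """Convert a datetime into hours since the reference epoch."""
    return (timestamp - EPOCH) / timedelta(hours=1)


def hours_in_month(year, month):
    """Expected length of the time dimension for a full hourly monthly file."""
    return calendar.monthrange(year, month)[1] * 24


first_step = encode_hours(datetime(2018, 1, 1, 0))
n_steps = hours_in_month(2018, 1)
last_step = decode_hours(first_step + n_steps - 1)
```

Here `n_steps` equals 744, matching the `time` dimension of the example header, and `last_step` is 2018-01-31 23:00. Files whose variable starts later than 00:00 on the first day (see Table 1) would contain fewer steps in their first month.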
Known issues
Licence and Citation
The described dataset is distributed under the Licence to Use Copernicus Products. The corrections applied are based upon CRU TS 4.03, distributed under the Open Database License (ODbL), CRU TS 4.04, distributed under the Open Government Licence (OGL), and GPCCv2018/v2020, distributed under the Creative Commons Attribution 4.0 International Licence (CC BY 4.0).
When publishing work that uses this dataset, please cite [7], where a more detailed description of the dataset can be found.
Appendix A – WFDE5v2.0: DTR-based correction algorithm update
As mentioned in section 2, the Tair correction algorithm based on CRU DTR has been updated in WFDE5v2.0 with respect to the one used for WFDE5v1.0 and WFDE5v1.1. While details on the latter can be found in [1] and [2], a brief description of the new algorithm is reported below:
- For each month, interpolated ERA5 hourly near-surface air temperatures are used to compute the ERA5 interpolated monthly average diurnal temperature range (DTRE5,mm);
- DTR correction factors A and R are computed in the following way:
\[ A = DTR_{CRU} - DTR_{E5,mm} \]
\[ R = \frac{DTR_{CRU}}{DTR_{E5,mm}} \]
- ERA5 interpolated diurnal temperature range values are corrected (DTRE5,corr), depending on the value of R at each grid-point, in the following way:
\[ DTR_{E5,corr} = \begin{cases} DTR_{E5} * R & R \le 1 \\ DTR_{E5} + A & R > 1 \\ \end{cases} \]
- A multiplicative correction factor F is computed as:
\[ F = \frac{DTR_{E5,corr}}{DTR_{E5}} \]
- After applying the correction for monthly mean biases, deviations of corrected hourly ERA5 near-surface air temperature values (TairE5,corr) from their respective daily means (TairE5,corr,dm) are multiplied by F, and the results are added back to those daily means to obtain the final corrected hourly WFDE5 near-surface air temperature values (TairW5):
\[ Tair_{W5} = Tair_{E5,corr,dm} + (Tair_{E5,corr} - Tair_{E5,corr,dm}) \ast F \]
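The update can be sketched in a few lines of Python. The sketch assumes that DTRE5 is the ERA5 diurnal temperature range of a single day at one grid-point, so that F is computed per day; the text above does not state this explicitly. It applies the multiplicative branch when R <= 1 and the additive branch otherwise. Function and argument names are illustrative.

```python
from statistics import mean


def dtr_correction_factor(dtr_e5, dtr_e5_mm, dtr_cru):
    """Return the multiplicative factor F for one day at one grid-point.

    dtr_e5    : ERA5 diurnal temperature range of the day (assumed daily value)
    dtr_e5_mm : ERA5 monthly average diurnal temperature range
    dtr_cru   : CRU monthly average diurnal temperature range
    """
    a = dtr_cru - dtr_e5_mm
    r = dtr_cru / dtr_e5_mm
    # Shrink multiplicatively, widen additively.
    dtr_e5_corr = dtr_e5 * r if r <= 1 else dtr_e5 + a
    return dtr_e5_corr / dtr_e5


def correct_hourly_tair(tair_e5_corr, f):
    """Rescale hourly deviations from the daily mean by F.

    tair_e5_corr : hourly temperatures of one day, already corrected for
                   monthly mean biases
    """
    daily_mean = mean(tair_e5_corr)
    return [daily_mean + (t - daily_mean) * f for t in tair_e5_corr]
```

Because only deviations from the daily mean are rescaled, the daily mean itself, and hence the monthly mean correction applied beforehand, is preserved. With the additive branch, a day with a small ERA5 range is widened by the same absolute amount as a day with a large range, so large ranges are not amplified further.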
Appendix B – WFDE5v2.1: Qair derivation algorithm update
As mentioned in section 2, the Qair derivation algorithm has been updated in WFDE5v2.1 with respect to the one used for previous versions of the dataset. While details on the latter can be found in [1] and [2], a brief description of the new algorithm is reported below.
1) Computing ERA5 implied relative humidity
The first step needed for the derivation of bias corrected specific humidity at surface (Qair) is the derivation of ERA5 relative humidity at surface (RHE5), which is not available from the Climate Data Store (CDS) catalogue. To do so, the following steps are carried out.
Step 1
Compute ERA5 saturated vapour pressure (SVPE5) and vapour pressure (VPE5) using ERA5 surface pressure (PSurfE5) and, respectively, ERA5 surface air temperature (TairE5) and dewpoint temperature (TdewE5), with the following formulas [9]:
with enhancement factors Fw and Fi computed as:
where:
- TairE5: ERA5 surface air temperature (°C)
- TdewE5: ERA5 surface dewpoint temperature (°C)
- PSurfE5: ERA5 surface pressure (hPa)
- EAw = 6.1121
- EBw = 18.729
- ECw = 257.87
- EDw = 227.3
- EAi = 6.1115
- EBi = 23.036
- ECi = 279.82
- EDi = 333.7
- FAw = 1.00072
- FBw = 3.2 × 10^-6
- FCw = 5.9 × 10^-10
- FAi = 1.00022
- FBi = 3.83 × 10^-6
- FCi = 6.4 × 10^-10
- For formulas involving TairE5 only, constants with subscript “w” or “i” are used when TairE5 >= 0 and TairE5 < 0, respectively
NOTE: These correspond to formula 4a (with sets of coefficients ew4 and ei3 for TairE5 >= 0 and TairE5 < 0, respectively) with enhancement factors fw4 and fi4 from Buck (1981) [9].
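As an illustration of how the listed constants are used, the following sketch evaluates the saturated vapour pressure from air temperature and surface pressure. The functional forms are an assumption: they are the forms commonly quoted from Buck (1981) for the ew4/ei3 coefficient sets and the fw4/fi4 enhancement factors, and the mapping of the EA/EB/EC/ED and FA/FB/FC names onto those forms is inferred from the coefficient values, not taken from this guide. Function names are illustrative.

```python
import math

# Coefficient sets from the list above: "w" over water, "i" over ice.
WATER = {"EA": 6.1121, "EB": 18.729, "EC": 257.87, "ED": 227.3,
         "FA": 1.00072, "FB": 3.2e-6, "FC": 5.9e-10}
ICE = {"EA": 6.1115, "EB": 23.036, "EC": 279.82, "ED": 333.7,
       "FA": 1.00022, "FB": 3.83e-6, "FC": 6.4e-10}


def enhancement_factor(tair_c, psurf_hpa, coeffs):
    """Assumed form: F = FA + P * (FB + FC * T^2), T in deg C, P in hPa."""
    return coeffs["FA"] + psurf_hpa * (coeffs["FB"] + coeffs["FC"] * tair_c ** 2)


def saturated_vapour_pressure(tair_c, psurf_hpa):
    """Saturated vapour pressure in hPa.

    Assumed form: SVP = F * EA * exp((EB - T / ED) * T / (T + EC)),
    with the water set for T >= 0 deg C and the ice set for T < 0 deg C.
    """
    coeffs = WATER if tair_c >= 0 else ICE
    exponent = (coeffs["EB"] - tair_c / coeffs["ED"]) * tair_c / (tair_c + coeffs["EC"])
    pure_phase = coeffs["EA"] * math.exp(exponent)
    return enhancement_factor(tair_c, psurf_hpa, coeffs) * pure_phase
```

Since Tair and PSurf are distributed in K and Pa (Table 1), inputs would need converting first, for instance `saturated_vapour_pressure(tair_k - 273.15, psurf_pa / 100.0)`.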
Step 2
Compute ERA5 saturated water content (SWCE5) and water content (WCE5) using ERA5 surface pressure (PSurfE5) and, respectively, SVPE5 and VPE5 with the following formulas:
Step 3
Compute ERA5 implied relative humidity (RHE5) as:
2) Computing WFDE5 saturated vapour pressure
Here, WFDE5 saturated vapour pressure (SVPW5) is derived from WFDE5 surface pressure (PSurfW5) and WFDE5 surface air temperature (TairW5) using the same formulas applied in Step 1 above, i.e.:
with:
where:
- TairW5: WFDE5 surface air temperature (°C)
- PSurfW5: WFDE5 surface pressure (hPa)
- Coefficient values are the same reported in Step 1 above
3) Computing WFDE5 saturated water content
Here, WFDE5 saturated water content (SWCW5) is derived from WFDE5 surface pressure (PSurfW5) and SVPW5,Tair using the following formula:
4) Computing WFDE5 specific humidity at surface
Finally, WFDE5 specific humidity at surface (Qair) is derived from SWCW5 and RHE5 using the formula:
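The overall logic of this appendix, carrying the ERA5 implied relative humidity over to the WFDE5 saturated water content, can be sketched as follows. Everything specific in this sketch is an assumption made for illustration: water content is computed with the standard specific humidity relation q = eps * e / (p - (1 - eps) * e) with eps = 0.622, relative humidity is expressed as a fraction rather than a percentage, and all function names are hypothetical. The formulas actually used for WFDE5 may differ in detail.

```python
EPSILON = 0.622  # assumed ratio of the molar masses of water vapour and dry air


def water_content(vapour_pressure_hpa, psurf_hpa):
    """Specific humidity (kg kg-1) for a given vapour pressure (assumed relation)."""
    return EPSILON * vapour_pressure_hpa / (
        psurf_hpa - (1.0 - EPSILON) * vapour_pressure_hpa
    )


def implied_relative_humidity(wc_e5, swc_e5):
    """ERA5 implied relative humidity as a fraction (assumption: not in percent)."""
    return wc_e5 / swc_e5


def qair_wfde5(rh_e5, swc_w5):
    """WFDE5 specific humidity: ERA5 relative humidity applied to WFDE5 saturation."""
    return rh_e5 * swc_w5


# Hypothetical values: vapour pressures in hPa, surface pressures in hPa.
rh = implied_relative_humidity(
    wc_e5=water_content(10.0, 1000.0),
    swc_e5=water_content(20.0, 1000.0),
)
qair = qair_wfde5(rh, swc_w5=water_content(18.0, 990.0))
```

In this scheme the relative humidity of ERA5 is preserved, while the absolute moisture content follows the elevation- and bias-corrected WFDE5 temperature and pressure through the saturated water content.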
References
[1] G. P. Weedon, S. Gomes, P. Viterbo, H. Österle, J. C. Adam, N. Bellouin, O. Boucher and M. Best, "The WATCH Forcing Data 1958-2001: A meteorological forcing dataset for land surface- and hydrological-models", WATCH Technical Report 22, https://publications.pik-potsdam.de/rest/items/item_16400_1/component/file_16401/content, 2010.
[2] G. P. Weedon, S. Gomes, P. Viterbo, W. J. Shuttleworth, E. Blyth, H. Österle, J. C. Adam, N. Bellouin, O. Boucher and M. Best, "Creation of the WATCH Forcing Data and Its Use to Assess Global and Regional Reference Crop Evaporation over Land during the Twentieth Century", Journal of Hydrometeorology, vol. 12, pp. 823-848, doi: 10.1175/2011JHM1369.1, 2011.
[3] Copernicus Climate Change Service (C3S), "ERA5: Fifth generation of ECMWF atmospheric reanalysis of the global climate", C3S Climate Data Store (CDS), 2017, https://cds.climate.copernicus.eu/.
[4] I. Harris, T. J. Osborn, P. Jones and D. Lister, "Version 4 of the CRU TS monthly high-resolution gridded multivariate climate dataset", Sci. Data, doi: 10.1038/s41597-020-0453-3, 2020.
[5] U. Schneider, A. Becker, P. Finger, A. Meyer-Christoffer and M. Ziese, GPCC Full Data Monthly Product Version 2018 at 0.5°: Monthly Land-Surface Precipitation from Rain-Gauges built on GTS-based and Historical Data, doi: 10.5676/DWD_GPCC/FD_M_V2018_050, 2018.
[6] U. Schneider, A. Becker, P. Finger, E. Rustemeier and M. Ziese, GPCC Full Data Monthly Product Version 2020 at 0.5°: Monthly Land-Surface Precipitation from Rain-Gauges built on GTS-based and Historical Data, doi: 10.5676/DWD_GPCC/FD_M_V2020_050, 2020.
[7] M. Cucchi, G. P. Weedon, A. Amici, N. Bellouin, S. Lange, H. Müller Schmied, H. Hersbach and C. Buontempo, "WFDE5: bias-adjusted ERA5 reanalysis data for impact studies", Earth Syst. Sci. Data, vol. 12, no. 3, pp. 2097–2120, doi: 10.5194/essd-12-2097-2020, 2020.
[8] J. C. Adam and D. P. Lettenmaier, "Adjustment of global gridded precipitation for systematic bias", Journal of Geophysical Research: Atmospheres, doi: 10.1029/2002JD002499, 2003.
[9] A. L. Buck, "New equations for computing vapour pressure and enhancement factor", Journal of Applied Meteorology and Climatology, doi: 10.1175/1520-0450(1981)020<1527:NEFCVP>2.0.CO;2, 1981.