...

Jupyter notebook demonstrating how the daily statistics are calculated

Daily statistics in the CDS

The following workflow demonstrates how to calculate the daily statistics from ERA5 data with earthkit.transforms. This is the methodology used by the derived daily statistics catalogue entries on the CDS.

Code Block
languagepy
import cdsapi
import xarray as xr
from earthkit.transforms.aggregate import temporal

Download some raw hourly data

Here we choose the ERA5 single levels 2m temperature data. We have chosen a coarse grid, an area sub-selection and a 6-hourly sampling to reduce the amount of data downloaded for the demonstration.

Code Block
languagepy
client = cdsapi.Client() 
dataset = "reanalysis-era5-single-levels"
request = {
    'product_type': ['reanalysis'],
    'variable': ['2m_temperature'],
    'date': '20240101/20240131',
    'time': ['00:00', '06:00', '12:00', '18:00'],
    'area': [60, -10, 50, 2],
    'grid': [1,1],
    'data_format': 'grib',
}
result_file = client.retrieve(dataset, request).download()
2024-09-10 15:52:51,773 INFO Request ID is cbe537cd-89ce-412d-9ea2-cd037046d979
2024-09-10 15:52:51,889 INFO status has been updated to accepted
2024-09-10 15:52:55,887 INFO status has been updated to successful
                                                                                       

Open the result file with xarray

The time dimension is set to "valid_time" via time_dims, which is in line with the backend of the CADS post-processing and netCDF conversion.

Code Block
languagepy
ds = xr.open_dataset(
    result_file, time_dims=["valid_time"]
)
print(ds)
<xarray.Dataset> Size: 72kB
Dimensions:     (valid_time: 124, latitude: 11, longitude: 13)
Coordinates:
    number      int64 8B ...
  * valid_time  (valid_time) datetime64[ns] 992B 2024-01-01 ... 2024-01-31T18...
    surface     float64 8B ...
  * latitude    (latitude) float64 88B 60.0 59.0 58.0 57.0 ... 52.0 51.0 50.0
  * longitude   (longitude) float64 104B -10.0 -9.0 -8.0 -7.0 ... 0.0 1.0 2.0
Data variables:
    t2m         (valid_time, latitude, longitude) float32 71kB ...
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2024-09-10T15:52 GRIB to CDM+CF via cfgrib-0.9.1...
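
As a quick sanity check, the dimensions printed above can be reproduced from the request parameters alone. The following is a minimal sketch in plain Python, assuming the 'area' list is ordered North, West, South, East and that both grid increments are 1 degree; the variable names are illustrative and not part of cdsapi.

```python
from datetime import date

# Values copied from the request above.
first_day, last_day = date(2024, 1, 1), date(2024, 1, 31)
times_per_day = ["00:00", "06:00", "12:00", "18:00"]
north, west, south, east = 60, -10, 50, 2  # assumed 'area' order: N, W, S, E
grid_step = 1  # degrees, same increment in both directions

# Number of time steps: every requested time on every requested day.
n_days = (last_day - first_day).days + 1
n_times = n_days * len(times_per_day)

# Number of grid points: both edges of the area are included.
n_lat = (north - south) // grid_step + 1
n_lon = (east - west) // grid_step + 1

expected_shape = (n_times, n_lat, n_lon)
print(expected_shape)  # (124, 11, 13)
```

This matches the (valid_time: 124, latitude: 11, longitude: 13) dimensions reported by xarray.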

Calculate the daily statistic

Use the temporal module from earthkit.transforms.aggregate to calculate the daily statistic of interest. The earthkit.transforms.aggregate API aims to be flexible enough to suit the programming styles of as many users as possible. Here we provide a handful of examples, but we encourage users to explore the earthkit documentation for more:

https://earthkit-transforms.readthedocs.io/en/latest/

Code Block
languagepy
titleDaily mean
ds_daily_mean = temporal.daily_mean(ds)
print(ds_daily_mean)
<xarray.Dataset> Size: 18kB
Dimensions:     (valid_time: 31, latitude: 11, longitude: 13)
Coordinates:
    number      int64 8B 0
    surface     float64 8B 0.0
  * latitude    (latitude) float64 88B 60.0 59.0 58.0 57.0 ... 52.0 51.0 50.0
  * longitude   (longitude) float64 104B -10.0 -9.0 -8.0 -7.0 ... 0.0 1.0 2.0
  * valid_time  (valid_time) datetime64[ns] 248B 2024-01-01 ... 2024-01-31
Data variables:
    t2m         (valid_time, latitude, longitude) float32 18kB 281.4 ... 279.3
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2024-09-10T15:52 GRIB to CDM+CF via cfgrib-0.9.1...


Code Block
languagepy
titleDaily standard deviation
ds_daily_std = temporal.daily_std(ds)
print(ds_daily_std)
<xarray.Dataset> Size: 18kB
Dimensions:     (valid_time: 31, latitude: 11, longitude: 13)
Coordinates:
    number      int64 8B 0
    surface     float64 8B 0.0
  * latitude    (latitude) float64 88B 60.0 59.0 58.0 57.0 ... 52.0 51.0 50.0
  * longitude   (longitude) float64 104B -10.0 -9.0 -8.0 -7.0 ... 0.0 1.0 2.0
  * valid_time  (valid_time) datetime64[ns] 248B 2024-01-01 ... 2024-01-31
Data variables:
    t2m         (valid_time, latitude, longitude) float32 18kB 0.157 ... 1.934
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2024-09-10T15:52 GRIB to CDM+CF via cfgrib-0.9.1...
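
To illustrate what a daily statistic means for 6-hourly data, the sketch below groups a made-up time series for a single grid point by calendar day and reduces each group. It is a plain-Python illustration of the idea, not the earthkit.transforms implementation; the sample values are hypothetical, and the use of the population standard deviation is an assumption that should be checked against the library documentation.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import fmean, pstdev

# Hypothetical 6-hourly series for one grid point (temperatures in K are made up).
start = datetime(2024, 1, 1)
samples = [
    (start + timedelta(hours=6 * i), 280.0 + (i % 4))
    for i in range(31 * 4)
]

# Group the samples by calendar day (UTC).
by_day = defaultdict(list)
for valid_time, value in samples:
    by_day[valid_time.date()].append(value)

# Reduce each day's samples to a single value.
daily_mean = {day: fmean(values) for day, values in by_day.items()}
# Assumption: population standard deviation (divide by N, not N - 1).
daily_std = {day: pstdev(values) for day, values in by_day.items()}

print(len(daily_mean))  # 31
```

The 124 input samples collapse to 31 daily values, mirroring the change in the valid_time dimension shown in the outputs above.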

How to handle non-UTC time zones

To calculate the daily statistics for a non-UTC time zone, we use the time_shift kwarg to shift the time to match the requested time zone. The time_shift can be provided as a dictionary or as a pandas Timedelta; we use a dictionary for ease of reading. The example below, {"hours": 6}, is for the time zone UTC+06:00.

In addition, remove_partial_periods is set to True so that the returned result only contains values computed from complete periods.

These arguments, along with all the other accepted arguments, are fully documented in the earthkit-transforms documentation:

https://earthkit-transforms.readthedocs.io/en/stable/_api/transforms/aggregate/temporal/index.html#transforms.aggregate.temporal.daily_mean

Code Block
languagepy
ds_daily_max = temporal.daily_max(
    ds, time_shift={"hours": 6}, remove_partial_periods=True
)
print(ds_daily_max)
<xarray.Dataset> Size: 18kB
Dimensions:     (valid_time: 30, latitude: 11, longitude: 13)
Coordinates:
    number      int64 8B 0
    surface     float64 8B 0.0
  * latitude    (latitude) float64 88B 60.0 59.0 58.0 57.0 ... 52.0 51.0 50.0
  * longitude   (longitude) float64 104B -10.0 -9.0 -8.0 -7.0 ... 0.0 1.0 2.0
  * valid_time  (valid_time) datetime64[ns] 240B 2024-01-02 ... 2024-01-31
Data variables:
    t2m         (valid_time, latitude, longitude) float32 17kB 282.0 ... 281.5
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    Conventions:             CF-1.7
    institution:             European Centre for Medium-Range Weather Forecasts
    history:                 2024-09-10T15:52 GRIB to CDM+CF via cfgrib-0.9.1...

Removing partial periods has resulted in the first day being lost from our initial data request: the first value of valid_time is now 2024-01-02. Similarly, if we had requested a negative time_shift (west of UTC), the final day would have been lost.
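
The effect of the time shift on partial days can be reproduced with timestamps alone. The sketch below is a conceptual illustration using only the Python standard library, not the earthkit.transforms implementation; complete_days is a hypothetical helper that shifts the 6-hourly UTC timestamps of the request and keeps only the local calendar days with a full set of four samples.

```python
from collections import Counter
from datetime import datetime, timedelta


def complete_days(utc_times, shift, samples_per_day=4):
    """Return the shifted calendar days that hold a full set of samples."""
    counts = Counter((t + shift).date() for t in utc_times)
    return sorted(day for day, n in counts.items() if n == samples_per_day)


# 6-hourly UTC timestamps for January 2024, as in the request above.
utc_times = [datetime(2024, 1, 1) + timedelta(hours=6 * i) for i in range(31 * 4)]

# A dictionary such as {"hours": 6} maps directly onto a timedelta.
east = complete_days(utc_times, timedelta(**{"hours": 6}))
west = complete_days(utc_times, timedelta(hours=-6))

print(east[0], east[-1], len(east))  # 2024-01-02 2024-01-31 30
print(west[0], west[-1], len(west))  # 2024-01-01 2024-01-30 30
```

With a shift of +6 hours, 1 January only has three samples in local time and is dropped; with a shift of -6 hours, it is 31 January that is incomplete.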

The derived daily catalogue entries adjust the data request to ensure that all days requested are included in the returned result file.

For the latest version, please see documentation/daily-statistics.ipynb on the main branch of the ecmwf-projects/cads-notebooks repository on GitHub.

Data organisation and access

...

For the latest version please see here: Daily accumulation for ERA5-land

Daily accumulation for non-UTC timezone for the ERA5-land accumulated variables

Notebook: documentation/daily_accumulation_for_era5_land.ipynb (cads-notebooks repository, main branch)



This document has been produced in the context of the Copernicus Climate Change Service (C3S).

The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.

The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which is merely representing the author's view.

...