There is an application on the Climate Data Store (CDS) which allows users to compute and download selected daily statistics of variables from a number of hourly ERA5 datasets: Daily statistics calculated from ERA5 data

It provides users with a simple tool to obtain ERA5 data aggregated at daily frequency without having to download the original sub-daily resolution data. The ERA5 data is subset to the selected rectangular spatial region of interest and sampled at the selected frequency. Time coordinates can optionally be shifted to a selected time zone. The data is then aggregated to a daily frequency using the selected statistic and returned to the user in a single netCDF file.

It is also possible to use the CDS API to request daily data from the CDS ERA5 daily application, as described in this user forum post:

https://confluence.ecmwf.int/x/BD6kDQ
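
The snippet below sketches what such an API request can look like. It is only an illustration: the workflow name, application identifier, parameter keys and accepted values shown here are assumptions based on the forum post above, so please consult that post for the exact, up-to-date request syntax.

    #!/usr/bin/env python
    """
    Illustrative sketch only: request a daily statistic from the CDS ERA5
    daily application through the CDS API. The project identifier and the
    keys/values in 'kwargs' are assumptions; see the forum post for the
    exact syntax.
    """
    import cdsapi

    c = cdsapi.Client()
    result = c.service(
        'tool.toolbox.orchestrator.workflow',
        params = {
            'realm'        : 'c3s',
            'project'      : 'app-c3s-daily-era5-statistics',  # assumed application name
            'version'      : 'master',
            'workflow_name': 'application',
            'kwargs'       : {
                'dataset'     : 'reanalysis-era5-single-levels',
                'product_type': 'reanalysis',
                'variable'    : '2m_temperature',
                'statistic'   : 'daily_mean',
                'year'        : '2017',
                'month'       : '01',
                'frequency'   : '1-hourly',
                'time_zone'   : 'UTC+00:00',
            },
        })
    c.download(result)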


This knowledge base article shows you how to calculate daily total precipitation using ERA5 data.

Before you continue, make sure you have read through the knowledge base articles listed below:

You should also know how to work with Python under Linux, in particular how to install packages using pip. It is recommended to use the latest release of the packages listed here:

  • CDS API (tested with 0.1.1) - required for step 1
  • netCDF4 (tested with 1.4.0) - required for step 2
  • numpy (tested with 1.14.5) - required for step 2

Step-by-step guide

  1. Use the script below to download hourly ERA5 total precipitation data for 1st and 2nd January 2017 from the CDS (Climate Data Store). Note that two days of data are needed to cover total precipitation for 1st January 2017:
    1. For 1st January 2017, time = 01 - 23 gives total precipitation data covering 00 - 23 UTC on 1st January 2017
    2. For 2nd January 2017, time = 00 gives total precipitation data covering 23 - 24 UTC on 1st January 2017
    #!/usr/bin/env python
    """
    Save as get-tp.py, then run "python get-tp.py".
     
    Input file : None
    Output file: tp_20170101-20170102.nc
    """
    import cdsapi
    
    c = cdsapi.Client()
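    # Request hourly total precipitation for 1 and 2 January 2017;
    # 00:00 on 2 January is needed to complete the 24-hour total for 1 January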
    r = c.retrieve(
        'reanalysis-era5-single-levels', {
                'variable'    : 'total_precipitation',
                'product_type': 'reanalysis',
                'year'        : '2017',
                'month'       : '01',
                'day'         : ['01', '02'],
                'time'        : [
                    '00:00','01:00','02:00',
                    '03:00','04:00','05:00',
                    '06:00','07:00','08:00',
                    '09:00','10:00','11:00',
                    '12:00','13:00','14:00',
                    '15:00','16:00','17:00',
                    '18:00','19:00','20:00',
                    '21:00','22:00','23:00'
                ],
                'format'      : 'netcdf'
        })
    r.download('tp_20170101-20170102.nc')
  2. Run the second script, below, to calculate the daily total precipitation. All it does is add up the 24 hourly values for a given day, as described in step 1. A short snippet after the scripts shows how you can inspect the resulting file.

    #!/usr/bin/env python
    """
    Save as file calculate-daily-tp.py and run "python calculate-daily-tp.py".
     
    Input file : tp_20170101-20170102.nc
    Output file: daily-tp_20170101.nc
    """
    import time, sys
    from datetime import datetime, timedelta
    
    from netCDF4 import Dataset, date2num, num2date
    import numpy as np
    
    day = 20170101
    d = datetime.strptime(str(day), '%Y%m%d')
    f_in = 'tp_%d-%s.nc' % (day, (d + timedelta(days = 1)).strftime('%Y%m%d'))
    f_out = 'daily-tp_%d.nc' % day
    
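    # Build the list of the 24 hourly time stamps (01:00 of the day up to and
    # including 00:00 of the next day) whose accumulated values form the daily total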
    time_needed = []
    for i in range(1, 25):
        time_needed.append(d + timedelta(hours = i))
    
    with Dataset(f_in) as ds_src:
        var_time = ds_src.variables['time']
        time_avail = num2date(var_time[:], var_time.units,
                calendar = var_time.calendar)
    
        indices = []
        for tm in time_needed:
            a = np.where(time_avail == tm)[0]
            if len(a) == 0:
                sys.stderr.write('Error: precipitation data is missing/incomplete - %s!\n'
                        % tm.strftime('%Y%m%d %H:%M:%S'))
                sys.exit(200)
            else:
                print('Found %s' % tm.strftime('%Y%m%d %H:%M:%S'))
                indices.append(a[0])
    
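        # Sum the 24 hourly accumulated values to obtain the daily total precipitation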
        var_tp = ds_src.variables['tp']
        tp_values_set = False
        for idx in indices:
            if not tp_values_set:
                data = var_tp[idx, :, :]
                tp_values_set = True
            else:
                data += var_tp[idx, :, :]
            
        with Dataset(f_out, mode = 'w', format = 'NETCDF3_64BIT_OFFSET') as ds_dest:
            # Dimensions
            for name in ['latitude', 'longitude']:
                dim_src = ds_src.dimensions[name]
                ds_dest.createDimension(name, dim_src.size)
                var_src = ds_src.variables[name]
                var_dest = ds_dest.createVariable(name, var_src.datatype, (name,))
                var_dest[:] = var_src[:]
                var_dest.setncattr('units', var_src.units)
                var_dest.setncattr('long_name', var_src.long_name)
    
            ds_dest.createDimension('time', None)
            var = ds_dest.createVariable('time', np.int32, ('time',))
            time_units = 'hours since 1900-01-01 00:00:00'
            time_cal = 'gregorian'
            var[:] = date2num([d], units = time_units, calendar = time_cal)
            var.setncattr('units', time_units)
            var.setncattr('long_name', 'time')
            var.setncattr('calendar', time_cal)
    
            # Variables
            var = ds_dest.createVariable(var_tp.name, np.double, var_tp.dimensions)
            var[0, :, :] = data
            var.setncattr('units', var_tp.units)
            var.setncattr('long_name', var_tp.long_name)
    
            # Attributes
            ds_dest.setncattr('Conventions', 'CF-1.6')
            ds_dest.setncattr('history', '%s %s'
                    % (datetime.now().strftime('%Y-%m-%d %H:%M:%S'),
                    ' '.join(time.tzname)))
    
            print('Done! Daily total precipitation saved in %s' % f_out)
    
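As a quick sanity check, you can open the output file with the same netCDF4 package used above and print the date and the range of the daily totals. The snippet below is optional and simply reads back the file written by calculate-daily-tp.py:

    #!/usr/bin/env python
    """
    Optional check: read daily-tp_20170101.nc produced by calculate-daily-tp.py
    and print the date and the minimum/maximum daily total precipitation.
    """
    from netCDF4 import Dataset, num2date

    with Dataset('daily-tp_20170101.nc') as ds:
        var_time = ds.variables['time']
        var_tp = ds.variables['tp']
        date = num2date(var_time[:], var_time.units, calendar = var_time.calendar)[0]
        print('Date  : %s' % date)
        print('tp min: %g %s' % (var_tp[0, :, :].min(), var_tp.units))
        print('tp max: %g %s' % (var_tp[0, :, :].max(), var_tp.units))
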
For simplicity, data in the output netCDF file of the second script is unpacked. You may want to pack the data to save some disk space. Refer to https://www.unidata.ucar.edu/software/netcdf/workshops/2010/bestpractices/Packing.html for detailed information.
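
As an illustration of packing (not part of the workflow above), the sketch below rewrites the daily total into a new file, storing tp as 16-bit integers with CF scale_factor/add_offset attributes computed from the data range. File and variable names follow the scripts above; the output file name is just an example.

    #!/usr/bin/env python
    """
    Minimal packing illustration: store tp as 16-bit integers with
    scale_factor/add_offset attributes derived from the data range.
    Input file : daily-tp_20170101.nc
    Output file: daily-tp_20170101_packed.nc (name is an example)
    """
    import numpy as np
    from netCDF4 import Dataset

    with Dataset('daily-tp_20170101.nc') as ds_src, \
            Dataset('daily-tp_20170101_packed.nc', mode = 'w') as ds_dest:
        # Copy dimensions and coordinate variables unchanged
        for name in ['latitude', 'longitude', 'time']:
            dim_src = ds_src.dimensions[name]
            ds_dest.createDimension(name, None if dim_src.isunlimited() else dim_src.size)
            var_src = ds_src.variables[name]
            var_dest = ds_dest.createVariable(name, var_src.datatype, (name,))
            var_dest[:] = var_src[:]
            var_dest.setncatts({k: var_src.getncattr(k) for k in var_src.ncattrs()})

        # Pack tp into int16; netCDF4 applies scale_factor/add_offset
        # automatically on write because auto-scaling is on by default
        var_src = ds_src.variables['tp']
        data = var_src[0, :, :]
        vmin, vmax = float(data.min()), float(data.max())
        scale_factor = (vmax - vmin) / (2 ** 16 - 2) if vmax > vmin else 1.0
        add_offset = (vmax + vmin) / 2
        var_dest = ds_dest.createVariable('tp', np.int16, var_src.dimensions,
                fill_value = np.int16(-32768))
        var_dest.setncattr('scale_factor', scale_factor)
        var_dest.setncattr('add_offset', add_offset)
        var_dest.setncattr('units', var_src.units)
        var_dest.setncattr('long_name', var_src.long_name)
        var_dest[0, :, :] = data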

This document has been produced in the context of the Copernicus Climate Change Service (C3S).

The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.

The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which merely represents the author's view.