Important information on CEMS Data Access via CDS

ECMWF has implemented a new state-of-the-art data access infrastructure to host the Climate and Atmosphere Data Stores (CDS and ADS, respectively). All layers of the infrastructure are being modernised: the front-end web interface, the back-end software engine, and the underlying cloud infrastructure hosting the service and core data repositories.

As part of this development, a new data store for the Copernicus Emergency Management Service (CEMS) has been created. The CEMS Early Warning Data Store (EWDS) will host all the historical and forecast information for floods and forest fires at European and Global levels. Users are encouraged to migrate their accounts and scripts to the new EWDS Beta before 26 September 2024, when the system will become operational.

For more information, please read: CEMS Early Warning Data Store (EWDS) is now live!

Updates to the EWDS documentation are ongoing as the implementation takes place.

The EWDS API is a Python service that enables access to CEMS-Flood data on the EWDS. It is ideal for users who retrieve large volumes of data or need to automate tasks. This page collects a number of scripts that can serve as blueprints for more user-specific requests.


EWDS API Installation

Instructions for installing and setting up the EWDS API can be found in How to use the EWDS API.
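Before running any of the scripts on this page, it can help to confirm that the API client is configured. The sketch below assumes the usual cdsapi convention of a configuration file with 'url:' and 'key:' lines (typically '.cdsapirc' in the home directory); the helper names read_api_config and missing_settings are hypothetical and not part of cdsapi.

```python
from pathlib import Path


def read_api_config(path):
    """Parse 'name: value' lines from a cdsapi-style configuration file."""
    config = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything without a separator.
        if not line or line.startswith("#") or ":" not in line:
            continue
        name, value = line.split(":", 1)
        config[name.strip()] = value.strip()
    return config


def missing_settings(config, required=("url", "key")):
    """Return the required settings that are absent or empty."""
    return [name for name in required if not config.get(name)]
```

For example, missing_settings(read_api_config(Path.home() / ".cdsapirc")) should return an empty list when both settings are present. The correct values for the file are given in How to use the EWDS API.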

To select the data to download, use the radio buttons on the 'Data Download' tab of the chosen dataset on the EWDS. Once a selection has been made on the form, click the 'Show API request' button to generate the API request. The Python code to be used to download the data is then shown at the bottom of the form.

How to run the scripts:

Copy the content of the script into a Python file (e.g. retrieve_<dataset>.py) and then launch it from a terminal:

user@host:~$ python retrieve_<dataset>.py


API script examples:

The following are some examples of API scripts to download the various CEMS-Flood datasets from the EWDS. Note the following changes with respect to the legacy CDS:

  • In the EWDS the keyword 'format' has been replaced by two keywords: 'data_format', which can be 'netcdf' or 'grib' (the GloFAS examples below use 'grib2'), and 'download_format', which can be 'zip' or 'unarchived'. This gives users more flexibility in the format of the returned data.
  • In the EWDS the hmonth/month value format has changed from full month names (["January"]) to two-digit numeric months (["01"]).
  • In the legacy CDS, the returned data included the boundary points of the specified area, whereas in the EWDS the data consists of the points inside the specified area. Please ensure that all the points of your region of interest are included in the bounding box.
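The first two changes can be applied mechanically to an existing request dictionary. The following is a minimal sketch, not an official migration tool: the helper name migrate_request is hypothetical, and the legacy 'format' values it recognises (for example 'grib' or 'netcdf4.zip') are assumptions. The accepted values for each dataset should be checked with the 'Show API request' button.

```python
MONTH_NAMES = (
    "january", "february", "march", "april", "may", "june",
    "july", "august", "september", "october", "november", "december",
)
# Map full month names to two-digit strings, e.g. "january" -> "01".
MONTH_NUMBERS = {name: f"{i:02d}" for i, name in enumerate(MONTH_NAMES, start=1)}


def _to_numeric_month(value):
    """Convert a month name to 'MM'; leave any other value unchanged."""
    return MONTH_NUMBERS.get(str(value).lower(), value)


def migrate_request(legacy):
    """Return a copy of a legacy-style request using the EWDS keywords."""
    request = dict(legacy)

    # Split the single 'format' keyword into 'data_format' and 'download_format'.
    # Assumption: legacy values look like 'grib', 'netcdf' or 'netcdf4.zip'.
    if "format" in request:
        legacy_format = str(request.pop("format")).lower()
        request["data_format"] = "netcdf" if "netcdf" in legacy_format else "grib"
        request["download_format"] = (
            "zip" if legacy_format.endswith(".zip") else "unarchived"
        )

    # Replace month names with numeric months for both month keywords.
    for key in ("month", "hmonth"):
        if key in request:
            values = request[key]
            if isinstance(values, str):
                request[key] = _to_numeric_month(values)
            else:
                request[key] = [_to_numeric_month(v) for v in values]

    return request
```

The third change (area boundaries) cannot be converted automatically, because it depends on the grid points the user needs; the bounding box has to be reviewed by hand.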

An example request:

import cdsapi

dataset = "efas-historical"
request = {
    "system_version": ["version_5_0"],
    "variable": ["river_discharge_in_the_last_6_hours"],
    "model_levels": "surface_level",
    "hyear": ["2023"],
    "hmonth": ["01"],
    "hday": [
        "01","02","03","04","05","06",
        "07","08","09","10","11","12",
        "13","14","15","16","17","18",
        "19","20","21","22","23","24",
        "25","26","27","28","29","30",
        "31"
    ],
    "time": ["00:00"],
    "data_format": "netcdf",
    "download_format": "zip"
}

client = cdsapi.Client()
client.retrieve(dataset, request).download()

This will return a zip archive containing a NetCDF file called 'data_version-5.nc'.
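When 'download_format' is 'zip', the archive has to be unpacked before the data can be read. A minimal sketch using only the Python standard library is given below; the helper name extract_download and the paths passed to it are placeholders chosen for illustration.

```python
import zipfile
from pathlib import Path


def extract_download(zip_path, target_dir):
    """Extract a downloaded archive and return the extracted file paths, sorted."""
    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
        names = archive.namelist()
    # Directory entries end with '/', so only regular files are returned.
    return sorted(target / name for name in names if not name.endswith("/"))
```

Calling extract_download('download.zip', 'efas_data') would unpack the archive into the efas_data folder and return the paths of the files it contains.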

EFAS Medium-range climatology

## === retrieve EFAS Medium-Range Climatology ===
import cdsapi


if __name__ == '__main__':

    c = cdsapi.Client()
    DATASET = 'efas-historical'

    VARIABLES = [
        'river_discharge_in_the_last_6_hours', 'snow_depth_water_equivalent',
    ]

    YEARS = ['%d' % (y) for y in range(1992, 2023)]
    MONTHS = ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']
    DAYS = ['%02d' % (d) for d in range(1, 32)]

    # one request (and one zip file) per variable and year
    for variable in VARIABLES:
        for year in YEARS:
            REQUEST = {
                'system_version': 'version_5_0',
                'variable': variable,
                'model_levels': 'surface_level',
                'hyear': year,
                'hmonth': MONTHS,
                'hday': DAYS,
                'time': '00:00',
                'data_format': 'grib',
                'download_format': 'zip'
            }
            c.retrieve(DATASET, REQUEST).download(f'{DATASET}_{variable}_{year}.zip')
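The loop above submits one request per variable and year, so a long run may be interrupted part-way through. One possible way to resume, sketched here under the assumption that a non-empty file on disk means a completed download, is to skip targets that already exist. The helper name pending_downloads is hypothetical; the file naming follows the script above.

```python
from pathlib import Path


def pending_downloads(dataset, variables, years, out_dir="."):
    """Yield (variable, year, target) for files that are not yet on disk."""
    for variable in variables:
        for year in years:
            target = Path(out_dir) / f"{dataset}_{variable}_{year}.zip"
            # Treat an existing, non-empty file as already downloaded.
            if target.exists() and target.stat().st_size > 0:
                continue
            yield variable, year, target
```

In the script, the two nested loops could then be replaced by a single loop over pending_downloads(DATASET, VARIABLES, YEARS), passing str(target) to download().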

EFAS Medium-range forecast

## === retrieve EFAS Medium-Range Forecast ===
 
import cdsapi
import datetime
 
def compute_dates_range(start_date,end_date,loop_days=True):
 
    start_date = datetime.date(*[int(x) for x in start_date.split('-')])
     
    end_date = datetime.date(*[int(x) for x in end_date.split('-')])
     
    ndays =  (end_date - start_date).days + 1
     
    dates = []
    for d in range(ndays):
        dates.append(start_date + datetime.timedelta(d))
     
    if not loop_days:
        # keep only the first day of each month
        dates = [i for i in dates if i.day == 1]
    return dates
 
if __name__ == '__main__':
 
 
    # start the client
    
    c = cdsapi.Client()
    
    # user inputs
    
    DATASET='efas-forecast'
    START_DATE = '2020-10-14' # first date with available data
    END_DATE = '2022-10-28'
    LEADTIMES =  [str(lt) for lt in range(0,372,6)]
 
    # loop over dates and save to disk
 
    dates = compute_dates_range(START_DATE,END_DATE)
 
    for date in dates:
 
        year  = date.strftime('%Y')
        month = date.strftime('%m')
        day   = date.strftime('%d')
 
        print(f"RETRIEVING: {year}-{month}-{day}-{DATASET}")
 
        REQUEST={
                'originating_centre':'ecmwf',
                'product_type':'ensemble_perturbed_forecasts',
                'variable': 'river_discharge_in_the_last_6_hours',
                'model_levels': 'surface_level',
                'year': year,
                'month': month,
                'day': day,
                'leadtime_hour':LEADTIMES,
                'time': '12:00',
                'data_format': "grib",
                'download_format': "unarchived"
                 }
        c.retrieve(DATASET, REQUEST).download(f'{DATASET}_{year}_{month}_{day}.grib')
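The LEADTIMES list in the script above runs from 0 to 366 hours in 6-hour steps (62 values). To relate each lead time to a calendar date, a small sketch follows; it assumes that lead times count hours from the forecast start given by the date and 'time' keywords, and the helper name valid_times is hypothetical.

```python
import datetime


def valid_times(year, month, day, time, leadtime_hours):
    """Return the datetime each lead time refers to, for one forecast start."""
    hour, minute = (int(part) for part in time.split(":"))
    start = datetime.datetime(int(year), int(month), int(day), hour, minute)
    return [start + datetime.timedelta(hours=int(lt)) for lt in leadtime_hours]
```

For the first forecast in the script (2020-10-14 at 12:00), the last lead time of 366 hours corresponds to 2020-10-29 at 18:00.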

GloFAS Medium-range climatology

## === retrieve GloFAS Medium-Range Climatology ===
 
import cdsapi
 
 
if __name__ == '__main__':
    c = cdsapi.Client()
    
    DATASET = 'cems-glofas-historical'
    YEARS = ['%d' % (y) for y in range(1979, 2023)]

    MONTHS = ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']
    DAYS = ['%02d' % (d) for d in range(1, 32)]

    # one request (and one zip file) per year
    for year in YEARS:
        REQUEST = {
            'system_version': 'version_4_0',
            'product_type': 'consolidated',
            'hydrological_model': 'lisflood',
            'variable': 'river_discharge_in_the_last_24_hours',
            'hyear': year,
            'hmonth': MONTHS,
            'hday': DAYS,
            'data_format': 'grib',
            'download_format': 'zip'
        }
        c.retrieve(DATASET, REQUEST).download(f'{DATASET}_{year}.zip')

GloFAS Medium-range forecast

## === retrieve GloFAS Medium-Range Forecast ===
 
import cdsapi
import datetime
 
 
 
def compute_dates_range(start_date,end_date,loop_days=True):
 
 
    start_date = datetime.date(*[int(x) for x in start_date.split('-')])
     
    end_date = datetime.date(*[int(x) for x in end_date.split('-')])
     
    ndays =  (end_date - start_date).days + 1
     
    dates = []
    for d in range(ndays):
        dates.append(start_date + datetime.timedelta(d))
     
    if not loop_days:
        # keep only the first day of each month
        dates = [i for i in dates if i.day == 1]
    return dates
 
 
 
if __name__ == '__main__':
 
 
    # start the client
    c = cdsapi.Client()
 
 
    # user inputs
    DATASET='cems-glofas-forecast'
    
    START_DATE = '2021-05-26'
 
    END_DATE = '2024-10-01'
 
    LEADTIMES =  [str(lt) for lt in range(24,744,24)]
 
 
    # loop over dates and save to disk
 
    dates = compute_dates_range(START_DATE,END_DATE)
 
    for date in dates:
 
        year  = date.strftime('%Y')
        month = date.strftime('%m')
        day   = date.strftime('%d')
 
        print(f"RETRIEVING: {year}-{month}-{day}-{DATASET}")
        REQUEST={
                'system_version':'operational',
                'hydrological_model': 'lisflood',
                'product_type':'control_forecast',
                'variable': 'river_discharge_in_the_last_24_hours',
                'year': year,
                'month': month,
                'day': day,
                'leadtime_hour':LEADTIMES,
                'data_format': "grib2",
                'download_format': "zip"
                 }
        c.retrieve(DATASET, REQUEST).download(f'{DATASET}_{year}_{month}_{day}.zip')

GloFAS Medium-range reforecast

## === retrieve GloFAS Medium-Range Reforecast ===
  
## === subset a region with a bounding box (see BBOX below) ===
  
  
import cdsapi
from datetime import datetime, timedelta

    
def get_monthsdays():
    # list [month, day] pairs for every Monday and Thursday in January of the reference year
    start, end = datetime(2024, 1, 1), datetime(2024, 1, 31)  # reference year 2024
    days = [start + timedelta(days=i) for i in range((end - start).days + 1)]
    monthday = [d.strftime("%m-%d").split("-") for d in days if d.weekday() in [0, 3]]

    return monthday

MONTHSDAYS = get_monthsdays()

  
if __name__ == '__main__':
    c = cdsapi.Client()
      
    # user inputs
    DATASET='cems-glofas-reforecast'
    BBOX = [35, -5, 30, 5]  # North, West, South, East
    YEARS  = ['%d'%(y) for y in range(2022,2023)]
    LEADTIMES = ['%d'%(l) for l in range(24,1128,24)]
      
    # submit request
    for md in MONTHSDAYS:
  
        month = md[0]
        day = md[1]
  
        REQUEST= {
                'system_version': ["version_4_0"],
                'variable': 'river_discharge_in_the_last_24_hours',
                'hydrological_model': 'lisflood',
                'product_type': 'control_reforecast',
                'area': BBOX,# < - subset
                'hyear': YEARS,
                'hmonth': month,
                'hday': day,
                'leadtime_hour': LEADTIMES,
                'data_format': "grib2",
                'download_format': "zip"
                 }
        c.retrieve(DATASET, REQUEST).download(f'{DATASET}_{month}_{day}.zip')
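The 'area' keyword in the script above takes a bounding box in North, West, South, East order. Because the EWDS returns only the points inside the box, one option is to build the box from the locations of interest and pad it by a margin. The sketch below is illustrative only: the helper name bounding_box and the default margin of 0.5 degrees are assumptions, and a suitable margin depends on the grid resolution of the dataset.

```python
def bounding_box(points, margin=0.5):
    """Return [north, west, south, east] enclosing (lat, lon) points plus a margin.

    The margin is in degrees; the result is clamped to valid latitudes and longitudes.
    """
    lats = [lat for lat, _ in points]
    lons = [lon for _, lon in points]
    north = min(max(lats) + margin, 90.0)
    south = max(min(lats) - margin, -90.0)
    west = max(min(lons) - margin, -180.0)
    east = min(max(lons) + margin, 180.0)
    return [north, west, south, east]
```

For instance, bounding_box([(30.0, -5.0), (35.0, 5.0)]) gives [35.5, -5.5, 29.5, 5.5], which can be passed directly as the 'area' value of a request.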


GloFAS Seasonal forecast

## === retrieve GloFAS Seasonal Forecast ===

import cdsapi

if __name__ == '__main__':
    c = cdsapi.Client()
    
    # user inputs
    DATASET = 'cems-glofas-seasonal'
    
    YEARS = ['%d' % (y) for y in range(2022, 2023)]
    MONTHS = ['%02d' % (m) for m in range(1, 13)]
    LEADTIMES = ['%d' % (l) for l in range(24, 2976, 24)]


    for year in YEARS:
        print(f'year_{year}')
        for month in MONTHS:
            print(f'Month_{month}')
            REQUEST = {
                'system_version': ['operational'],
                "hydrological_model": ["lisflood"],
                'variable': 'river_discharge_in_the_last_24_hours',
                'year': year,
                'month': month,
                'leadtime_hour': LEADTIMES,
                'area': [90, -180, -90, 180],
                'data_format': 'grib2',
                'download_format': 'unarchived'
            }
            c.retrieve(DATASET, REQUEST).download(f'{DATASET}_{year}_{month}.grib')

GloFAS Seasonal reforecast

## === retrieve GloFAS Seasonal Reforecast ===
  
## === subset South America/Amazon region ===
  
import cdsapi
  
if __name__ == '__main__':
  
  
    c = cdsapi.Client()
    
    # user inputs
    DATASET='cems-glofas-seasonal-reforecast'
  
    YEARS  = ['%d'%(y) for y in range(1981,2021)]
  
    MONTHS = ['01', '02', '03', '04', '05', '06', '07', '08', '09', '10', '11', '12']
  
    LEADTIMES = ['%d'%(l) for l in range(24,2976,24)]
      
    # one request (and one file) per year and month
    for year in YEARS:
        for month in MONTHS:
            REQUEST = {
                'system_version': 'version_4_0',
                'variable': 'river_discharge_in_the_last_24_hours',
                'hydrological_model': 'lisflood',
                'hyear': year,
                'hmonth': month,
                'leadtime_hour': LEADTIMES,
                'area': [10.95, -90.95, -30.95, -29.95],  # North, West, South, East
                'data_format': 'netcdf',
                'download_format': 'unarchived'
            }
            c.retrieve(DATASET, REQUEST).download(f'{DATASET}_{year}_{month}.netcdf')