Introduction

Global Climate Models (GCMs) can provide reliable climate information on global, continental and large regional scales. Such scales can cover vastly differing landscapes (from very mountainous terrain to flat coastal plains, for example) with greatly varying potential for floods, droughts or other extreme events. The horizontal resolution of GCMs limits their ability to address smaller scales, from regional to local. Regional Climate Models (RCMs), applied at higher spatial resolution over a limited area and driven by GCMs, can provide more appropriate information on such smaller scales, supporting more detailed impact and adaptation assessment and planning. RCMs therefore have an important role to play by providing projections with much greater detail and a more accurate representation of localized extreme events.

Regional climate projections are the results of regional climate model simulations generated by multiple independent climate research centres in the framework of the Coordinated Regional Climate Downscaling Experiment (CORDEX), supported by the World Climate Research Programme (WCRP) and assessed by the Intergovernmental Panel on Climate Change (IPCC). These regional climate projections underpin the conclusion of the IPCC 5th Assessment Report (published in 2014) that “Continued emission of greenhouse gases will cause further warming and long-lasting changes in all components of the climate system, increasing the likelihood of severe, pervasive and irreversible impacts for people and ecosystems”.

The regional climate projections in the Climate Data Store (CDS) are a quality-controlled subset of the wider CORDEX dataset over Europe. The CORDEX vision is to advance and coordinate the science and application of regional climate downscaling through global partnerships. It aims to evaluate regional climate model performance through a set of experiments designed to produce regional climate projections. The goals of CORDEX are:

  1. To better understand relevant regional/local climate phenomena, their variability and changes, through downscaling,
  2. To evaluate and improve regional climate downscaling models and techniques,
  3. To produce coordinated sets of regional downscaled projections worldwide,
  4. To foster communication and knowledge exchange with users of regional climate information.

A set of 26 core variables (17 for non-European domains, corresponding to surface fields; see the table below) from the CORDEX archive was identified for the CDS. These are the most used of the CORDEX data. The variables are provided from 5 CORDEX experiment types (evaluation, historical and 3 RCP scenarios) that are derived (downscaled) from the CMIP5 experiments. For the European domains, 3-hourly, daily and monthly information is provided (GL: please check), whereas only daily information is provided for non-European domains.

ANDRAS: I think we have to indicate here that for the non-European domains we provide fewer variables, and maybe indicate in the table below which ones are only for the EURO-CORDEX or Med-CORDEX domains.

GL: That's difficult and also depends on the RCMs. I propose to at least indicate into the table of variable the one we "try" to provide for each domain (basically one additional column per domain).

ANDRAS: I thought the non-European domains will use the same set of variables, is that a wrong assumption? Jose, would you check this, please?

JOSE: The non-European domains include the set of 15 variables (plus land-sea mask and orography), when available (not all models provide all variables).

The CDS subset of CORDEX data has been through a metadata quality control procedure which ensures a high standard of reliability of the data. Similar data may, for example, be found in the main CORDEX archive at the ESGF (Earth System Grid Federation); however, these data come with no quality assurance and may have metadata errors or omissions. The quality control process means that the CDS subset of CORDEX data is further reduced to exclude data that have metadata errors or inconsistencies. It is important to note that passing the quality control should not be confused with validity: for example, a file may have fully compliant metadata but still contain gross errors in the data that have not been noted. In other words, the quality control is purely technical and does not include any scientific evaluation (for instance, consistency checks).

ANDRAS: I think, here we have to mention that we also publish data, which had not been available so far in the ESGF. GL : Is there any dataset in this case? Euro-CORDEX comes from the ESGF, even the new simulations from PRINCIPLES that are published on the ESGF first. Med-CORDEX will be published on the ESGF in a second step after the 34b Lot 1. And for 34d data, non-ESGF data will be published on the ESGF and ESGF data have been just Qc-ed for CDS. So I don't see any dataset that exist on the CDS which is/will be not on the ESGF. ANDRAS: OK, maybe some text as proposed below would be sufficient.

We can mention the additional effort (of 34d) to find the additional data for non-European domains. Maybe the link to the IPCC Atlas should be also mentioned. GL: I agree. Jose, would you provide some additional text about the IPCC Atlas, a link, possibly?

JOSE: I included that information below, after describing the efforts (and funding) devoted to support CORDEX activities. Unfortunately there is no link yet since the information regarding the report is confidential (to prevent leaking) until it is released (in July 2021).

Additional efforts (and funding) were devoted to supporting CORDEX activities by 1) providing support to archive in the ESGF relevant simulations available from the modeling centers for non-European domains that had not been published in the ESGF earlier, and 2) making new simulations for the EURO-CORDEX domains. These activities are contributing to a significant enhancement of the regional climate model matrix over different domains in terms of emission scenarios, global model forcing and regional climate models. For the non-European domains, resources were put into finding simulations that were not available before (and had not been published in the ESGF earlier).

The effort made by Copernicus to consolidate a worldwide CORDEX dataset is also contributing to the IPCC AR6 WGI activities by providing a curated dataset to be assessed together with global climate information from CMIP experiments. In particular, it contributes to the Interactive Atlas, a new IPCC product that allows exploration of observed and projected climate data to complement the assessment of relevant datasets undertaken in the WGI chapters.

In addition, CORDEX data in the CDS include Persistent Identifiers (PIDs) in their metadata, which allow CDS users to report any error found during scientific analysis. Errors will at least be documented on the ESGF Errata Service (http://errata.es-doc.org), and documenting them in the CDS is also planned. The CDS aims to publish only the latest versions of the datasets.

Domains

We aim to publish CORDEX domains covering the entire world. At the moment, the CDS-CORDEX subset consists of the Europe (EURO), Mediterranean (MED), North America (NAM) and Arctic (ARC) CORDEX domains. The full list of CORDEX domains can be found at https://cordex.org/domains/; more details on the EURO-CORDEX activities are available at https://www.euro-cordex.net/

Please note that the domains are not on regular grids. Projections may differ depending on the domain and the Regional Climate Model (RCM). The coordinates below are the maximum and minimum values of the domain window. As a summary, the available domains are:

Name | Short name | Southernmost latitude | Northernmost latitude | Westernmost longitude | Easternmost longitude | Horizontal resolution (degrees)
Europe | EUR-11 | 27°N | 72°N | 22°W | 45°E | 0.11° x 0.11°
Mediterranean | MED-11 | 25°N | 52°N | 21°W | 50°E | 0.11° x 0.11°
Mediterranean | MED-44 | 25°N | 52°N | 21°W | 50°E | 0.44° x 0.44°
North America | NAM-22 | 12°N | 59°N | 171°W | 24°W | 0.22° x 0.22°
North America | NAM-44 | 12°N | 59°N | 171°W | 24°W | 0.44° x 0.44°
Arctic | ARC-22 | 46°N | 90°N | 180°W | 180°E | 0.22° x 0.22°
Arctic | ARC-44 | 46°N | 90°N | 180°W | 180°E | 0.44° x 0.44°
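
As a rough illustration of how the domain windows listed above might be used, the sketch below checks which windows contain a given point. The dictionary simply transcribes the table (west longitudes written as negative values); the names DOMAIN_WINDOWS and domains_covering are hypothetical and not part of any CDS tool. Because the domains are not on regular grids, falling inside a window is only a first indication and does not guarantee that the model grid covers the point.

```python
# Bounding windows transcribed from the domain table (degrees).
# Order: south latitude, north latitude, west longitude, east longitude.
# Longitudes west of Greenwich are negative.
DOMAIN_WINDOWS = {
    "EUR-11": (27.0, 72.0, -22.0, 45.0),
    "MED-11": (25.0, 52.0, -21.0, 50.0),
    "MED-44": (25.0, 52.0, -21.0, 50.0),
    "NAM-22": (12.0, 59.0, -171.0, -24.0),
    "NAM-44": (12.0, 59.0, -171.0, -24.0),
    "ARC-22": (46.0, 90.0, -180.0, 180.0),
    "ARC-44": (46.0, 90.0, -180.0, 180.0),
}


def domains_covering(lat, lon):
    """Return the short names of domains whose window contains (lat, lon)."""
    hits = []
    for name, (south, north, west, east) in DOMAIN_WINDOWS.items():
        if south <= lat <= north and west <= lon <= east:
            hits.append(name)
    return hits


# Example: a point at roughly 40.4°N, 3.7°W.
print(domains_covering(40.4, -3.7))
```

A point that lies outside every window is certainly not covered; a point inside a window still needs to be checked against the coordinates stored in the data file itself.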


Experiments

The CDS-CORDEX subset consists of the following CORDEX experiments partly derived from the CMIP5 ones:

Driving Global Climate Models and Regional Climate Models

Regional Climate Model (RCM) simulations need lateral boundary conditions from Global Climate Models (GCMs). At the moment, the boundary conditions for the CDS-CORDEX subset are extracted from CMIP5 global projections. In general, the CORDEX framework requires each RCM to downscale a minimum of 3 GCMs for 2 scenarios (at least RCP8.5 and RCP2.6). ANDRAS: please check this, if this is still valid for the non-European domains. Jose, would you check this, please?

JOSE: I think that was not mandatory in the experimental proposal so we are not constraining the simulations on that (we include all simulations which provide historical + some scenario), so I propose removing this sentence.

The C3S-EURO-CORDEX subset aims to fill the gaps in this matrix between GCMs (aka "driving models"), RCMs and RCPs. This will ensure a better representation of the uncertainties coming from GCMs, RCMs and RCP scenarios and make it possible to study regional climate change signals in a more comprehensive fashion.

The driving GCMs and the RCMs included in the CDS-CORDEX subsets for the different domains are detailed in the table below. Note that the ensembles for different domains are formed by different GCM and RCM combinations from the main CMIP5 and CORDEX archives, respectively. These include 8 GCMs and 13 RCMs for EURO-CORDEX, 8 GCMs and 8 RCMs for North America CORDEX, and 5 GCMs and 6 RCMs for the Arctic. Please note that a small number of models were not included because those data have a research-only restriction on their use, while the data presented in the CDS are released without any restriction.

JOSE: I think the GCM and RCM numbers for Europe need to be updated. GL please check.

ANDRAS: we will need an additional paragraph here for the non-European domains and of course similar tables for the other domains below the EURO-CORDEX one. 

GL: I let Manuel for the additional paragraph. I will update the table of simulations once we agreed on the different decisions to take (cf. our mail thread about this).

Jose, would you propose an additional paragraph, here?

Jose: I included details in the paragraph above (in red).

JOSE: I see that you haven't included non-commercial datasets at COPERNICUS. I guess this would be the same for all domains (we talked about this a couple of times, but I was not aware that this decision was made; I am in favor, no problem with that, just to know). This has no implications for NAM and ARC (all are unrestricted) but might have for other domains (I will check the impact and let you know).



Driving Global Coupled Models: HadGEM2-ES, EC-EARTH, CNRM-CM5, NorESM1-M, MPI-ESM-LR, IPSL-CM5A-MR, CanESM2, MIROC5

Regional Climate Models: RCA4 (SMHI), CCLM-8-17 (ETH), crCLIM-v1-1-1 (ETH), REMO2009 (GERICS), REMO2015 (GERICS), RACMO22E (KNMI), HIRHAM5 (DMI), WRF361H (UHOH), WRF381P (IPSL), ALADIN53 (CNRM), ALADIN63 (CNRM), RegCM4.6.1 (ICTP), HadGEM3-GA7-05 (MOHC)

[Table: matrix of driving GCMs (columns) against RCMs (rows) for RCP26, RCP45 and RCP85. Each cell gives the number of simulations, [0-9].]

The 13 Regional Climate Models that ran simulations over the European domain will be documented through the Earth System Documentation (ES-DOC), which provides a standardised and easy way to document climate models.

Dataset numbers (simulation version)

ANDRAS: I think, this can be a bit confusing, since above we mention that we publish only the latest version. So somehow we have to explain clearly what is the difference between model version and dataset number.

GL: As said in my email, this could be delegated to the Errata Service and we can remove this paragraph. What do you think? I will create an issue on the Errata test instance to show what it looks like.

In general, in the CDS form for the RCM simulations, “v” enumerates runs and NOT model versions. For the DMI, KNMI and SMHI runs, numbers other than “v1” denote new simulations relative to the first (“v1”) one; they do not necessarily denote a new model version. Hereafter we describe the meaning of the different dataset numbers for those models that have them.

Ensembles

The boundary conditions used to run an RCM are also identified by the model member of the CMIP5 simulation used. Each modeling centre typically runs the same experiment using the same GCM several times to confirm the robustness of results and to inform sensitivity studies through the generation of statistical information. A model and its collection of runs is referred to as an ensemble. Within these ensembles, three different categories of sensitivity studies are done, and the resulting individual model runs are labelled by three integers indexing the experiments in each category.

Each member of an ensemble is identified by a triad of integers associated with the letters r, i and p which index the “realization”, “initialization” and “physics” variations respectively. For instance, the member "r1i1p1" and the member "r1i1p2" for the same model and experiment indicate that the corresponding simulations differ since the physical parameters of the model for the second member were changed relative to the first member. 

It is very important to distinguish between variations in experiment specifications, which are globally coordinated across all the models contributing to CMIP5, and the variations which are adopted by each modeling team to assess the robustness of their own results. The “p” index refers to the latter, with the result that values have different meanings for different models, but in all cases these variations must be within the constraints imposed by the specifications of the experiment. 

For the scenario experiments, the ensemble member identifier is preserved from the historical experiment providing the initial conditions, so RCP 4.5 ensemble member “r1i1p2” is a continuation of historical ensemble member “r1i1p2”.

For CORDEX data, the ensemble member is equivalent to the ensemble member of the CMIP5 simulation used to extract boundary conditions.
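
The member labels described above can be split mechanically into their three indices. The following is a minimal sketch, assuming labels always follow the r<N>i<M>p<L> form; the function names parse_member and differing_indices are hypothetical and only serve the illustration.

```python
import re

# Pattern for a CMIP5-style ensemble member label such as "r1i1p2".
_MEMBER_RE = re.compile(r"^r(\d+)i(\d+)p(\d+)$")


def parse_member(label):
    """Split a label like 'r1i1p2' into its three integer indices."""
    match = _MEMBER_RE.match(label)
    if match is None:
        raise ValueError(f"not an r<N>i<M>p<L> label: {label!r}")
    realization, initialization, physics = (int(g) for g in match.groups())
    return {
        "realization": realization,
        "initialization": initialization,
        "physics": physics,
    }


def differing_indices(label_a, label_b):
    """Name the indices in which two ensemble members differ."""
    a, b = parse_member(label_a), parse_member(label_b)
    return [key for key in a if a[key] != b[key]]


print(parse_member("r12i1p1"))
print(differing_indices("r1i1p1", "r1i1p2"))
```

For the pair used in the text, "r1i1p1" and "r1i1p2", only the physics index differs. As noted above, the meaning of a given "p" value is specific to each model.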

List of published parameters

ANDRAS: maybe here we can indicate with bold face those variables, which are available only for the EURO-CORDEX domain GL : see my above comment on this.

Jose: I have done that and included the paragraph below. Please check.

The table below lists the variables provided (in boldface those provided for all domains; the rest are provided for Europe only) at 3-hourly, daily and monthly temporal scales (only daily for non-European domains, and for tasmin and tasmax for all domains). Note that sftlf and orog are static, time-independent fields.

Name | Short name | Units | Description
2m temperature | tas | K | The temperature of the air near the surface (or ambient temperature). The data represents the mean over the aggregation period at 2m above the surface.
200hPa temperature | ta200 | K | The temperature of the air at 200hPa. The data represents the mean over the aggregation period at 200hPa pressure level.
Minimum 2m temperature in the last 24 hours | tasmin | K | The minimum temperature of the air near the surface. The data represents the daily minimum over the aggregation period at 2m above the surface. ANDRAS: I guess this variable is available only for daily data, is that correct? Please check! JOSE: Yes for 34d (all are daily)
Maximum 2m temperature in the last 24 hours | tasmax | K | The maximum temperature of the air near the surface. The data represents the daily maximum over the aggregation period at 2m above the surface. ANDRAS: I guess this variable is available only for daily data, is that correct? Please check! JOSE: Yes for 34d (all are daily)
Mean precipitation flux | pr | kg.m-2.s-1 | The deposition of water to the Earth's surface in the form of rain, snow, ice or hail. The precipitation flux is the mass of water per unit area and time. The data represents the mean over the aggregation period.
2m surface relative humidity | hurs | % | The relative humidity is the percentage ratio of the water vapour mass to the water vapour mass at the saturation point given the temperature at that location. The data represents the mean over the aggregation period at 2m above the surface.
2m surface specific humidity | huss | Dimensionless | The amount of moisture in the air at 2m above the surface divided by the amount of air plus moisture at that location. The data represents the mean over the aggregation period at 2m above the surface.
Surface pressure | ps | Pa | The air pressure at the lower boundary of the atmosphere. The data represents the mean over the aggregation period.
Mean sea level pressure | psl | Pa | The air pressure at sea level. In regions where the Earth's surface is above sea level the surface pressure is used to compute the air pressure that would exist at sea level directly below given a constant air temperature from the surface to the sea level point. The data represents the mean over the aggregation period.
10m Wind Speed | sfcWind | m.s-1 | The magnitude of the two-dimensional horizontal air velocity. The data represents the mean over the aggregation period at 10m above the surface.
Surface solar radiation downwards | rsds | W.m-2 | The downward shortwave radiative flux of energy per unit area. The data represents the mean over the aggregation period at the surface.
Surface thermal radiation downward | rlds | W.m-2 | The downward longwave radiative flux of energy per unit area incident on the surface from above. The data represents the mean over the aggregation period.
Surface upwelling shortwave radiation | rsus | W.m-2 | The upward shortwave radiative flux of energy from the surface per unit area. The data represents the mean over the aggregation period at the surface.
Total cloud cover | clt | Dimensionless | Total refers to the whole atmosphere column, as seen from the surface or the top of the atmosphere. Cloud cover refers to the fraction of horizontal area occupied by clouds. The data represents the mean over the aggregation period.
500hPa geopotential | zg500 | m | The gravitational potential energy per unit mass normalized by the standard gravity at 500hPa at the same latitude. The data represents the mean over the aggregation period at 500hPa pressure level.
10m u-component of wind | uas | m.s-1 | The magnitude of the eastward component of the wind. The data represents the mean over the aggregation period at 10m above the surface.
10m v-component of wind | vas | m.s-1 | The magnitude of the northward component of the wind. The data represents the mean over the aggregation period at 10m above the surface.
200hPa u-component of the wind | ua200 | m.s-1 | The magnitude of the eastward component of the wind. The data represents the mean over the aggregation period at 200hPa pressure level.
200hPa v-component of the wind | va200 | m.s-1 | The magnitude of the northward component of the wind. The data represents the mean over the aggregation period at 200hPa pressure level.
850hPa u-component of the wind | ua850 | m.s-1 | The magnitude of the eastward component of the wind. The data represents the mean over the aggregation period at 850hPa pressure level.
850hPa v-component of the wind | va850 | m.s-1 | The magnitude of the northward component of the wind. The data represents the mean over the aggregation period at 850hPa pressure level.
Total run-off flux | mrro | kg.m-2.s-1 | The mass of surface and sub-surface liquid water per unit area and time, which drains from land. The data represents the mean over the aggregation period.
Mean evaporation flux | evspsbl | kg.m-2.s-1 | The mass of surface and sub-surface liquid water per unit area and time, which evaporates from land. The data includes conversion to vapour phase from both the liquid and solid phase, i.e., includes sublimation, and represents the mean over the aggregation period.
Land area fraction | sftlf | % | The percentage of the surface occupied by land, aka land/sea mask. The data is time-independent.
Orography | orog | m | The surface altitude in the model. The data is time-independent.
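
The units listed above are often converted before use, for instance temperature from K to °C and precipitation flux from kg.m-2.s-1 to mm per day. The sketch below shows the arithmetic only; the function names are hypothetical, and the precipitation conversion assumes that 1 kg of liquid water spread over 1 m2 forms a layer about 1 mm deep (water density of roughly 1000 kg.m-3).

```python
SECONDS_PER_DAY = 86400.0
KELVIN_OFFSET = 273.15


def kelvin_to_celsius(temp_k):
    """Convert a temperature such as tas, tasmin or tasmax from K to degC."""
    return temp_k - KELVIN_OFFSET


def flux_to_mm_per_day(flux):
    """Convert a water flux such as pr from kg m-2 s-1 to mm/day.

    Assumes 1 kg m-2 of liquid water is equivalent to a 1 mm layer.
    """
    return flux * SECONDS_PER_DAY


# Illustrative values only, not taken from any CORDEX file.
print(round(kelvin_to_celsius(288.15), 2))
print(round(flux_to_mm_per_day(3.0e-5), 3))
```

The same factor of 86400 applies to the other fluxes given in kg.m-2.s-1 (mrro and evspsbl) under the same density assumption.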

Data Format

The CDS subset of CORDEX data is provided as NetCDF files. NetCDF (Network Common Data Form) is a freely available file format commonly used in the climate modeling community. For more details, see: What are NetCDF files and how can I read them

A CORDEX NetCDF file in the CDS contains: 

The metadata provided in NetCDF files adhere to the Climate and Forecast (CF) conventions. The rules within the CF-conventions ensure consistency across data files, for example ensuring that the naming of variables is consistent and that the use of variable units is consistent.

File naming conventions

ANDRAS: please check, if this convention is also held for the non-European domains. I also think that there might be some amendment needed, because of the domain name and the resolution. Jose, would you check this, please?

When you download a CORDEX file from the CDS it will have a naming convention that is as follows:

<variable>_<domain>_<driving-model>_<experiment>_<ensemble_member>_<rcm-model>_<rcm-run>_<time-frequency>_<temporal-range>.nc
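
As an illustration of the convention above, a file name can be split back into its components. This is a minimal sketch, assuming that all nine components are present and that none of them contains an underscore; the function name parse_cordex_filename and the example file name are hypothetical and do not refer to an actual CDS file.

```python
# Component names, in the order given by the naming convention.
FIELDS = (
    "variable",
    "domain",
    "driving_model",
    "experiment",
    "ensemble_member",
    "rcm_model",
    "rcm_run",
    "time_frequency",
    "temporal_range",
)


def parse_cordex_filename(filename):
    """Map each component of a CORDEX-style file name to its value."""
    if not filename.endswith(".nc"):
        raise ValueError(f"expected a .nc file name: {filename!r}")
    parts = filename[: -len(".nc")].split("_")
    if len(parts) != len(FIELDS):
        raise ValueError(
            f"expected {len(FIELDS)} components, found {len(parts)}"
        )
    return dict(zip(FIELDS, parts))


# Hypothetical file name built to match the pattern.
example = (
    "tas_EUR-11_ICHEC-EC-EARTH_rcp85_r12i1p1_SMHI-RCA4_v1_day_"
    "20060101-20101231.nc"
)
for key, value in parse_cordex_filename(example).items():
    print(f"{key}: {value}")
```

File names that omit a component (for example, if a time-independent field carries no temporal range) would need separate handling, which this sketch does not attempt.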

JOSE: The CORDEX DRS indicates that the name of the institution should be an appendix of the <driving-model> and <rcm-model>. In 34d we are homogenizing this, including the institution in cases that it is missing. Thus, we have "ICHEC-EC-EARTH_r12i1p1" and not "EC-EARTH_r12i1p1" and also "NCAR-WRF_v3.5.1" and not "WRF_v3.5.1". Therefore the GCMs and the RCMs will be listed in the widget in the full form. We could change that if needed, but we would need to keep coherence (some of the current ESGF datasets use one approach and some use the other; that will be harmonized for the CDS). The advantage of leaving the institution is that we could have the same RCM driven by different institutions and that is useful information for users. Maybe the institution name in the case of the GCM can be dropped (it is less relevant and may be confusing, since there are many cases where the model has the same name as the institution).

Where

Quality control of the CDS-CORDEX subset

ANDRAS: any additional information for QC of the non-European domains, particularly those which were not in the ESGF? GL: 34d has a beautiful diagram on the different QC steps that could be interesting to copy-paste here. Jose, would you add this here, please?

The CDS subset of the CORDEX data have been through a set of quality control checks before being made available through the CDS. The objective of the quality control process is to ensure that all files in the CDS meet a minimum standard. Data files were required to pass all stages of the quality control process before being made available through the CDS. Data files that fail the quality control process are excluded from the CDS-CORDEX subset or if possible the error is corrected and a note made in the history attribute of the file. The quality control of the CDS-CORDEX subset checks for metadata errors or inconsistencies against the Climate and Forecast (CF) Conventions and a set of CORDEX specific file naming and file global metadata conventions.

Various software tools have been used to check the metadata:

The figure below shows a scheme that classifies the tests performed by the QA-DKRZ tool in twelve categories (in green) showing the specific tests/checks in each case.

The data within the files were not individually checked, therefore it is important to note that passing of these quality control tests should not be confused with validity: for example, it will be possible for a file to be fully CF compliant and have fully compliant metadata but contain gross errors in the data that have not been revealed.

Known issues

Background documents and user guides

A very useful User Guide prepared by the EURO-CORDEX community provides guidance on how to use EURO-CORDEX climate projection data. Please note that, at this stage, the data download part of that document refers only to accessing the data directly from the ESGF. The data can also be downloaded from the CDS, and this information will soon be added to the document. The EURO-CORDEX User Guide is available at https://www.euro-cordex.net/imperia/md/content/csc/cordex/euro-cordex-guidelines-version1.0-2017.08.pdf

The documents below were provided by the data supplier as background information on the creation of the CORDEX data stored in the Climate Data Store (CDS) for the benefit of the CORDEX data users.

This report documents how the experiment is designed in terms of which GCM-RCM-RCP combinations to run in the project. This was produced quite some time ago, therefore the presented information is not fully up-to-date, but nevertheless provides a fairly good idea about the concept for designing new experiments. 

This report documents results from the 34 EURO-CORDEX RCP8.5 simulations. For a number of European subregions we present patterns describing the regional climate change in relation to the change in global mean temperature. These patterns are derived as the linear fit between regional climate change and change in global mean temperature. This is a commonly used method and can be seen as the standard definition of pattern scaling used in the scientific literature. For the calculation of these patterns the climate change signal was derived for three different time windows (2011-2040, 2041-2070 and 2071-2100) with respect to the control climate (1971-2000).

Internal variability is an intrinsic characteristic of the climate system and it is also present in climate models. The design of the new RCM experiments took into account the intention that internal variability can be studied. This report presents some early investigations on these aspects.

In this report we review the state of the EURO-CORDEX ensemble as valid at the beginning of 2020. The report indicates the climate change findings that can be deduced with the help of the larger RCM ensemble available. Regular updates of such reports are planned and the document will be updated here.

C3S is aiming to build a EURO-CORDEX ensemble which is as complete as possible. By doing this, C3S will fill some of the missing elements of the EURO-CORDEX GCM-RCM-RCP uncertainty matrix. As we will have more simulations available (and these being complete sub-matrices, for instance), we are in a better position to assess how the full matrix can be reproduced when based on fewer available model simulations. In addition, we can determine how the missing model elements can be built. This unique study gives valuable insights into the optimal design of such ensemble systems in the future.
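
The pattern-scaling report described above derives its patterns as a linear fit between regional climate change and change in global mean temperature. The sketch below shows what such a fit might look like using ordinary least squares. The numbers are made up for illustration and are not taken from the report, the function name linear_fit is hypothetical, and whether the report's fit includes an intercept is an assumption here.

```python
def linear_fit(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two equally long series with at least 2 points")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up changes (K) relative to a control period, one value per
# future time window; purely illustrative.
global_change = [1.0, 2.0, 3.0]
regional_change = [1.5, 2.9, 4.6]

slope, intercept = linear_fit(global_change, regional_change)
print(f"regional change per K of global change: {slope:.2f}")
```

The slope is the quantity of interest: it expresses how much the regional signal changes per degree of global mean temperature change.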

References

This document has been produced in the context of the Copernicus Climate Change Service (C3S).
The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation agreement signed on 11/11/2014). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.
The user thereof uses the information at its sole risk and liability. For the avoidance of all doubts, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which merely represents the authors' view.
