Introduction

The new version of the C3S netCDF encoding standards (C3S-0.3) is an evolution of the existing encoding standards (C3S-0.2). It aims to make them more generic and permissive: some requirements can be characterized as mandatory or optional depending on operational need. In this way, the encoding standards can be easily applied to projects that are not yet operational.

...

CMIP6 Data Request: MIP variables search

ACDD convention

Change List




Date

Change

version

 

Initial version

C3S-0.1

 

Update the CF convention to CF-1.11

C3S-0.2


Correct atmosphere model to atmos in the source global attribute


Update the controlled vocabulary of the global attribute "institute_id" with the following:

"kwbc" for NCEP

"rjtd" for JMA

"cwao" for ECCC

"ammc" for BoM



Update the controlled vocabulary of the global attribute "institution" with the following:

"NCEP National Centres for Environmental Prediction"

"JMA Japan Meteorological Agency"

"ECCC, Environment and Climate Change Canada, Montreal, QC, Canada"

"BOM, Australian Bureau of Meteorology, Melbourne, Australia"



Update the controlled vocabulary of the global attribute "level_type" with the following:

"ocean2d"



Spatial coordinates have been updated in order to include additional coordinates used by the ocean products. 

The additional coordinates are: 

"sigma_theta"

"depth" and

"temperature" 



The variables have been updated to include the ocean field. 


  

Global attributes:

  • The attributes "Conventions", "source", "institute_id", "project", "creation_date", "forecast_type", "modeling_realm", "frequency", "level_type", and "forecast_reference_time" become mandatory; all other attributes become optional. Whether a global attribute is mandatory is indicated in the 'Required' column.
  • The value of the global attribute "title" was updated to describe project data. 
  • The value of the global attribute "source" was updated to describe project data. 
  • The values of the global attribute "institute_id" and "institution" were updated to include additional institutions.
  • The value of the global attribute "project" was updated to describe additional projects.
  • The value of the global attribute "forecast_type" was updated for the needs of analysis data.
  • The value of the global attribute "frequency" was updated to include 3-hourly data.
  • The value of the global attribute "level_type" was updated to describe ocean2d data. 
C3S-0.3


Spatial Coordinates: 

  • Mandatory values, where applicable, are described in the notes.
  • The spatial coordinate "depth" was updated for the needs of ocean variables. 
  • Spatial coordinates "sigma_theta" and "temperature" were added to describe ocean variables. 


Discrete Axes:

  • The coordinate variable "realization" was moved to Discrete Axes
  • The coordinate variable "vegetation_type" was added. 


Time Coordinates:

  • The coordinate variable "time" was updated for the needs of analysis data. 


Cell boundaries:

  • Mandatory values, where applicable, are described in the notes.


The encoding of data variables was moved to Appendix I. 



The candidate attributes table has been merged with the common attributes table and the attribute name is now optional






Encoding Guide for netCDF files 

File Formatting

The format of the output products should be netCDF, and conform to the CF metadata standards following the requirements below:   

  • The output files shall be written through the NetCDF API
  • The NETCDF4_CLASSIC model shall be adopted
  • The recommended compression level is deflate=6
  • Shuffling shall be enabled (shuffle=True)
  • Enabling the Fletcher32 checksum (Fletcher32=True) is strongly recommended
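
As a rough illustration of the settings listed above, the sketch below shows how they might be passed to the netCDF4-python library. The variable name "tas", the grid size and the function name are hypothetical. The `zlib`/`complevel` keywords are the long-standing spelling of the deflate options in that library (newer releases also accept `compression="zlib"`).

```python
# Keyword arguments mirroring the file-formatting requirements
# (netCDF4-python naming; treat the mapping as an assumption).
C3S_COMPRESSION = {
    "zlib": True,        # enable deflate compression
    "complevel": 6,      # recommended deflate level
    "shuffle": True,     # byte-shuffle filter
    "fletcher32": True,  # Fletcher32 checksum, strongly recommended
}


def write_example(path):
    """Write a tiny file with the settings above (contents are hypothetical)."""
    from netCDF4 import Dataset  # imported lazily; needs the netCDF4 package

    with Dataset(path, "w", format="NETCDF4_CLASSIC") as ds:
        ds.createDimension("lat", 180)
        ds.createDimension("lon", 360)
        # "tas" is only a placeholder variable name for this sketch.
        var = ds.createVariable("tas", "f4", ("lat", "lon"), **C3S_COMPRESSION)
        var[:] = 0.0
```

Keeping the options in one dictionary makes it easy to apply the same settings to every variable in a file.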

File Structure

The file structure shall be:

...



sha256sum filename.nc > filename.sha25
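
Where the `sha256sum` command is not available, the same checksum line can be produced with the Python standard library. This is a minimal sketch: the function name is hypothetical, and it assumes the checksum line should carry only the base name of the file, as in the command above.

```python
import hashlib
from pathlib import Path


def sha256_line(path, chunk_size=1 << 20):
    """Return a line in the 'digest  filename' layout written by sha256sum."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so that large netCDF files are not loaded into memory.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    # Two spaces separate the digest from the file name (text-mode output).
    return f"{digest.hexdigest()}  {Path(path).name}\n"
```

The returned string can then be written to the checksum file that accompanies the product.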


File Naming Conventions

The filenames of the products in the C3S seasonal forecast are constructed following the CMIP5/6 and SPECS DRS elements, as described below.

...

In addition, the output filename shall be constructed using a subset of metadata.

C3S Output Filename Conventions

The filenames of output files generated within C3S shall follow the convention below. All the elements are separated by underscores (“_”) and must appear in the following order:

...

lfpw_CERISE-SystemName-v20210101_hindcast_S2010110100_land_day_soil_mrlsl_r01i00p00.nc    (contribution to the CERISE project)

Global attributes

The following properties are intended to provide information about where the data came from and what has been done to it. This information is mainly for the benefit of human readers and data discovery mechanisms. The global attribute values are all character strings. When an attribute appears both globally and as a variable attribute, it is the variable’s version which has precedence.

...

Attribute Name

Value

Required 

Examples

Comment

Conventions

CF_convention_string  C3S-0.1 [Other convention] :...

Mandatory

"CF-1.11 C3S-0.3"

Multiple conventions may be included (separated by blank spaces)

title

Controlled vocabulary

"<short institution name> seasonal forecast model output prepared for C3S"

For project use:

"<short institution name> seasonal forecast model output prepared for CERISE project"

CF: Free text

ACDD (highly recommended)


Optional

"ECMWF seasonal forecast model output prepared for C3S"

"DWD seasonal forecast model output prepared for CERISE project"

A short phrase or sentence describing the dataset. In many discovery systems, the title will be displayed in the results list from a search, and therefore should be human readable and reasonable to display in a list of such names
<short institution name> is the first element of the comma-separated list of values of the corresponding "institution" attribute
references

Controlled vocabulary: 

URIs (such as a URL or DOI) for papers or other references. A valid DOI is recommended

CF: Free text

Optional

"doi:10.5194/gmd-8-1509-2015"
Published or web-based references that describe the data or methods used to produce it.
For a research project which is still under development, the attribute is optional. 
source

String containing the version of the model

<model_id>


Additional information for an advanced description of the model is highly recommended.

The following template should be followed in constructing the advanced string:

"<model_id> :  atmos: <model_name> (<technical_name>, <resolution_and_levels>); ocean: <model_name> (<technical_name>, <resolution_and_levels>); sea ice: <model_name> (<technical_name>); land: <model_name> (<technical_name>); coupler: <model_name> (<technical_name>)"

Additional explanatory information may follow the required information.

NOTE that for some models, it may not make much sense to include all these components.

The first portion of the string, “model_id”, should be built using the following template:

"project-model_name-vYYYYMMDD" where YYYYMMDD is the release date of that version of the model (the date when it was first used)

The project element is used only for project-specific data. For the C3S operational service, the project element is left empty.

Mandatory


"System8-v20210101:atmos ARPEGEv6.4.2(cy37t1,Tl359L137); ocean NEMOv3.6 (ORCA025 L75); sea-ice GELATOv6; land surface SURFEXv8.0; coupler OASIS MCT v3.0; river routing CTRIP"


"cerise-SystemName-v20240101:atmos ARPEGEv6.4.2(cy37t1,Tl359L137); ocean NEMOv3.6 (ORCA025 L75); sea-ice GELATOv6; land surface SURFEXv8.0; coupler OASIS MCT v3.0; river routing CTRIP"

The method of production of the original data. If it was model-generated, source should name the model and its version, as specifically as could be useful.

It is a character string fully identifying the model and version used to generate the output. It should include information concerning the component models.

Note that information about changes in the individual components with respect to the "official" releases should be included (e.g. a different bathymetry)

The "source" attribute should include as much information as possible to not just identify the model but to brief the user about it.

For project-specific files the model_id should provide information about the project.

institute_id

Controlled Vocabulary:

"ecmf" for ECMWF
"egrr" for Met Office
"lfpw" for Météo-France
"edzw" for DWD
"cmcc" for CMCC
"kwbc" for NCEP
"rjtd" for JMA
"cwao" for ECCC
"ammc" for BoM

Mandatory

"edzw"

Standardized 4-character identifier of the institution that produced the data;
NOTE all the values come from abbreviations of WMO/GRIB "originating centre" table, except CMCC (not available there)
institution

Controlled Vocabulary:

"ECMWF, European Centre for Medium-Range Weather Forecasts, Reading, United Kingdom"

"Met Office, Exeter, United Kingdom"

"Météo-France, Toulouse, France"

"DWD, Deutscher Wetterdienst, Offenbach, Germany"

"CMCC, Centro Euro-Mediterraneo sui Cambiamenti Climatici, Bologna, Italy"

"NCEP National Centres for Environmental Prediction"

"JMA Japan Meteorological Agency"

"ECCC, Environment and Climate Change Canada, Montreal, QC, Canada"

"BOM, Australian Bureau of Meteorology, Melbourne, Australia"

CF: Free text


Optional (highly recommended)

"Météo-France, Toulouse, France"

Specifies where the original data was produced. The name of the institution principally responsible for originating this data.

NOTE: The first element of the comma-separated list of values will be used as a shortened version of this attribute in some of the other global attributes ('summary', 'title')

contact

Controlled Vocabulary:

Copernicus User Support URI should be used
http://copernicus-support.ecmwf.int

CF: Free text

Optional

"http://copernicus-support.ecmwf.int"


optional for projects: "https://www.cerise-project.eu/"


project

Controlled Vocabulary:

"C3S Seasonal Forecast" or "<project>" should be used

CF: Free text


Mandatory


"C3S Seasonal Forecast"

"CERISE"

The attribute "project" is always mandatory; however, its value depends on whether the data belongs to the operational service or to a project.
creation_date

SPECS: YYYY-MM-DDThh:mm:ss<zone>

ISO 8601:2004 extended format



Mandatory

"2011-06-24T02:53:46Z"

The date on which this version of the data was created. Modification of values implies a new version, hence this would be assigned the date of the most recent values modification. Metadata changes are not considered when assigning the creation_date

NOTE: ACDD 1.3 names this attribute "date_created". The name "creation_date" has been used following the SPECS convention.

comment

Free text

Optional
  • "Produced by University of Hamburg for DWD at ECMWF HPC facilities"
  • "Run by CMCC at CINECA"
Miscellaneous information about the data, not captured elsewhere.
forecast_type

Controlled Vocabulary

"forecast" or "hindcast" or "analysis"

Mandatory 

"forecast"

To identify the type of data



modeling_realm

Controlled Vocabulary

"atmos", "ocean", "land", "landIce", "seaIce", "aerosol", "atmosChem", "ocnBgchem"

Mandatory


"seaIce"

A string that indicates the high-level modelling component that is particularly relevant to the variable encoded
Controlled vocabulary taken from SPECS


Value depends on the variable (see "global attributes" column in variables tables)

frequency

Controlled Vocabulary

"mon", "day", "12hr", "6hr", "3hr", "fix"

Mandatory

"day"

A string indicating the interval between individual time-samples.
Controlled vocabulary extended from SPECS.

Value depends on the variable (see "global attributes" column in variables tables)

level_type

Controlled Vocabulary

"surface", "pressure", "soil", "ocean2d"

Mandatory

"pressure"

A string indicating the type of the level where the variable comes from

Value depends on the variable (see "global attributes" column in variables tables)

history

Controlled Vocabulary

Empty string

Optional

""

To avoid this attribute being polluted by common netCDF tools, it must be set to an empty string.


commit

timestamp + URL of a commit in a version control repository

Optional

"2017-04-01T13:48:25Z https://git.ecmwf.int/projects/C3SS/repos/ecmf/System4_v20111101"

This attribute is intended to keep track of the tools/scripts used to post-process the data output from the model.

Ideally it should contain the link to a repository containing the specific set of tools and scripts needed to reproduce the same data from the model output. It is highly desirable to have that traceability information.

As a surrogate, when the above is not feasible, it should include the timestamp followed by a URL pointing to the C3S documentation repository of the corresponding model version (properly labelled with the <model_id> introduced in the "source" attribute)

summary

Controlled Vocabulary:
"Seasonal Forecast data produced by <short institution name> as its contribution to the seasonal forecast activity of the Copernicus Climate Change Service (C3S). The data has global coverage with a 1-degree horizontal resolution and spans for around 6 months since the start date"

ACDD (highly recommended)

Optional 

"Seasonal Forecast data produced by DWD as its contribution to the seasonal forecast activity of the Copernicus Climate Change Service (C3S). The data has global coverage with a 1-degree horizontal resolution and spans for around 6 months since the start date"

Optional for projects: 

"Seasonal Forecast data produced by CMCC as its contribution to the CERISE project. The data has global coverage with a 1-degree horizontal resolution and spans for around 4 months since the start date"

A short paragraph describing the dataset


<short institution name> is the first element of the comma-separated list of values of the corresponding "institution" attribute

keywords

Fixed string

"Seasonal Forecasts, C3S, ECMWF, Copernicus, Climate Change, Climate Services, Earth Science Services, Environmental Advisories, Climate Advisories"

ACDD (highly recommended)

Optional


A comma separated list of key words and phrases.

NOTE: This attribute is likely to be modified in the future, once the contents of the Thesaurus for CDS faceting are defined

forecast_reference_time

SPECS: YYYY-MM-DDThh:mm:ssZ

NOTE: This is ISO 8601:2004 extended format, but time zone is required to be UTC

Mandatory


"2011-06-01T00:00:00Z"

Time of the analysis from which the forecast was made


Introduced as a global attribute to keep compatibility with SPECS
(note that this works fine for the SPECS data structure, i.e. one variable per start time per file)


For "forecast_type"="analysis" this global attribute must be removed
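
The requirements in the table above can be checked mechanically. The sketch below is only an illustration, not an official validator. The function and constant names are hypothetical, and it covers only the mandatory list, the controlled vocabularies, the UTC format of "forecast_reference_time" and the empty "history" rule.

```python
import re

# Attributes marked "Mandatory" in the table above.
MANDATORY = (
    "Conventions", "source", "institute_id", "project", "creation_date",
    "forecast_type", "modeling_realm", "frequency", "level_type",
    "forecast_reference_time",
)

# Controlled vocabularies copied from the table above.
VOCABULARY = {
    "institute_id": {"ecmf", "egrr", "lfpw", "edzw", "cmcc",
                     "kwbc", "rjtd", "cwao", "ammc"},
    "forecast_type": {"forecast", "hindcast", "analysis"},
    "modeling_realm": {"atmos", "ocean", "land", "landIce", "seaIce",
                       "aerosol", "atmosChem", "ocnBgchem"},
    "frequency": {"mon", "day", "12hr", "6hr", "3hr", "fix"},
    "level_type": {"surface", "pressure", "soil", "ocean2d"},
}

# YYYY-MM-DDThh:mm:ssZ (UTC required for forecast_reference_time).
UTC_STAMP = re.compile(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z")


def check_global_attributes(attrs):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    is_analysis = attrs.get("forecast_type") == "analysis"
    for name in MANDATORY:
        if name == "forecast_reference_time" and is_analysis:
            continue  # must be absent for analysis data, checked below
        if name not in attrs:
            problems.append(f"missing mandatory attribute: {name}")
    if is_analysis and "forecast_reference_time" in attrs:
        problems.append("forecast_reference_time must be removed for analysis data")
    for name, allowed in VOCABULARY.items():
        if name in attrs and attrs[name] not in allowed:
            problems.append(f"{name}={attrs[name]!r} is not in the controlled vocabulary")
    frt = attrs.get("forecast_reference_time")
    if frt is not None and not UTC_STAMP.fullmatch(frt):
        problems.append("forecast_reference_time must match YYYY-MM-DDThh:mm:ssZ")
    if attrs.get("history", "") != "":
        problems.append("history must be an empty string")
    return problems
```

The function takes a plain dictionary of attribute names and values, for example one read from a file header, and returns an empty list when none of these rules is violated.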

Spatial Coordinates

The table below describes all the requirements for the spatial coordinates.

...

Note about the horizontal coordinates: The regridding procedure to provide the data on the 1-degree grid must take into account that the full definition of the grid cells is given by the cell boundaries (lat_bnds, lon_bnds)

Discrete Axes

The table below describes all the requirements for the discrete axes. 

Type

Coordinate Name

Dimension Names

Axis

standard_name

long_name

units

bounds

Controlled vocabulary

Note

char

vegetation_type

vtype

N/A

area_type

N/A

N/A

N/A


The labelled axis is used to identify the vegetation type. The names should be chosen from the list of CF area types

C3S: string

realization

str31=31

E

realization

realization

1

N/A


Members are not a physical quantity. Realization is a discrete coordinate, and the members are its categorical values (ordered or non-ordered)

SPECS approach:

rXXiYYpZZ

In the current version, the realization coordinate variable doesn't comply with the CF conventions. In future revisions the realization variable will become a discrete axis like the vegetation type
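
A realization label of the form rXXiYYpZZ can be split into its three numeric parts as sketched below. The function name is hypothetical, and two digits per part are assumed from the pattern. In CMIP5 usage the letters stand for realization, initialization method and physics version, and it is assumed here that the same reading applies.

```python
import re

# rXXiYYpZZ with two digits per element (assumed from the pattern above).
REALIZATION = re.compile(r"r(\d{2})i(\d{2})p(\d{2})")


def parse_realization(label):
    """Split a label such as 'r01i00p00' into integer r, i and p parts."""
    match = REALIZATION.fullmatch(label)
    if match is None:
        raise ValueError(f"not a valid rXXiYYpZZ label: {label!r}")
    r, i, p = (int(group) for group in match.groups())
    return {"r": r, "i": i, "p": p}
```

For the label used in the filename example earlier in this guide, `parse_realization("r01i00p00")` gives `{"r": 1, "i": 0, "p": 0}`.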

Time Coordinates

The table below describes the requirements for the Time Coordinates.

...

Warning

In the forecast and hindcast data, "leadtime" has been selected as the dimension (instead of "time") for both "time" and "leadtime". That means "leadtime" is the coordinate and "time" is an auxiliary coordinate. The main difference between the two is that "time" is a time stamp representing the valid time of the forecast, while "leadtime" is the interval of time between the forecast reference time and the valid time.

  • This diverges from SPECS (where "time" was the name of the dimension and the coordinate, and "leadtime" was an auxiliary coordinate)
  • This choice was made because
    1. both reftime and leadtime are the relevant (let's say "orthogonal") coordinates in the relationship time = reftime + leadtime
    2. it has some advantages when merging netCDF files ("leadtime" can be easily shared by different variables in a merged file, while "time" cannot)
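
The relationship time = reftime + leadtime can be illustrated with a few lines of standard-library Python. This is a sketch under assumptions: the function name is hypothetical, and lead times are taken to be expressed in hours, which the text above does not state.

```python
from datetime import datetime, timedelta, timezone


def valid_times(reference_time, leadtimes_hours):
    """Compute valid times from a reference time (YYYY-MM-DDThh:mm:ssZ)
    and a list of lead times, assumed here to be in hours."""
    ref = datetime.strptime(reference_time, "%Y-%m-%dT%H:%M:%SZ")
    ref = ref.replace(tzinfo=timezone.utc)  # the trailing Z means UTC
    return [ref + timedelta(hours=hours) for hours in leadtimes_hours]
```

With the reference time "2011-06-01T00:00:00Z" and lead times of 0, 24 and 48 hours, the valid times are 1, 2 and 3 June 2011 at 00:00 UTC. The same "leadtime" values can be reused for any start date, which is why "leadtime" is easier to share between variables in a merged file.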

Cell boundaries

The table below describes the requirements for the Cell Boundaries in accordance with section 7.1 Cell Boundaries of CF convention.

...

Bounds Name

Dimensions

Note

time_bnds

leadtime, bnds

leadtime_bnds

lat_bnds

lat, bnds

For C3S:

Values (1x1deg grid) prescribed:
[-90., -89.], [-89., -88.], ... [89., 90.]

lon_bnds

lon, bnds

For C3S:

Values (1x1deg grid) prescribed:

[0., 1.], [1., 2.], ... [359., 360.]

depth_bnds

depth, bnds

Should define the full vertical extent of the soil model layers

depth_bnds

depth, bnds

For C3S:

Values prescribed (depth=300)

[0,300]
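
The prescribed lat_bnds and lon_bnds values for the 1x1 degree C3S grid listed above can be generated rather than typed by hand. This is a minimal sketch using plain Python lists, and the function names are hypothetical.

```python
def lat_bounds():
    """Return [[-90., -89.], [-89., -88.], ..., [89., 90.]] (180 cells)."""
    return [[float(south), float(south + 1)] for south in range(-90, 90)]


def lon_bounds():
    """Return [[0., 1.], [1., 2.], ..., [359., 360.]] (360 cells)."""
    return [[float(west), float(west + 1)] for west in range(0, 360)]
```

Each inner pair gives the lower and upper edge of one cell, matching the (lat, bnds) and (lon, bnds) dimensions in the table.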

Grid mapping


Info

As described in section 5.6, Grid Mappings and Projections, of the CF convention (see quote below):

When the coordinate variables for a horizontal grid are longitude and latitude, a grid mapping variable with "grid_mapping_name" of "latitude_longitude" may be used to specify the ellipsoid and prime meridian.

...

char hcrs ;
    hcrs:grid_mapping_name = "latitude_longitude" ;

Appendices

Appendix I. Data Variables

...