Introduction

This page defines the elements required to be present in the C3S seasonal netCDF files provided by the data providers.

The tables below list the variables originally requested in the ITT. They are also informed by the data the Met Office provided in response to the C3S seasonal ITT.

Other data providers are likely to respond to the requested variables list in slightly different ways.

This work is also influenced by the ongoing work to store NEMO (CERA-20C) data in MARS and the design decisions made there regarding how to encode the NetCDF data.

These elements will form the basis for the Conventions checker which is being developed as part of Copernicus support.

In the following tables:

"Red" cells (rows) and "Finalised" = "N" mean that the element has not yet been reviewed and approved by the EQC function.

"Green" cells (rows) and "Finalised" = "Y" mean that the element is suitable for use as input to the checker.

Note that the 'Value' column gives several options, depending on what path is selected.

Checker

A checker will be supplied to check the compliance with:

  • required conventions (e.g. CF, C3S,...)
  • optional (recommended) conventions

Any other convention included will not be checked.
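As an illustration of this rule, the sketch below splits a Conventions attribute (a blank-separated list, as described under Global attributes) into the conventions a checker would look at and those it would ignore. The function name `split_conventions` and the tuple of checked prefixes are assumptions made for this example; this is not the supplied checker.

```python
# Hypothetical prefixes of the conventions a checker would examine.
CHECKED_PREFIXES = ("CF-", "C3S-")


def split_conventions(conventions):
    """Split a blank-separated Conventions string into (checked, ignored)."""
    names = conventions.split()
    checked = [n for n in names if n.startswith(CHECKED_PREFIXES)]
    ignored = [n for n in names if not n.startswith(CHECKED_PREFIXES)]
    return checked, ignored


checked, ignored = split_conventions("CF-1.6 C3S-0.1 ACDD-1.3")
print(checked)  # ['CF-1.6', 'C3S-0.1']
print(ignored)  # ['ACDD-1.3']
```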

Global attributes

 

Finalised | Attribute Name | Value | Examples | Comment
Y | Conventions | CF convention string [Other convention] :... | "CF-1.6"
"CF-1.6 C3S-0.1"

Files must pass the supplied checker

Multiple conventions may be included (separated by blank spaces)

Y | title

CF: Free text

ACDD (highly recommended)

C3S: A controlled vocabulary will be provided

"IPSL-CM5A-LR model output prepared for CMIP5 RCP4.5"

Antonio S. Cofino Gonzalez: A short phrase or sentence describing the dataset. In many discovery systems, the title will be displayed in the results list from a search, and therefore should be human readable and reasonable to display in a list of such names.
This attribute is recommended by the NetCDF Users Guide and the CF conventions

Y | references | CF: Free text
C3S: A valid doi is recommended
"doi:10.5194/gmd-8-1509-2015"

references to more information

Antonio S. Cofino Gonzalez: Published or web-based references that describe the data or methods used to produce it. This attribute is recommended in the CF conventions.

Recommended values are URIs (such as a URL or DOI) for papers or other references.
The validity of the doi values will be checked. Warnings will be raised for any other content.

N | source | Free text
  • "model-generated, GloSea5-GC2"
  • "IPSL-CM5A-LR (2010) : atmos : LMDZ4 (LMDZ4_v5, 96x95x39); ocean : ORCA2 (NEMOV2_3, 2x2L31); seaIce : LIM2 (NEMOV2_3); ocnBgchem : PISCES (NEMOV2_3); land : ORCHIDEE (orchidee_1_9_4_AR5)"

Where the data is from

Antonio S. Cofino Gonzalez: The method of production of the original data. If it was model-generated, source should name the model and its version. If it is observational, source should characterize it. This attribute is defined in the CF Conventions. Examples: 'temperature from CTD #1234'; 'world model v.0.1'.

Francisco Doblas-Reyes I'm still missing information about the initialisation: what set of initial conditions was used, whether perturbations were used, etc

Eduardo Penabad: Can you (Francisco Doblas-Reyes) suggest how to do this?

Y | institution

CF: Free text

C3S: A controlled vocabulary will be provided

"Met Office"

Francisco Doblas-Reyes I'm not convinced "Met Office" is descriptive enough. Shouldn't it be something like "C3S"?

Unknown User (brookshaw): "institution" is the origin of the data. Things like C3S will be specified elsewhere (e.g. project)

Who produced the data

Antonio S. Cofino Gonzalez: The name of the institution principally responsible for originating this data.
This attribute is recommended by the CF convention.

Y | contact

CF: Free text

C3S: Copernicus User Support URI should be used
http://copernicus-support.ecmwf.int

 

"http://copernicus-support.ecmwf.int"

Antonio S. Cofino Gonzalez: The CMIP5/SPECS DRS define this attribute. The ACDD defines: creator_name, creator_email, creator_url, publisher_name, publisher_email, publisher_url

Y | project

CF: Free text

C3S: "C3S Seasonal Forecast" is required, additional project info (e.g. ITT details) is optional

"C3S Seasonal Forecast"

Antonio S. Cofino Gonzalez: The name of the project(s) principally responsible for originating this data. Multiple projects can be separated by commas, as described under Attribute Content Guidelines. Examples: 'PATMOS-X', 'Extended Continental Shelf Project'.

Y | creation_date

SPECS: YYYY-MM-DDThh:mm:ss<zone>

NOTE: This is ISO 8601:2004 extended format

"2011-06-24T02:53:46Z"

Antonio S. Cofino Gonzalez: The date on which this version of the data was created. (Modification of values implies a new version, hence this would be assigned the date of the most recent values modification.) Metadata changes are not considered when assigning the creation_date. The ISO 8601:2004 extended date format is recommended.  

ACDD 1.3 names this attribute date_created.
SPECS conventions will be followed.
 

N | comment | Free text | "This 21th century simulation has been forced by prescribed concentration following the rcp 4.5 scenario."

Antonio S. Cofino Gonzalez: Miscellaneous information about the data, not captured elsewhere. This attribute is defined in the CF Conventions.

Y | forecast_type

C3S: Controlled Vocabulary

"forecast" or "hindcast"

"forecast" | To identify the type of data
N | history | Free Text
  • "Produced using CDS Toolbox on 1/6/2016"
  • "Model raw output postprocessing with modelling environment (IMDI) at DKRZ: URL: http://svn-mad.zmaw.de/svn/mad/Model/IMDI/trunk, REV: 3436 2011-07-17T15:14:45Z CMOR rewrote data to comply with CF standards and CMIP5 requirements."

To record relevant information, such as the command history which led to this file being produced

Francisco Doblas-Reyes I understand that the versioning of the software used to create the data can be included either here or in the commit attribute. Any idea how/when to make the decision?

 

N | commit, iso_lineage or lineage | Free text (ISO Lineage model 19115-2) | "Produced using CDS Toolbox v1.0"

trace of the tools/scripts used

Antonio S. Cofino Gonzalez: We need more implementation examples for this. This could be achieved in the EQC WP, where metadata is part of their activities (i.e. WP4@QA4SEAS). ISO 19115-2 defines a lineage model where this is considered. TBD.

Y | summary | ACDD (highly recommended): Text, defined phrase
C3S: The content will be provided

A short paragraph describing the dataset

Y | keywords

ACDD (highly recommended): text, controlled vocabulary

C3S: The content will be provided

A comma-separated list of key words and phrases.

Y | forecast_reference_time

SPECS: YYYY-MM-DDThh:mm:ssZ

NOTE: This is ISO 8601:2004 extended format

"2011-06-01T00:00:00Z"

time of the analysis from which the forecast was made

UTC zone is required

Question: Do we need controlled vocabularies for the above?
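A minimal sketch of how some of the attribute rules in the table above could be tested is given below. It only covers creation_date, forecast_reference_time, forecast_type and project, and it checks the layout of the date strings rather than whether they are real calendar dates. The function name `check_global_attributes` and the regular expression are assumptions for illustration, not part of the supplied checker.

```python
import re

# Layout YYYY-MM-DDThh:mm:ss<zone>, where <zone> is "Z" or a +hh:mm / -hh:mm offset.
# This checks the shape of the string only, not that the date exists.
ISO_EXTENDED = re.compile(r"^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}(Z|[+-]\d{2}:\d{2})$")
FORECAST_TYPES = {"forecast", "hindcast"}


def check_global_attributes(attrs):
    """Return a list of problems found in a dict of global attributes."""
    problems = []
    if not ISO_EXTENDED.match(attrs.get("creation_date", "")):
        problems.append("creation_date is not YYYY-MM-DDThh:mm:ss<zone>")
    reference = attrs.get("forecast_reference_time", "")
    # The table requires the UTC zone ("Z") for forecast_reference_time.
    if not (ISO_EXTENDED.match(reference) and reference.endswith("Z")):
        problems.append("forecast_reference_time is not YYYY-MM-DDThh:mm:ssZ")
    if attrs.get("forecast_type") not in FORECAST_TYPES:
        problems.append('forecast_type is not "forecast" or "hindcast"')
    if "C3S Seasonal Forecast" not in attrs.get("project", ""):
        problems.append('project does not contain "C3S Seasonal Forecast"')
    return problems


example = {
    "creation_date": "2011-06-24T02:53:46Z",
    "forecast_reference_time": "2011-06-01T00:00:00Z",
    "forecast_type": "forecast",
    "project": "C3S Seasonal Forecast",
}
print(check_global_attributes(example))  # []
```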

Spatial Coordinates

Finalised | Type (CMIP5) | Coordinate Name (CMIP5) | Dimension Names (CMIP5) | Axis | standard_name | long_name (CMIP5) | units (CF canonical units) | positive | valid_min (CMIP5) | valid_max (CMIP5) | Notes
Y | double | lat | lat | Y | latitude | latitude | degrees_north | N/A | -90. | 90. | Bounds required. Values (1x1deg grid) required.
Y | double | lon | lon | X | longitude | longitude | degrees_east | N/A | 0. | 360. | Bounds required. Values (1x1deg grid) required.
Y | double | plev | plev | Z | air_pressure | pressure | Pa | down | N/A | N/A | This is also referred to as isobaric level by some tools. Value (plevels) required (ITT).
Y | double | depth | depth | Z | depth | depth | m | down | N/A | N/A | Used for soil model levels. NOTE: Number and depth of levels is not prescribed by C3S.
Y | double | height | height | Z | height | height | m | up | CMIP5: 2mtemp: 1.; 10mu/v: 1. | CMIP5: 2mtemp: 10.; 10mu/v: 30. | Can be used for single level (height, soil), e.g. 2 m (for Temperature).
N | C3S: string. ISSUE: it then needs an auxiliary coordinate (string len) | realization | C3S: realization_dim. CF: a different name is needed for dim/variable (see comment) | N/A | realization | realization | 1 | N/A | N/A | N/A | Members are not a physical quantity. Realization is a discrete axis and the members are its categorical values (ordered or non-ordered).

NOTE:

A dimension named "bounds" is also required for 'extensive' quantities.
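To make the lat/lon requirements concrete, the sketch below builds cell bounds for a regular 1x1 degree grid and checks them against the valid_min/valid_max values in the table. The choice of cell centres (-89.5, -88.5, ..., 89.5) is an assumption for illustration only, since the actual grid points are not specified here; the helper names are hypothetical.

```python
def cell_bounds(centres, step=1.0):
    """Return [lower, upper] bounds for each cell centre on a regular grid."""
    half = step / 2.0
    return [[c - half, c + half] for c in centres]


def within_valid_range(values, valid_min, valid_max):
    """True if every value lies inside [valid_min, valid_max]."""
    return all(valid_min <= v <= valid_max for v in values)


# Assumed 1x1 degree latitude centres; the real grid points are not prescribed.
lat = [-89.5 + i for i in range(180)]
lat_bounds = cell_bounds(lat)  # shape (180, 2): the "bounds" dimension has size 2

print(lat_bounds[0], lat_bounds[-1])  # [-90.0, -89.0] [89.0, 90.0]
print(within_valid_range(lat, -90.0, 90.0))  # True
```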

Time Coordinates

Finalised | Coordinate Name | Dimension Names | Axis | standard_name | long_name (SPECS) | calendar | units | positive | Notes
Y | leadtime | time | N/A | forecast_period | "Time elapsed since the start of the forecast" | N/A | SPECS: days. C3S: requested units can be relaxed to equivalent time units | N/A |
Y | time | time | T | time | "Verification time of the forecast" | gregorian | SPECS: "days since 1850-01-01". C3S: requested units can be relaxed to equivalent time units | N/A | Valid time of data
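The relation between the two coordinates can be sketched as follows: the verification time is the forecast reference time plus the lead time, expressed in the SPECS units "days since 1850-01-01". The function names are hypothetical. Python's datetime uses the proleptic Gregorian calendar, which agrees with the "gregorian" calendar for the dates used here.

```python
from datetime import datetime, timedelta, timezone

# Reference date of the SPECS units "days since 1850-01-01" (assumed 00:00 UTC).
EPOCH = datetime(1850, 1, 1, tzinfo=timezone.utc)


def verification_time(reference, leadtime_days):
    """Valid time of the data: forecast reference time plus lead time."""
    return reference + timedelta(days=leadtime_days)


def to_days_since_epoch(moment):
    """Encode a datetime as a 'days since 1850-01-01' value."""
    return (moment - EPOCH) / timedelta(days=1)


reference = datetime(2011, 6, 1, tzinfo=timezone.utc)  # "2011-06-01T00:00:00Z"
valid = verification_time(reference, 1.5)              # leadtime of 1.5 days
print(valid.isoformat())  # 2011-06-02T12:00:00+00:00
print(to_days_since_epoch(valid))
```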


 

Coordinate Bounds

Finalised | Bounds Name | Dimensions | Values | Notes
N | time_bounds | time, bounds

dimension bounds = 2

Maybe have "bounds" in a separate table/comment, explaining that they have the same units as the variable "bounded" by them. In addition, I would find it clearer if the time variable value were always at the end of the corresponding time_bounds interval, as this works well for both instantaneous and accumulated/aggregated variables (e.g. time=20160922 06, time_bounds = [20160922 00, 20160922 06])

for 24h freqs.

2 values with the same units as "time" coordinate
[0,24]

intervals must represent 24 hours

starting at 0Z
(is this a convention? WMO?)

N | lat_bounds | lat, bounds | dimension bounds = 2 for grid
N | lon_bounds | lon, bounds | dimension bounds = 2 for grid
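The suggestion in the time_bounds notes (time value at the end of its interval, 24-hour intervals starting at 0Z) could look like the sketch below. It assumes time is encoded in days since a reference date at 00:00 UTC, so whole numbers fall at 0Z. This is one possible reading of a point that is still under discussion, and the helper names are hypothetical.

```python
def daily_time_bounds(time_values_days):
    """24 h bounds with each time value placed at the end of its interval."""
    return [[t - 1.0, t] for t in time_values_days]


def intervals_are_daily_from_00z(bounds):
    """True if every interval spans exactly one day and starts at 0Z."""
    return all(
        upper - lower == 1.0 and float(lower).is_integer()
        for lower, upper in bounds
    )


time_values = [58957.0, 58958.0]  # hypothetical values in days since 1850-01-01
bounds = daily_time_bounds(time_values)
print(bounds)  # [[58956.0, 58957.0], [58957.0, 58958.0]]
print(intervals_are_daily_from_00z(bounds))  # True
```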



Ensemble coordinate

TBD

Pressure level coordinates

These are specified as being:

925, 850, 700, 500, 400, 300, 200, 100, 50, 30 and 10 hPa
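Because the plev coordinate uses Pa as units while the levels above are listed in hPa, the coordinate values need a factor of 100. A small sketch, with variable names chosen for illustration:

```python
# Requested pressure levels in hPa, as listed above.
PRESSURE_LEVELS_HPA = [925, 850, 700, 500, 400, 300, 200, 100, 50, 30, 10]

# The plev coordinate is in Pa: 1 hPa = 100 Pa.
plev = [float(level * 100) for level in PRESSURE_LEVELS_HPA]

print(plev[0], plev[-1])  # 92500.0 1000.0
```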

Invariant Fields

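Column groups: requested variables (step, Parameter Identifier, Originating Centre, name); Our Convention in netCDF files (standard_name onwards).

Finalised | Priority (i.e. should be defined first for MARS) | step | Parameter Identifier (as used in ITT) | Originating Centre | name | standard_name | units (as used in ITT) | comments
N | 2 | 0 h | 172 | 98 | land-sea mask | land_area_fraction | 0-1 |
N | 2 | 0 h | 129 | 98 | geopotential | surface_altitude | m2 s-2 |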

 

Surface Fields (defined at a given height level)

Column groups: requested variables, ITT table (step, Parameter Identifier, Originating Centre, name); C3S NetCDF Convention (name (CMIP5) onwards). Coordinate and Bounds are sub-columns of Coordinates.

Finalised | Priority (i.e. should be defined first for MARS) | step | Parameter Identifier | Originating Centre | name | name (CMIP5) | standard_name | units | dimensions | Cell Methods | Coordinate | Bounds | comments
Y | 1 | 6 h inst | 167 | 98 | 2m temperature | tas | air_temperature | K | time,lat,lon | "time: point" (CF: recommended; C3S: required) | "height" | | C3S: Just 2m and 1.5m will be valid values for the height coordinate of this variable. CF: If the variable is instantaneous it shouldn't have time_bounds.
Y | 1 | 24 h inst. | 51 | 98 | max 2m temperature (last 24h) | tasmax | air_temperature | K | time,lat,lon | "time: maximum (interval: value unit)" (CF: interval is optional; C3S: interval is required, units in UDUNITS) | "height" | time_bounds | C3S: Just 2m and 1.5m will be valid values for the height coordinate of this variable. C3S: The interval is required to have a value <= 3 hours.
Y | 1 | 24 h inst. | 52 | 98 | min 2m temperature (last 24h) | tasmin | air_temperature | K | time,lat,lon | "time: minimum (interval: value unit)" (CF: interval is optional; C3S: interval is required, units in UDUNITS) | "height" | | C3S: Just 2m and 1.5m will be valid values for the height coordinate of this variable. C3S: The interval is required to have a value <= 3 hours.
N | 2 | 6 h inst | 168 | 98 | 2m dewpoint temperature | | dew_point_temperature | K | | | scalar, value=2, unit=m | | MetOffice's temperatures are at 1.5m
N | 2 | 6 h inst | 165 | 98 | 10m U wind component | | x_wind | m s-1 | | | scalar, value=10, unit=m | |
N | 2 | 6 h inst | 166 | 98 | 10m V wind component | | y_wind | m s-1 | | | scalar, value=10, unit=m | |
N | 2 | 24 h inst | 49 | 98 | 10m max wind gust | | wind_speed | m s-1 | | | scalar, value=10, unit=m | |
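The C3S requirement that the cell_methods interval be present and no longer than 3 hours could be tested along the lines of the sketch below. A real checker would presumably rely on UDUNITS for unit handling; the small unit table, the regular expression and the function names here are assumptions made for illustration.

```python
import re

# Matches strings such as "time: maximum (interval: 3 hours)".
CELL_METHOD = re.compile(
    r"^time: (?P<method>\w+) \(interval: (?P<value>[\d.]+) ?(?P<unit>\w+)\)$"
)

# Tiny stand-in for UDUNITS: only a few spellings are recognised.
HOURS_PER_UNIT = {"hour": 1.0, "hours": 1.0, "h": 1.0, "day": 24.0, "days": 24.0}


def interval_in_hours(cell_methods):
    """Return the interval length in hours, or None if it cannot be read."""
    match = CELL_METHOD.match(cell_methods)
    if match is None or match.group("unit") not in HOURS_PER_UNIT:
        return None
    return float(match.group("value")) * HOURS_PER_UNIT[match.group("unit")]


def interval_allowed(cell_methods, max_hours=3.0):
    """True if an interval is given and does not exceed max_hours."""
    hours = interval_in_hours(cell_methods)
    return hours is not None and hours <= max_hours


print(interval_allowed("time: maximum (interval: 3 hours)"))  # True
print(interval_allowed("time: maximum (interval: 6 hours)"))  # False
print(interval_allowed("time: point"))                        # False
```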

  
              

 

Surface Fields (not defined at a height level)

Column groups: requested variables (step, Parameter Identifier, Originating Centre, name); Our Convention in netCDF files (standard_name onwards).

Finalised | Priority (i.e. should be defined first for MARS) | step | Parameter Identifier (as used in ITT) | Originating Centre | name | standard_name | units (as used in ITT) | Cell Methods | time_bounds | comments
N | 1 | 6 h inst | 151 | 98 | mean sea level pressure | air_pressure_at_sea_level | Pa | | intervals must represent 6 hours |
N | 2 | 6 h inst | 164 | 98 | total cloud cover | cloud_area_fraction_assuming_maximum_random_overlap | 1 | | intervals must represent 6 hours |
N | 2 | 6 h inst | 235 | 98 | skin temperature | surface_temperature | K | | intervals must represent 6 hours | "skin_temperature" doesn't exist as a CF standard_name, so maybe the required one should be "surface_temperature"
N | 2 | 24 h inst | 31 | 98 | sea-ice cover | sea_ice_area_fraction | 1 | | Intervals must represent 24 hours starting at 0Z (to be agreed) |
N | 1 | 24 h inst | 34 | 98 | sea surface temperature | open_sea_surface_temperature | K | | Intervals must represent 24 hours starting at 0Z (to be agreed) | To cope with the fact that some providers send instantaneous 00UTC values and some others daily averages, it was agreed as a compromise to request 6h instantaneous SST values (so the value at 00h would be the same for everyone, and a daily average to account for the diurnal cycle could be obtained from the 6h values)
N | 2 | 24 h inst | 141 | 98 | snow depth (water equivalent) | lwe_thickness_of_surface_snow_amount | m | | Intervals must represent 24 hours starting at 0Z (to be agreed) | Note it is snow amount, not snowfall amount. A check is needed whether this should be an average or an instantaneous value.
N | 2 | 24 h inst | 33 | 98 | snow density | snow_density | kg m-3 | | Intervals must represent 24 hours starting at 0Z (to be agreed) | A check is needed whether this should be an average or an instantaneous value
N | 2 | 24 h inst | 243 | 98 | forecast albedo | surface_albedo | 1 | | Intervals must represent 24 hours starting at 0Z (to be agreed) | Don't know how the cell_methods could be coded for this variable if it is obtained from ratios of daily accumulations of shortwave radiation.
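The SST compromise noted in the table (6-hourly instantaneous values from which a daily average can be derived) amounts to a simple mean over the four values of each day. A minimal sketch, with a hypothetical function name and made-up values:

```python
def daily_means(six_hourly, per_day=4):
    """Average consecutive groups of 6-hourly values (00, 06, 12, 18 UTC)."""
    if len(six_hourly) % per_day != 0:
        raise ValueError("expected a whole number of days of 6-hourly values")
    return [
        sum(six_hourly[i:i + per_day]) / per_day
        for i in range(0, len(six_hourly), per_day)
    ]


# Made-up SST values in K for two days at one grid point.
sst = [290.0, 291.0, 292.0, 291.0, 289.0, 290.0, 291.0, 290.0]
print(daily_means(sst))  # [291.0, 290.0]
```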

 

Soil Level Fields

Column groups: requested variables (step, Parameter Identifier, Originating Centre, name); Our Convention in netCDF files (standard_name onwards).

Finalised | Priority (i.e. should be defined first for MARS) | step | Parameter Identifier (as used in ITT) | Originating Centre | name | standard_name | units (as used in ITT) | Cell Methods | time_bounds | soil model layer (level) number | comments
N | 1 | 24 h inst | 39 | 98 | volum. soil moisture layer 1 | moisture_content_of_soil_layer | m3 m-3 | | intervals must represent 24 hours starting at 0Z (to be agreed) | scalar, value=1 | The number of soil levels shouldn't be prescribed (they will likely differ from model to model) and the vertical coordinate for them should be the height (with bounds) of each layer. In addition, there was an issue with the units; Anca should have the final conclusion about that.
N | 1 | 24 h inst | 40 | 98 | volum. soil moisture layer 2 | moisture_content_of_soil_layer | m3 m-3 | | intervals must represent 24 hours starting at 0Z (to be agreed) | scalar, value=2 | see above
N | 2 | 24 h inst | 41 | 98 | volum. soil moisture layer 3 | moisture_content_of_soil_layer | m3 m-3 | | intervals must represent 24 hours starting at 0Z (to be agreed) | scalar, value=3 | see above
N | 2 | 24 h inst | 42 | 98 | volum. soil moisture layer 4 | moisture_content_of_soil_layer | m3 m-3 | | intervals must represent 24 hours starting at 0Z (to be agreed) | scalar, value=4 | see above
N | 2 | 24 h inst | 139 | 98 | soil temperature level 1 | soil_temperature | K | | intervals must represent 24 hours starting at 0Z (to be agreed) | scalar, value=1 |

 

Accumulation Fields

Column groups: requested variables (step, Parameter Identifier, Originating Centre, name); Our Convention in netCDF files (standard_name onwards).

Finalised | Priority (i.e. should be defined first for MARS) | step | Parameter Identifier (as used in ITT), ParamID | Originating Centre | name | standard_name | units (as used in ITT) | Cell Methods | time_bounds | comments
N | 1 | 24 h | 228 | 98 | total precipitation | precipitation_amount | m | time: sum (interval: 1hour) | intervals must represent 24 hours starting at 0Z (to be agreed) | Is the "interval" needed in cell_methods with "time: sum"?
N | 2 | 24 h | 144 | 98 | snowfall | lwe_thickness_of_snowfall_amount | m | time: sum (interval: 1hour) | intervals must represent 24 hours starting at 0Z (to be agreed) | as above
N | 2 | 24 h | 146 | 98 | surface sensible heat flux | surface_upward_sensible_heat_flux. May request integral_of_XXXXX_wrt_time instead (diff units). NOTE: this one is not in the CF standard names list, but the "downward" version is | J m-2 | TBD | intervals must represent 24 hours starting at 0Z (to be agreed) | If we are not going to request accumulations since the beginning of the forecast, maybe it is more natural for the providers to send daily averaged values (which affects the standard name, integral_of_XXXXX_wrt_time, and hence the units)
N | 2 | 24 h | 147 | 98 | surface latent heat flux | surface_upward_latent_heat_flux. May request integral_of_XXXXX_wrt_time instead (diff units). NOTE: this one is not in the CF standard names list, but the "downward" version is | J m-2 | TBD | intervals must represent 24 hours starting at 0Z (to be agreed) | as above
N | 2 | 24 h | 169 | 98 | surface solar radiation downwards | surface_downwelling_shortwave_flux_in_air. May request integral_of_XXXXX_wrt_time instead (diff units) | J m-2 | TBD | intervals must represent 24 hours starting at 0Z (to be agreed) | as above
N | 2 | 24 h | 175 | 98 | surface thermal radiation downwards | surface_downwelling_longwave_flux. May request integral_of_XXXXX_wrt_time instead (diff units) | J m-2 | TBD | intervals must represent 24 hours starting at 0Z (to be agreed) | as above
N | 2 | 24 h | 176 | 98 | surface net solar radiation | surface_net_downward_shortwave_flux. May request integral_of_XXXXX_wrt_time instead (diff units) | J m-2 | TBD | intervals must represent 24 hours starting at 0Z (to be agreed) | as above
N | 2 | 24 h | 177 | 98 | surface net thermal radiation | surface_net_downward_longwave_flux. May request integral_of_XXXXX_wrt_time instead (diff units) | J m-2 | TBD | intervals must represent 24 hours starting at 0Z (to be agreed) | as above
N | 2 | 24 h | 178 | 98 | top solar radiation | toa_incoming_shortwave_flux. May request integral_of_XXXXX_wrt_time instead (diff units). NOTE: this one is not in the CF standard names list | J m-2 | TBD | intervals must represent 24 hours starting at 0Z (to be agreed) | as above
N | 2 | 24 h | 179 | 98 | top thermal radiation | toa_outgoing_longwave_flux. May request integral_of_XXXXX_wrt_time instead (diff units) | J m-2 | TBD | intervals must represent 24 hours starting at 0Z (to be agreed) | as above
N | 2 | 24 h | 180 | 98 | east-west surface stress | surface_downward_eastward_stress. May request integral_of_XXXXX_wrt_time instead (diff units) | (N m-2) s | TBD | intervals must represent 24 hours starting at 0Z (to be agreed) | as above
N | 2 | 24 h | 181 | 98 | north-south surface stress | surface_downward_northward_stress. May request integral_of_XXXXX_wrt_time instead (diff units) | (N m-2) s | TBD | intervals must represent 24 hours starting at 0Z (to be agreed) | as above
N | 2 | 24 h | 182 | 98 | evaporation | lwe_thickness_of_water_evaporation_amount | m | time: sum (interval: 1hour) | intervals must represent 24 hours starting at 0Z (to be agreed) |
N | 2 | 24 h | 205 | 98 | runoff | runoff_amount | m | time: sum (interval: 1hour) | intervals must represent 24 hours starting at 0Z (to be agreed) |
N | 2 | 24 h | 8 | 98 | surface runoff | surface_runoff_amount | m | time: sum (interval: 1hour) | intervals must represent 24 hours starting at 0Z (to be agreed) |
N | 2 | 24 h | 9 | 98 | sub-surface runoff | subsurface_runoff_amount | m | time: sum (interval: 1hour) | intervals must represent 24 hours starting at 0Z (to be agreed) |
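The comments above contrast accumulated quantities (J m-2, the integral of a flux with respect to time) with daily averaged fluxes (W m-2). The arithmetic linking the two for a 24-hour interval is a division by the interval length in seconds, sketched below with a hypothetical function name and a made-up value.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400 s


def mean_flux_w_m2(accumulated_j_m2, interval_seconds=SECONDS_PER_DAY):
    """Convert an accumulation in J m-2 over an interval to a mean flux in W m-2."""
    return accumulated_j_m2 / interval_seconds


# Made-up daily accumulation of 8.64e6 J m-2.
print(mean_flux_w_m2(8_640_000.0))  # 100.0
```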

 

 

Pressure Level Fields

 

Column groups: requested variables (step, Parameter Identifier, Originating Centre, name); Our Convention in netCDF files (standard_name onwards).

Finalised | Priority (i.e. should be defined first for MARS) | step | Parameter Identifier (as used in ITT) | Originating Centre | name | standard_name | units (as used in ITT) | Cell Methods | time_bounds | comments
N | 1 | 12 h inst | 129 | 98 | geopotential | geopotential | m2/s2 | | intervals must represent 12 hours | Alternative is "geopotential_height" in m
N | 2 | 12 h inst | 130 | 98 | temperature | air_temperature | K | | intervals must represent 12 hours |
N | 2 | 12 h inst | 133 | 98 | specific humidity | specific_humidity | 1 | | intervals must represent 12 hours |
N | 2 | 12 h inst | 131 | 98 | U component of wind | x_wind | m/s | | intervals must represent 12 hours |
N | 2 | 12 h inst | 132 | 98 | V component of wind | y_wind | m/s | | intervals must represent 12 hours |


 

 

Additional Questions to be addressed

Question | Discussion | Decision
File format to be used? | Francisco Doblas-Reyes: NetCDF4? With or without compression? Kevin Marsh: netCDF4 classic model (with deflate=6 suggested by Pierre-Antoine) |
File naming | Kevin Marsh: Pierre-Antoine Bretonniere proposed following the SPECS convention |
forecast/hindcast matching and labelling | |
File size recommendation (maximum size)? | Kevin Marsh: Pierre-Antoine Bretonniere suggested 4GB recommended maximum size | Kevin Marsh: recommend 4GB max size for data files
Versioning of data files? | |
DOI | Kevin Marsh: DOI likely to be assigned at dataset level | Kevin Marsh: DOI likely to be assigned at dataset level
Variable short names to be specified? | Kevin Marsh: Antonio S. Cofino Gonzalez suggested following CMIP5 short names | Kevin Marsh: follow CMIP5 short names
Coordinate short names to be specified? | Kevin Marsh: Antonio S. Cofino Gonzalez suggested following CMIP5 coordinate short names | Kevin Marsh: follow CMIP5 coordinate short names
Extension to include ocean data for C3S? | Kevin Marsh: yes, but not in the initial convention release | Kevin Marsh: not considered in initial release
Grids, resolution etc to be specified? | Kevin Marsh: Antonio S. Cofino Gonzalez agreed 1 degree grid specified with valid max/min, but actual grid points not specified | Kevin Marsh: 1 degree grid specified with valid max/min, but actual grid points not specified
MARS attributes to be specified? | Kevin Marsh: These will be added by C3S, rather than data provider | Kevin Marsh: These will be added by C3S
Standard name request/assignment process? | Kevin Marsh: requested via standard name mailing list. Note that this process can take some considerable time. | Kevin Marsh: requested via standard name mailing list

 

Discussion about time coordinates

NOTE: The SPECS approach (2 1D time coordinates) has been chosen for the "providers" convention

...

The encoding of multiple time coordinates requires particular consideration. An explicit example of the structure is given below.

Example of encoding data with multiple time axes

  
double forecast_reference_time(forecast_reference_time) ;
       forecast_reference_time:bounds = "forecast_reference_time_bnds" ;
       forecast_reference_time:units = "hours since 1970-01-01 00:00:00" ;
       forecast_reference_time:standard_name = "forecast_reference_time" ;
       forecast_reference_time:calendar = "gregorian" ;

...