File Formatting

  • The output files are written through the NetCDF API
  • The NETCDF4_CLASSIC model will be adopted
  • Recommended compression level deflate=6
  • Shuffling=True
  • Fletcher32=True is strongly recommended
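
As an illustration only, the settings listed above could be passed to the netCDF4-python library roughly as follows. The choice of netCDF4-python, the helper name create_compressed_variable and the constant names are assumptions made for this sketch; the convention itself only prescribes the settings.

```python
# Settings taken from the list above; the constant names are hypothetical.
FILE_FORMAT = "NETCDF4_CLASSIC"
COMPRESSION = {
    "zlib": True,        # enable deflate compression
    "complevel": 6,      # recommended deflate level
    "shuffle": True,     # byte shuffling before compression
    "fletcher32": True,  # checksum, strongly recommended
}


def create_compressed_variable(path, name, dimensions, datatype="f4"):
    """Create a file holding one variable written with the settings above.

    `dimensions` maps dimension names to sizes. The third-party netCDF4
    package is imported here so the constants can be reused without it.
    """
    import netCDF4

    with netCDF4.Dataset(path, "w", format=FILE_FORMAT) as dataset:
        for dim_name, size in dimensions.items():
            dataset.createDimension(dim_name, size)
        dataset.createVariable(name, datatype, tuple(dimensions), **COMPRESSION)
```

Keeping the settings in one dictionary means every variable in a file is written with the same compression options.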

File Structure

  • Each netCDF4 file contains a single output variable (along with coordinate/grid variables, attributes and other metadata) from a single model and a single simulation (i.e., from a single ensemble member and a single start date)
  • Recommended maximum file size of 4GB
  • For each data file, a companion file containing its hash, created with sha256sum, should be provided

    Create hash files (bash):

    sha256sum filename.nc > filename.sha256
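
Where sha256sum is not available, an equivalent hash file could be produced from Python. This is a minimal sketch using only the standard library; the function name write_sha256_file and the chunk size are assumptions, and the output is meant to mimic the "digest, two spaces, file name" layout written by the command above.

```python
import hashlib
from pathlib import Path


def write_sha256_file(data_path, chunk_size=1024 * 1024):
    """Write `<stem>.sha256` next to `data_path` and return its path."""
    data_path = Path(data_path)
    digest = hashlib.sha256()
    with data_path.open("rb") as stream:
        # Read in chunks so that files of several GB are not loaded in memory.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    hash_path = data_path.with_suffix(".sha256")
    # Same layout as sha256sum in text mode: digest, two spaces, file name.
    hash_path.write_text(f"{digest.hexdigest()}  {data_path.name}\n")
    return hash_path
```

Calling write_sha256_file("filename.nc") would create filename.sha256 in the same directory.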


File Naming

<institute_id>_<model_id_tag>_<forecast_type>_<start_date_identifier>_<modeling_realm>_<frequency>_<level_type>_<variable_name>_<ensemble_member>.nc


<model_id_tag> as it is defined in the description of the "source" global attribute
<institute_id>, <forecast_type>, <modeling_realm>, <frequency> and <level_type> coming from the global attributes of the same name

<start_date_identifier> being a string "SYYYYMMDDHH"
<variable_name> from the netCDF name of the variable (short name)
<ensemble_member> from the 'realization' coordinate value


NOTE: It should be possible to rebuild the file name from the contents of the file
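
The naming rule can be sketched as a small function that rebuilds the name from values read out of a file. Everything here is illustrative: the function name build_filename, the way the values are passed in, and the sample values are hypothetical. The sketch also assumes that "SYYYYMMDDHH" means a literal "S" followed by the start date and hour, and that the realization value is written as-is.

```python
from datetime import datetime


def build_filename(global_attributes, model_id_tag, start_date,
                   variable_name, realization):
    """Assemble the file name from metadata found in the file itself."""
    # Assumption: literal "S" followed by year, month, day and hour.
    start_date_identifier = f"S{start_date:%Y%m%d%H}"
    parts = [
        global_attributes["institute_id"],
        model_id_tag,
        global_attributes["forecast_type"],
        start_date_identifier,
        global_attributes["modeling_realm"],
        global_attributes["frequency"],
        global_attributes["level_type"],
        variable_name,
        str(realization),
    ]
    return "_".join(parts) + ".nc"


# Hypothetical values, for illustration only.
example_attributes = {
    "institute_id": "inst",
    "forecast_type": "forecast",
    "modeling_realm": "atmos",
    "frequency": "mon",
    "level_type": "pressure",
}
example_name = build_filename(
    example_attributes, "model1", datetime(2015, 11, 1, 0), "ta", 3
)
```

Because every component comes from the file contents, the same function can be used to check that an existing file name matches its metadata.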

Metadata

Additional Questions to be addressed

...

Francisco Doblas-Reyes NetCDF4? With or without compression?
Kevin Marsh netCDF4 classic model (with deflate=6 suggested by Pierre-Antoine)

...

Kevin Marsh Pierre-Antoine Bretonniere proposed following the SPECS convention

...

Kevin Marsh Pierre-Antoine Bretonniere suggested 4GB recommended maximum size

...

Kevin Marsh recommend 4GB Max Size for data files

...

Kevin Marsh DOI likely to be assigned at dataset level

...

Kevin Marsh Antonio S. Cofino Gonzalez suggested following CMIP5 short names

...

Kevin Marsh follow CMIP5 short names

...

Coordinate short names to be specified?

...

Kevin Marsh Antonio S. Cofino Gonzalez suggested following CMIP5 coordinate short names

...

Kevin Marsh follow CMIP5 coordinate short names

...

Kevin Marsh yes, but not in the initial convention release

...

Kevin Marsh Not considered in initial release

...

Kevin Marsh Antonio S. Cofino Gonzalez agreed 1 degree grid specified with valid max/min, but actual grid points not specified

...

Kevin Marsh 1 degree grid specified with valid max/min, but actual grid points not specified

...

Kevin Marsh These will be added by C3S, rather than data provider

...

Kevin Marsh These will be added by C3S

...

Kevin Marsh requested via standard name mailing list. Note that this process can take some considerable time.

...

Kevin Marsh requested via standard name mailing list

 

Discussion about time coordinates

NOTE: The SPECS approach (two 1D time coordinates) has been chosen for the "providers" convention

 

The encoding of multiple time coordinates requires particular consideration. An explicit example of the structure is given below.

Example of encoding data with multiple time axes

  
double forecast_reference_time(forecast_reference_time) ;
       forecast_reference_time:bounds = "forecast_reference_time_bnds" ;
       forecast_reference_time:units = "hours since 1970-01-01 00:00:00" ;
       forecast_reference_time:standard_name = "forecast_reference_time" ;
       forecast_reference_time:calendar = "gregorian" ;
double leadtime(leadtime) ;
       leadtime:bounds = "leadtime_bnds" ;
       leadtime:units = "hours" ;
       leadtime:standard_name = "forecast_period" ;
       leadtime:calendar = "gregorian" ;
double time(forecast_reference_time,leadtime) ;
       time:axis = "T" ;
       time:bounds = "time_bnds" ;
       time:units = "hours since 1970-01-01 00:00:00" ;
       time:standard_name = "time" ;
float temp(forecast_reference_time,leadtime,pressure,latitude,longitude) ;
       temp:units = "K" ;
       temp:standard_name = "air_temperature" ;
       temp:coordinates = "time" ;
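
To make the relation between the three time variables concrete, the sketch below fills the two-dimensional time variable as forecast_reference_time plus leadtime, both expressed in hours as in the example above. The function names and the sample start dates are hypothetical, and plain Python lists stand in for the netCDF arrays.

```python
from datetime import datetime, timedelta

EPOCH = datetime(1970, 1, 1)  # matches "hours since 1970-01-01 00:00:00"


def hours_since_epoch(moment):
    """Convert a datetime to hours since the reference epoch."""
    return (moment - EPOCH) / timedelta(hours=1)


def build_time_grid(reference_times, leadtimes):
    """Return time[i][j] = reference_times[i] + leadtimes[j], in hours."""
    return [[ref + lead for lead in leadtimes] for ref in reference_times]


# Hypothetical start dates and lead times, for illustration only.
reference_times = [
    hours_since_epoch(datetime(2015, 11, 1)),
    hours_since_epoch(datetime(2015, 12, 1)),
]
leadtimes = [0.0, 24.0, 48.0]
time_grid = build_time_grid(reference_times, leadtimes)
```

Each row of time_grid corresponds to one start date and each column to one lead time, which is the shape declared for time(forecast_reference_time,leadtime).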

Francisco Doblas-Reyes I interpret this as the time coordinates forming a hypercube in which there could be missing data. This won't be consistent with the CMIP files, and I still find it confusing unless there is a discussion about what to do with the missing data.

Eduardo Penabad: Wouldn't that be solved by clarifying that different variables within the same file could potentially have different time coordinates/dimensions?

Francisco Doblas-Reyes Not sure. To simplify, assume a file holds a single variable with data for two start dates, one with three forecast time steps and the other with only two. The time dimensions will then be forecast_reference_time=2, leadtime=3, but one of the values of temp() will be missing, unless I have misunderstood the model.
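
The case described in the comment above can be sketched as follows: two start dates with three and two forecast steps are padded to a 2 x 3 array, leaving one missing value. The function name pad_to_hypercube, the fill value and the temperatures are hypothetical; how missing data should actually be handled is the open question in this discussion.

```python
FILL_VALUE = 1.0e20  # hypothetical missing-value marker


def pad_to_hypercube(series_by_start_date, fill_value=FILL_VALUE):
    """Pad each start date's series to the longest lead-time length."""
    n_leadtimes = max(len(series) for series in series_by_start_date)
    return [
        list(series) + [fill_value] * (n_leadtimes - len(series))
        for series in series_by_start_date
    ]


# Two start dates: three forecast steps, then only two (made-up values in K).
temp = pad_to_hypercube([[280.1, 280.4, 280.9], [279.8, 280.2]])
missing_count = sum(row.count(FILL_VALUE) for row in temp)
```

Here temp has dimensions forecast_reference_time=2, leadtime=3, and the last lead time of the second start date holds the fill value.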

Antonio S. Cofino Gonzalez: discussion on multi-time dimension data

...