Below is a list of the possible technical solutions identified so far for archiving time-series datasets. New proposals and links to subject matter experts are still welcome.

Currently the NetCDF format has been chosen as the first candidate for the storage of the TIGGE time-series data. The related technical work (choosing the exact file structure, defining metadata, writing the maintenance scripts, etc.) is in progress.

 

Relational database (e.g. PostgreSQL)

pros:

  • flexible
  • conditional verification
  • SQL queries
  • easy to implement

cons:

  • possibly very slow and hard to maintain, as the estimated annual increase is around 100 TB
  • probably a need for a tool that prepares a common output data file for users from the database values
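
As an illustration of what "conditional verification" with SQL queries could look like for the relational database option, here is a minimal sketch. It uses Python's built-in sqlite3 module with an in-memory database as a stand-in for PostgreSQL so that it runs without a server. The table name, the column names and the sample values are all hypothetical and do not reflect any agreed TIGGE schema.

```python
import sqlite3

# In-memory SQLite is only a stand-in for PostgreSQL in this sketch.
conn = sqlite3.connect(":memory:")

# Hypothetical table: one row per station, valid time and forecast step,
# holding a forecast 2 m temperature and the matching observation (in K).
conn.execute(
    """
    CREATE TABLE forecast (
        station TEXT,
        valid   TEXT,
        step_h  INTEGER,
        fc_t2m  REAL,
        ob_t2m  REAL
    )
    """
)

# Made-up sample values, for illustration only.
rows = [
    ("A", "2020-01-01T00:00", 24, 271.0, 270.0),
    ("A", "2020-01-02T00:00", 24, 275.0, 277.0),
    ("B", "2020-01-01T00:00", 24, 280.0, 281.0),
    ("B", "2020-01-02T00:00", 24, 268.0, 272.0),
]
conn.executemany("INSERT INTO forecast VALUES (?, ?, ?, ?, ?)", rows)


def mean_abs_error(connection, max_obs_kelvin):
    """Return (mean absolute error, case count) for the cases in which
    the observed temperature is below max_obs_kelvin."""
    return connection.execute(
        "SELECT AVG(ABS(fc_t2m - ob_t2m)), COUNT(*) "
        "FROM forecast WHERE ob_t2m < ?",
        (max_obs_kelvin,),
    ).fetchone()


# Verification conditioned on freezing observations only.
mae_cold, n_cold = mean_abs_error(conn, 273.15)
print(f"cold cases: n={n_cold}, MAE={mae_cold:.2f} K")
```

The condition lives entirely in the WHERE clause, so changing the verification subset (by station, step or threshold) needs no change to the stored data. Whether such queries stay fast at the data volumes mentioned above is exactly the open question listed under the cons.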

ODB

pros:

  • full support at ECMWF
  • can be stored in ECMWF MARS and use all its capability
  • SQL like queries
  • compression

cons:

  • not a widely known format?
  • not very flexible if any recomputation is needed

NetCDF

pros:

  • well known format
  • compression

cons:

  • no ECMWF MARS support
  • not very flexible if any recomputation is needed

BUFR

pros:

  • fast access
  • full support at ECMWF - new API in preparation
  • can be stored in ECMWF MARS and use all its capability (?)
  • compression

cons:

  • a special setup would probably be required if any data policy applied to external users (one file cannot contain all information)
  • not very flexible if any recomputation is needed

Other possibilities

  • some sort of GRIB?


