At ECMWF, GRIB is the default file format for data distribution and the only formally supported format. Data can also be accessed in NetCDF format, but NetCDF is not formally supported by ECMWF.

For this reason, check before downloading data whether your processing software supports the GRIB format. If it does, use GRIB. If your software supports only NetCDF, use NetCDF.

Basics

NetCDF (Network Common Data Form) is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. NetCDF is commonly used to store and distribute scientific data. The NetCDF software was developed at the Unidata Program Center in Boulder, Colorado, USA (Unidata NetCDF Factsheet; see also the Wikipedia article). NetCDF files usually have the extension .nc. To read NetCDF files there are graphical tools such as Matlab, IDL, ArcGIS, NCView and Xconv, and developer tools such as the Unidata NetCDF4 module for Python and Xarray.

For climate and forecast data stored in NetCDF format there are (non-mandatory) conventions on metadata (CF Convention). CF compliant metadata in NetCDF files can be interpreted by Metview, NCView, Xconv and other tools.

The latest version of the NetCDF format is NetCDF 4 (aka NetCDF enhanced, introduced in 2008), but NetCDF 3 (NetCDF classic) is also still widely used.

NetCDF files can be converted to ASCII text, or even to MS Excel using e.g. netcdf4excel.

Writing your own NetCDF decoder or encoder

To decode NetCDF files, Unidata provides an official NetCDF Application Programming Interface (API) with interfaces in Fortran, C, C++, and Java. The API also comes with some useful command-line tools (e.g. ncdump -h file.nc prints the file header, i.e. a summary of the dimensions, variables and attributes in the file - see ncdump guide).

For writing NetCDF files, please consult the Unidata "Best Practices" chapter (sections 6.8 Packed Data Values and 6.9 Missing Data Values are of particular interest).

Scale_factor and Add_offset

The scale_factor and add_offset attributes in NetCDF files are a data packing mechanism that reduces the storage space needed. The data values in the NetCDF files have been packed into short integers (16 bits, NC_SHORT) to save space. Each NetCDF variable that has been packed in this way has an 'add_offset' and a 'scale_factor' attribute associated with it.
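
To illustrate how such a pair of attributes might be chosen, here is a minimal sketch in plain Python. It assumes one common choice, namely mapping the minimum and maximum of the data onto the full range of a signed 16-bit integer; this is an assumption for illustration and not necessarily how the files described here were encoded. The function name `packing_parameters` and the sample temperature range are hypothetical, and no packed value is reserved for missing data.

```python
def packing_parameters(data_min, data_max, n_bits=16):
    """Return (scale_factor, add_offset) that map [data_min, data_max]
    onto the full range of a signed integer with n_bits bits.

    Assumed convention (illustrative only): data_min maps to the most
    negative packed value and data_max to the most positive one.
    """
    scale_factor = (data_max - data_min) / (2 ** n_bits - 1)
    add_offset = data_min + 2 ** (n_bits - 1) * scale_factor
    return scale_factor, add_offset


# Hypothetical example: temperatures between 200 K and 330 K
scale_factor, add_offset = packing_parameters(200.0, 330.0)

# Rounding to the nearest packed integer changes a value by at most
# half a packing step, which is the price paid for the saved space.
max_rounding_error = scale_factor / 2
```

With these assumptions the packing step is about 0.002 K, so a 16-bit packed variable resolves the 130 K range in 65535 steps.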

When reading and writing NetCDF files, software applications compliant with Unidata specifications should deal with scale_factor and add_offset automatically, making unpacking (read) and packing (write) completely transparent to the user. This means the user always sees the unpacked data values and does not have to deal with scale_factor and add_offset. The software application might display the values of scale_factor and add_offset for reference, much as ZIP compression software displays the compression ratio.
For example:

  • Matlab (ncread, ncwrite) applies Scale_factor and Add_offset automatically
  • Panoply applies Scale_factor and Add_offset automatically. It also displays the values of Scale_factor and Add_offset, which leads many users to believe they have to apply them by hand. They do not.
  • Metview applies Scale_factor and Add_offset automatically from version 5 onwards. Metview 4.x does not.
  • The Unidata NetCDF4 module for Python (which is an interface to the NetCDF C library) applies Scale_factor and Add_offset automatically

The above is how application software should be implemented, i.e. to show unpacked data values. Some software applications might be implemented differently and display the packed data values. In this case the user has to calculate the unpacked values using Scale_factor and Add_offset, using these formulae:

  • unpacked_data_value = packed_data_value * scale_factor + add_offset
  • packed_data_value = nint((unpacked_data_value - add_offset) / scale_factor)
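
For software that shows packed values, the two formulae above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used by any particular library: the function names `nint`, `unpack_value` and `pack_value` and the sample attribute values are hypothetical, and `nint` is written out explicitly to round to the nearest integer with halves rounded away from zero.

```python
import math


def nint(x):
    """Nearest integer, with halves rounded away from zero."""
    sign = 1 if x >= 0 else -1
    return sign * int(math.floor(abs(x) + 0.5))


def unpack_value(packed, scale_factor, add_offset):
    """Convert a packed integer back to a physical value."""
    return packed * scale_factor + add_offset


def pack_value(unpacked, scale_factor, add_offset):
    """Convert a physical value to its packed integer."""
    return nint((unpacked - add_offset) / scale_factor)


# Hypothetical attribute values, as they might be read from a variable
scale_factor = 0.01
add_offset = 273.15

packed = pack_value(280.0, scale_factor, add_offset)
restored = unpack_value(packed, scale_factor, add_offset)
```

Because packing rounds to an integer, a round trip through `pack_value` and `unpack_value` returns the original value only to within half of scale_factor.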

In any case we recommend you check your processing software's documentation on how it deals with Scale_factor and Add_offset.
