Contributors: Ole Einar Tveito (MET Norway) and Cristian Lussana (MET Norway)

Issued by: Ole Einar Tveito (MET Norway) and Cristian Lussana (MET Norway)

Issued Date: 31/03/2022

Ref: M311_Lot3.3.1.2_NGCD_PUG_ver2.1

Official reference number service contract: C3S2_311 Lot3

History of modifications

Version | Date | Description
1.0 | 08/12/2021 | First version
2.1 | 31/03/2022 | Update to version 22.03

List of datasets covered by this document

Deliverable ID | Product title | Product type | Version Number | Delivery date
M311_Lot3.3.1.2-2021/10 | NGCD | Observational gridded dataset | 21.03 | 31/03/2021
M311_Lot3.3.1.2-2022/03 | NGCD | Observational gridded dataset | 22.03 | 31/03/2022

Related documents

Reference ID | Document
D1 | NGCD Algorithm Theoretical Basis Document
D2 | Climate and Forecast (CF) Conventions and Metadata; http://cfconventions.org
D3 | R Core Team (2017). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/

Acronyms 

Acronym | Definition
CDS | Climate Data Store
MET Norway | The Norwegian Meteorological Institute
FMI | Finnish Meteorological Institute
KNMI | The Royal Netherlands Meteorological Institute
SMHI | The Swedish Meteorological and Hydrological Institute
NGCD | Nordic Gridded Climate Dataset (DOI: https://doi.org/10.24381/cds.e8f4a10c)
NGCD-1 | NGCD type 1 datasets
NGCD-2 | NGCD type 2 datasets
seNorge | Observational gridded dataset over Norway (senorge.no)
ECA&D | European Climate Assessment & Dataset
TITAN | Software for automatic quality control (TITAN and titanlib)
OI | Optimal Interpolation
RMSE | Root Mean Squared Error

General definitions

Symbol | Definition
TG | Daily mean temperature (from day before the date in the timestamp at 06 UTC, to date in the timestamp at 06 UTC)
TX | Daily maximum temperature (from day before the date in the timestamp at 18 UTC, to date in the timestamp at 18 UTC)
TN | Daily minimum temperature (from day before the date in the timestamp at 18 UTC, to date in the timestamp at 18 UTC)
RR | Daily precipitation total (from day before the date in the timestamp at 06 UTC, to date in the timestamp at 06 UTC)
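
The aggregation windows defined above can be made explicit with a small helper. The following is only an illustrative sketch in Python (NGCD itself is written in R); the function name aggregation_window and the dictionary WINDOW_END_HOUR_UTC are hypothetical names introduced here, not part of any NGCD software.

```python
from datetime import date, datetime, time, timedelta, timezone

# End of the 24-hour window (hour, UTC) for each variable, taken from the
# General definitions: TG and RR end at 06 UTC, TX and TN end at 18 UTC.
WINDOW_END_HOUR_UTC = {"TG": 6, "RR": 6, "TX": 18, "TN": 18}


def aggregation_window(variable, stamp):
    """Return (start, end) of the 24-hour window for a timestamp date.

    `variable` is one of TG, TX, TN, RR; `stamp` is the date in the timestamp.
    The window runs from the day before `stamp` to `stamp`, at the hour
    listed in WINDOW_END_HOUR_UTC.
    """
    hour = WINDOW_END_HOUR_UTC[variable]
    end = datetime.combine(stamp, time(hour), tzinfo=timezone.utc)
    return end - timedelta(days=1), end


start, end = aggregation_window("RR", date(2021, 1, 10))
print(start.isoformat(), "->", end.isoformat())
# 2021-01-09T06:00:00+00:00 -> 2021-01-10T06:00:00+00:00
```

Note that, with these definitions, the temperature extremes (TX, TN) and the mean temperature (TG) carrying the same timestamp refer to windows that are offset by 12 hours.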

Data access information

Description | Link
The historical archive for different versions is made available to users via the CDS | https://cds.climate.copernicus.eu/cdsapp#!/dataset/insitu-gridded-observations-nordic
The data are also available to users via MET Norway OPeNDAP access | https://thredds.met.no/thredds/catalog/ngcd/catalog.html
MET Norway. Historical archive ver. 22.03 (for different versions, replace 22.03 with the correct label) | https://thredds.met.no/thredds/catalog/ngcd/version_22.03/catalog.html
MET Norway. Provisional archive | https://thredds.met.no/thredds/catalog/ngcd/provisional/catalog.html

The list of Known issues is available at the following link.

Scope of the document

This document is the user guide for the NGCD observational gridded dataset produced under the service contract C3S2_311 Lot3 (Collection and processing of in situ observations - Access to high-resolution gridded datasets over Europe based on in situ observations) on behalf of Copernicus.

The main aim of this document is to aid the user in understanding the features and limitations of the data, and then to enable them to read and use the data.

Executive summary

NGCD is an observational gridded dataset covering Fennoscandia (Finland, Norway and Sweden) based on in-situ observations only. The variables included in the dataset are (see General definitions): daily mean temperature, TG; daily maximum temperature, TX; daily minimum temperature, TN; and daily total precipitation, RR.

NGCD consists of two independent datasets, NGCD-1 and NGCD-2, derived using different spatial interpolation methods applied to the same input observation dataset. The description of the input data and the methods is available in the NGCD Algorithm Theoretical Basis Document [D1]. NGCD programs are written in the R language [D3]. The NGCD algorithms and scripts are available at github.com/metno/NGCD.

The products are provided on a regular grid, using a Lambert Azimuthal Equal Area coordinate reference system, with a spacing of 1 km in both Easting and Northing directions. For each day, 8 fields are provided: 4 with NGCD-1 methods (i.e. one for each variable) and 4 with NGCD-2 methods. Each field is stored in a separate file and the data files are in netCDF-4 format.

NGCD is fully updated twice a year, in March and September. Each update yields a new version, which is labelled as Year.Month (e.g. the update in March 2022 yields ver. 22.03). This document has been updated to NGCD ver. 22.03.

Each version is made up of two archives: i) the historical archive and ii) the provisional archive.

In the case of ver. 22.03:

  • The historical archive covers the 51-year period that ranges from 1971 to 2021. Any changes made in post-production on the historical archive are reported on the List of Known issues and/or in the "Known issues" section on the NGCD page on the MET Norway thredds server.
  • The provisional archive includes NGCD-2 files only. It begins with files of the 1st of January 2022 and is regularly updated every day, such that some of the files (i.e. usually the most recent ones) may change from day to day, without any particular warnings. The products are obtained using the same methods as for the historical archive of NGCD-2. However, the observations used as input data are retrieved from the open data application programming interfaces of: FMI, MET Norway and SMHI. The provisional archive for the period from January to June 2022 will be replaced by the historical archive of the next NGCD version.

The data can be found at the links specified in the Data access information. The main description of NGCD is in the Product information section. Then, Appendix A reports the evaluation made for ver. 18.03, which shares the same methods used for ver. 22.03. Appendix B contains examples of the file structures.

Product information

Product description

The description of the input data and the methods is available in the NGCD Algorithm Theoretical Basis Document [D1]. As specified in [D1], the user must be aware that the NGCD input data are non-homogenized time series.

The time series of the number of stations used for the production of NGCD ver. 22.03 are shown in Figures 1-4 for RR, TG, TN and TX, respectively. The number of RR stations in the region decreases from 2400 stations-per-day in 1971 to 1500 stations-per-day in 2020 (≅ -38%). For TG, the situation is the opposite and the number of stations grows after 2010, from 800 stations-per-day in 2000 to 1300 stations-per-day in 2020 (≅ +60%). The main reason for the increase is the inclusion of sub-regional networks over Norway, managed by Norwegian public institutions. Note that the number of stations used in the production of TG after 2010 has a larger day-to-day variability than before. TX and TN undergo a gradual decrease in the number of stations from 1971 to 1994, which is followed by a turnaround with gradual growth from 1995 onwards. Then, for 2021, the input datasets align, rather abruptly, with that used for TG. Between 1971 and 2020, the relative variations for TX and TN are more limited than for TG, with the minimum number around 700 stations-per-day and the maximum around 950 stations-per-day (≅ +35%). In 2021, there are more stations available than in 2020, for all variables.

Figure 1: Daily precipitation total (RR): monthly time series of the number of stations used in the production of NGCD ver 22.03 from January 1971 to December 2021. For each month, the number of stations shown is the median of the stations available daily.

Figure 2: Daily mean temperature (TG): monthly time series of the number of stations used in the production of NGCD ver 22.03 from January 1971 to December 2021. For each month, the number of stations shown is the median of the stations available daily.

Figure 3: Daily minimum temperature (TN): monthly time series of the number of stations used in the production of NGCD ver 22.03 from January 1971 to December 2021. For each month, the number of stations shown is the median of the stations available daily.

Figure 4: Daily maximum temperature (TX): monthly time series of the number of stations used in the production of NGCD ver 22.03 from January 1971 to December 2021. For each month, the number of stations shown is the median of the stations available daily.

The spatial distribution of the observing stations over the domain is shown in Figures 5-7 for RR, TG and TN, respectively. The distribution for TX is similar to that for TN and therefore is not shown. For RR and TG, the two panels on the top row show the situation when the number of stations is close to the minimum available within the period (“sparse” observational network), while the two panels on the bottom row show the opposite situation, when the maximum number of stations is used. For TN, in Figure 7, the stations are shown for 2020 (“sparse”) and 2021 (“dense”). The panels in the left columns show the map with the distribution of stations over the domain. In the right columns, the panels are used to display the observational coverage as a function of elevation.

For RR, the number of stations decreases over the years and the impact is clearly visible both in the map and, especially, over the range of elevations not covered properly by the observational network. In the north, more than half of the elevation range (i.e. the higher elevations) is not covered by the network. In the case of TG, the increase in the number of stations is concentrated over Norway and the range of elevations is more uniformly sampled than for RR. For TN (and TX), the situation is similar to TG, except that until 2021 the observational network over Norway is less dense. In 2021, the observational network is similar for all temperature variables.

Figure 5: Daily precipitation total (RR): spatial distribution of the observing stations used in the production of NGCD when the observational network consists of a smaller number of stations (“sparse” observational network, top row) and a larger number of stations (“dense” observational network, bottom row) with respect to the time series of available observations (see Figure 1). The left column shows maps over the domain while the right column shows the elevations of the stations (blue dots) as a function of their Northing coordinates. As a reference in the background, the gray dots are the elevations of the cells on the 1 km digital elevation model over Fennoscandia.

Figure 6: Daily mean temperature (TG): spatial distribution of the observing stations used in the production of NGCD when the observational network consists of a smaller number of stations (“sparse” observational network, top row) and a larger number of stations (“dense” observational network, bottom row) with respect to the time series of available observations (see Figure 2). The left column shows maps over the domain while the right column shows the elevations of the stations (red dots) as a function of their Northing coordinates. As a reference in the background, the gray dots are the elevations of the cells on the 1 km digital elevation model over Fennoscandia.

Figure 7: Daily minimum temperature (TN): spatial distribution of the observing stations used in the production of NGCD when the observational network consists of a smaller number of stations (“sparse” observational network, top row) and a larger number of stations (“dense” observational network, bottom row) with respect to the time series of available observations (see Figure 3). The left column shows maps over the domain while the right column shows the elevations of the stations (green dots) as a function of their Northing coordinates. As a reference in the background, the gray dots are the elevations of the cells on the 1 km digital elevation model over Fennoscandia.

Data usage information

File naming convention

The data files are in netCDF-4 format and follow the CF conventions, see [D2] in Section Related documents.

For the historical archive, the file names have the format:

NGCD_<Var>_type<Id_type>_version_<ver>_<Date>.nc

Where:

  • <Var> is one of: RR, TG, TX and TN
  • <Id_type> is either 1 or 2
  • <ver> is the version label in the format Year.Month (e.g. 22.03)
  • <Date> is in the form YYYYMMDD

For the provisional archive, the file names have the format:

NGCD_<Var>_type2_version_<ver>_prov_<Date>.nc

where the meaning of the variable parts inside the symbols <...> is the same as for the historical archive. It is worth remarking that: i) the provisional files are made for NGCD-2 only; ii) there is reference to a version, which indicates the methodology used in the data production, though the source of the observations is different to that in the historical archive of the same version.
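
The two naming patterns above can be built and parsed programmatically. The sketch below is one possible way to do this in Python; the helper names (version_label, ngcd_filename, parse_ngcd_filename) are hypothetical and are not part of the NGCD scripts. It assumes the version label is always two digits for the year and two for the month, as in the example 22.03.

```python
import re
from datetime import date, datetime

VARIABLES = ("RR", "TG", "TX", "TN")


def version_label(year, month):
    """Year.Month label, e.g. (2022, 3) -> '22.03'."""
    return f"{year % 100:02d}.{month:02d}"


def ngcd_filename(var, id_type, ver, day, provisional=False):
    """Build a file name following the historical or provisional pattern."""
    if var not in VARIABLES:
        raise ValueError(f"unknown variable: {var}")
    if id_type not in (1, 2):
        raise ValueError("Id_type must be 1 or 2")
    if provisional and id_type != 2:
        raise ValueError("provisional files exist for NGCD-2 only")
    prov = "_prov" if provisional else ""
    return f"NGCD_{var}_type{id_type}_version_{ver}{prov}_{day:%Y%m%d}.nc"


# Assumption: <ver> is always of the form NN.NN.
_PATTERN = re.compile(
    r"^NGCD_(?P<var>RR|TG|TX|TN)_type(?P<id_type>[12])"
    r"_version_(?P<ver>\d{2}\.\d{2})(?P<prov>_prov)?_(?P<date>\d{8})\.nc$"
)


def parse_ngcd_filename(name):
    """Split a file name into its variable parts; raise ValueError if invalid."""
    m = _PATTERN.match(name)
    if m is None or (m["prov"] and m["id_type"] != "2"):
        raise ValueError(f"not an NGCD file name: {name}")
    return {
        "var": m["var"],
        "id_type": int(m["id_type"]),
        "ver": m["ver"],
        "provisional": m["prov"] is not None,
        "date": datetime.strptime(m["date"], "%Y%m%d").date(),
    }


print(ngcd_filename("TG", 1, version_label(2022, 3), date(2021, 5, 30)))
# NGCD_TG_type1_version_22.03_20210530.nc
```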

Data format

The key fields provided in this product are as given in Table 1.

Table 1: Key data fields in the output files.

Variable Name | Description
lon | longitudes of the grid points
lat | latitudes of the grid points
projection_laea | specification of the coordinate reference system
time_bounds | time bounds of the aggregated variable
TG / TX / TN / RR | daily variable in the file

The data are provided on a single layer, near the surface, and on a regular grid covering Finland, Norway and Sweden. The grid is masked outside the domain and over the sea, where no in-situ observations are available, apart from a buffer extending over the sea for a few kilometers. The coordinate reference system is the Lambert Azimuthal Equal Area projection and the grid has a resolution of 1 km in both the Easting and Northing directions. The dimension of the data field is 1550 in the Easting and 2020 in the Northing. The spatial domain is shown in Figures 5-7.

When downloading files from the CDS, users obtain one file for each: day, variable and NGCD-type requested. The fields also have a time dimension, which always has a length of one.
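
As a rough planning aid, the grid dimensions and the one-file-per-day/variable/type layout translate into the following arithmetic. This is a sketch only: the function name requested_files is hypothetical, and the count assumes that no day is missing from the archive.

```python
from datetime import date, timedelta
from itertools import product

# Grid dimensions stated above (1 km spacing in both directions).
N_EASTING, N_NORTHING = 1550, 2020
VARIABLES = ("TG", "TX", "TN", "RR")
NGCD_TYPES = (1, 2)


def requested_files(first, last, variables=VARIABLES, types=NGCD_TYPES):
    """List the (day, variable, type) combinations of a request.

    One file is expected for each combination, assuming no gaps.
    """
    n_days = (last - first).days + 1
    days = [first + timedelta(days=i) for i in range(n_days)]
    return list(product(days, variables, types))


# Grid cells per field, including masked cells outside the domain.
print(N_EASTING * N_NORTHING)  # 3131000

# Files for three days, all variables, both types: 3 * 4 * 2.
print(len(requested_files(date(2021, 1, 1), date(2021, 1, 3))))  # 24
```

Since each file holds a single daily field with a time dimension of length one, building a time series requires concatenating the files of interest along the time dimension.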

Product content examples

Examples of NGCD products on two generic days are shown in Figures 8-10.

The RR fields for 10 January 2021 are shown in Figure 8. NGCD-1 is based on triangulation and the precipitation is adjusted for local effects in mountainous regions, based on elevation. NGCD-2 reconstructs a more continuous precipitation field than NGCD-1, without elevation adjustments, so its RR fields look generally smoother. For both types, the values in data-sparse regions are representative of larger-scale precipitation than those in data-dense regions, where the reconstructed field variability is usually higher.

Figure 8: Daily precipitation totals (RR, mm) for 10 January 2021: NGCD-1 on the left; NGCD-2 on the right.

Figures 9-10 show TG, TN and TX for a spring day, 30 May 2021.


Figure 9: Daily mean temperature (TG, °C) for 30 May 2021: NGCD-1 on the left; NGCD-2 on the right.

Figure 10: Daily minimum and maximum temperatures (TN top row, TX bottom row, °C) for 30 May 2021: NGCD-1 in the left column; NGCD-2 in the right column.

Data usage acknowledgments

All users of NGCD must provide clear and visible attribution to the Copernicus programme and are asked to cite and reference the dataset provider. Acknowledge according to the licence to use Copernicus Products.

Cite NGCD as indicated on the link to "Citation" under References on the Overview page of NGCD.

MET Norway data

The Norwegian data is freely available from MET Norway via frost.met.no.

ECA&D

We acknowledge the data providers in the ECA&D project. Klein Tank, A.M.G. and Coauthors, 2002. Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. of Climatol., 22, 1441-1453. Data and metadata available at https://www.ecad.eu

Appendix A - Evaluation of version 18.03

The evaluations presented in this Appendix for Temperature and Precipitation are based on NGCD version 18.03 and the data used covers the 46-year period from 1971 to 2016.

Temperature

The strategy used for evaluation is cross-validation by means of a leave-one-out procedure. In this way, NGCD is evaluated against independent observations, which are used as reference values. NGCD values are interpolated to the observation locations using a bilinear interpolation. The score considered in the evaluation is the root mean square error (RMSE) based on the deviations between estimated and reference values.
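
The scoring step described above (bilinear interpolation of a gridded field to a station location, followed by the RMSE) can be sketched as follows. This is not the NGCD evaluation code: the function names bilinear and rmse, the list-of-lists field layout and the grid origin arguments are assumptions made for illustration. The leave-one-out part, i.e. re-running the analysis without the station being evaluated, is not shown.

```python
import math


def bilinear(field, x0, y0, dx, dy, x, y):
    """Bilinear interpolation on a regular grid.

    field[j][i] is the value at Easting x0 + i*dx and Northing y0 + j*dy.
    Points outside the grid are extrapolated from the nearest edge cell.
    """
    fx = (x - x0) / dx
    fy = (y - y0) / dy
    # Index of the lower-left corner of the enclosing cell, kept in range.
    i = min(max(math.floor(fx), 0), len(field[0]) - 2)
    j = min(max(math.floor(fy), 0), len(field) - 2)
    tx = fx - i
    ty = fy - j
    lower = (1 - tx) * field[j][i] + tx * field[j][i + 1]
    upper = (1 - tx) * field[j + 1][i] + tx * field[j + 1][i + 1]
    return (1 - ty) * lower + ty * upper


def rmse(estimates, references):
    """Root mean square error of estimates against reference values."""
    if len(estimates) != len(references) or not estimates:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimates, references)]
    return math.sqrt(sum(squared) / len(squared))


# Toy 2x2 field on a 1 km grid (coordinates in metres, values in °C).
field = [[0.0, 1.0], [2.0, 3.0]]
print(bilinear(field, 0.0, 0.0, 1000.0, 1000.0, 500.0, 500.0))  # 1.5
```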

In Figures A1-A3, the RMSE for the daily minimum temperature (TN) is used to compare the two NGCD types. In Figure A1, the results have been aggregated over space and time to show the mean seasonal cycle. In Figures A2-A3, the results are aggregated over time at each observation location, such that the spatial patterns are highlighted.

Figure A1 shows that NGCD-2 has the lowest RMSE, in particular during the cold season when small-scale spatial variability of the temperature fields is higher than for the rest of the year (e.g. temperature inversions occur more often during winter). The differences between the two types are smaller during the summer months.

Figure A2 shows the RMSE spatial variability in January, while Figure A3 shows the same quantity for August. In January, both NGCD-types have the smallest RMSE values over the flat regions in Southern Sweden and Finland. Furthermore, RMSE is low along the coasts. The RMSE is larger in the mountainous regions. NGCD-2 shows smaller RMSE than NGCD-1 in these regions. In August (Figure A3), the RMSE values are smaller than during winter. For NGCD-1, there are a few stations that have a rather high RMSE. The results for NGCD-2 show less variability and there are fewer stations with large RMSE. For both types, the stations with the largest RMSEs are spatially scattered indicating that their locations and observed minimum temperatures are influenced by local effects, which are not reflected in the regional minimum temperature signal.


Figure A1: Daily minimum temperature (TN): boxplots of the mean seasonal cycle, for the 46-year period from 1971 to 2016, of mean RMSE (°C) based on leave-one-out cross-validation for NGCD-1 (black boxes) and NGCD-2 (red boxes).

Figure A2: Daily minimum temperature (TN): RMSE (°C) station-by-station averaged over all January months for the 46-year period from 1971 to 2016: NGCD-1 (left) and NGCD-2 (right).

Figure A3: Daily minimum temperature (TN): RMSE (°C), same as Figure A2 but for the month of August.

The analysis of the RMSE for daily maximum temperature (TX) is shown in Figures A4-A6. In Figure A4, the seasonal variations of the RMSE for TX show a different pattern than for TN (see Figure A1). In the case of TX, the RMSE is higher in summer than during spring or autumn. NGCD-2 shows smaller RMSE values than NGCD-1 in all months. For NGCD-1, RMSE is highest in winter. For NGCD-2, the RMSE medians have comparable values during winter and summer, though the corresponding inter-quartile ranges (i.e. the box widths) are larger in winter.

The spatial variations of RMSE for TX are shown in Figures A5-A6 and they show a similar pattern as for TN in Figures A2-A3. The largest RMSEs occur in the inland and/or mountain regions of Norway, Sweden and Finland. In these regions the RMSE variability is also greater than elsewhere. The flat regions in southern Sweden and Finland have low RMSE values. Overall, NGCD-2 shows lower RMSEs than NGCD-1 both in winter (Figure A5) and in spring (Figure A6).

Figure A4: Daily maximum temperature (TX): boxplots of the mean seasonal cycle, for the 46-year period from 1971 to 2016, of mean RMSE (°C) based on leave-one-out cross-validation for NGCD-1 (black boxes) and NGCD-2 (red boxes).

Figure A5: Daily maximum temperature (TX): RMSE (°C) station-by-station averaged over all January months for the 46-year period from 1971 to 2016: NGCD-1 (left) and NGCD-2 (right).

Figure A6: Daily maximum temperature (TX): RMSE (°C), same as Figure A5 but for the month of April.

In Figures A7-A9, the RMSE comparison is performed for the daily mean temperature (TG), similarly to that above for TN and TX.

The results we found for TG are quite similar to those for TN and TX. In Figure A7, RMSE shows the same strong seasonal cycle we found for TN in Figure A1, though for TG the RMSE values are smaller than for TN. NGCD-2 is generally performing better (i.e. with a smaller RMSE) than NGCD-1. The spatial variability reveals the same patterns as for TN and TX. In Figure A8, it is shown that during winter, in continental and mountainous parts of the domain, the RMSE is high and there are large spatial variations over small distances. On the other hand, Figure A9 shows that in late summer the RMSE is characterized by low values and small spatial variations. Once again, NGCD-2 is generally performing better (i.e. with a smaller RMSE) than NGCD-1.


Figure A7: Daily mean temperature (TG): boxplots of the mean seasonal cycle, for the 46-year period from 1971 to 2016, of mean RMSE (°C) based on leave-one-out cross-validation for NGCD-1 (black boxes) and NGCD-2 (red boxes).

Figure A8: Daily mean temperature (TG): RMSE (°C) station-by-station averaged over all January months for the 46-year period from 1971 to 2016: NGCD-1 (left) and NGCD-2 (right).

Figure A9: Daily mean temperature (TG): RMSE (°C), same as Figure A8 but for August.

Precipitation

For precipitation the cross-validation study is carried out on a selection of 50 stations (Figure A10), which have not been used in the production of NGCD. This cross-validation approach differs from the one used for temperature because all 50 stations are reserved for evaluation simultaneously and not one at a time, as in leave-one-out cross-validation. The estimates are compared with the independent observations by means of standard verification scores like probability of detection (POD), false alarm rate (FAR, also known as probability of false detection or POFD), equitable threat score (ETS) and bias score. We refer to the page of the 7th International Verification Methods Workshop for the score definitions.

The POD, FAR, ETS and bias score are dimensionless quantities. They are used for the evaluation of dichotomous (yes/no) predictions, therefore they have been applied to events like "precipitation is higher than X mm", where X is the daily precipitation amount (i.e. mm/day) used as threshold for the event definition.
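
For readers unfamiliar with these scores, the sketch below computes them from a 2x2 contingency table using the standard textbook formulas for dichotomous verification. It is not the NGCD evaluation code; the function names contingency and scores are hypothetical, and the false alarm rate is computed as the probability of false detection, in line with the definition given above. A score whose denominator is zero is returned as NaN.

```python
def contingency(estimates, observations, threshold):
    """Count hits, misses, false alarms and correct negatives for the
    event "precipitation is higher than `threshold` mm"."""
    hits = misses = false_alarms = correct_negatives = 0
    for est, obs in zip(estimates, observations):
        est_yes, obs_yes = est > threshold, obs > threshold
        if est_yes and obs_yes:
            hits += 1
        elif obs_yes:
            misses += 1
        elif est_yes:
            false_alarms += 1
        else:
            correct_negatives += 1
    return hits, misses, false_alarms, correct_negatives


def _ratio(num, den):
    return num / den if den else float("nan")


def scores(hits, misses, false_alarms, correct_negatives):
    """Return POD, FAR (as POFD), ETS and bias score as a dictionary."""
    total = hits + misses + false_alarms + correct_negatives
    # Number of hits expected by random chance.
    hits_random = _ratio((hits + misses) * (hits + false_alarms), total)
    return {
        "POD": _ratio(hits, hits + misses),
        "FAR": _ratio(false_alarms, false_alarms + correct_negatives),
        "ETS": _ratio(hits - hits_random,
                      hits + misses + false_alarms - hits_random),
        "bias": _ratio(hits + false_alarms, hits + misses),
    }


# Made-up values (mm/day) for illustration only, not NGCD data.
est = [0.0, 2.0, 6.0, 0.2, 12.0, 0.0]
obs = [0.0, 3.0, 4.0, 1.5, 0.0, 0.0]
print(scores(*contingency(est, obs, threshold=1.0)))
```

A bias score of 1 means that the event is estimated as often as it is observed; values below 1 indicate that the event is estimated less often than observed.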



Figure A10: The station locations used for the cross-validation of precipitation are marked with red dots. The black dots show the stations used to produce precipitation in NGCD version 18.03.

In Figure A11, the POD is shown. The boxplot medians show that NGCD-1 has a higher hit rate (i.e. it more often correctly predicts a "yes" event) than NGCD-2, except for the highest threshold of 25 mm. It is worth remarking that the POD spread for type 1 is larger than for type 2. In Figure A12, it is shown that the risk of having false alarms (i.e. observed "no" events incorrectly predicted as "yes") is higher for NGCD-2 than for NGCD-1, for all thresholds. The ETS measures the fraction of observed events that were correctly predicted, adjusted for hits associated with random chance. Figure A13 shows that NGCD-1, on average, performs better than NGCD-2 in terms of ETS. Once again, the ETS boxplots for NGCD-2 are characterized by a narrower spread than those of NGCD-1, which indicates that the performance of NGCD-2 is more stable than that of NGCD-1. The bias, shown in Figure A14, is a measure of the accuracy of the spatial analysis method. NGCD-1 underestimates precipitation for all thresholds, while NGCD-2 shows a good fit (i.e. median close to 1) for the smaller thresholds and underestimates the larger precipitation values. NGCD-2 also shows less spread for the large values of precipitation.

Referring to the boxplot medians shown in Figures A11-A14, NGCD-1 often performs better than NGCD-2. We conclude that NGCD-1 provides more accurate estimates of the independent, validating observations than NGCD-2. A possible explanation is that the triangulation technique used for NGCD-1 is a more local interpolation method than the method used for NGCD-2, which results in a larger degree of spatial smoothing.


Figure A11: Probability of detection (POD, dimensionless) of daily precipitation totals (RR) for events exceeding given thresholds (0.1, 0.5, 1, 5, 10 and 25 mm) for NGCD-1 (black boxes) and NGCD-2 (red boxes).

Figure A12: Probability of false detection (FAR, dimensionless) of daily precipitation totals (RR) for events exceeding given thresholds (0.1, 0.5, 1, 5, 10 and 25 mm) for NGCD-1 (black boxes) and NGCD-2 (red boxes).

Figure A13: Equitable threat score (ETS, dimensionless) of daily precipitation totals (RR) for events exceeding given thresholds (0.1, 0.5, 1, 5, 10 and 25 mm) for NGCD-1 (black boxes) and NGCD-2 (red boxes).

Figure A14: Bias score (dimensionless) of daily precipitation totals (RR) for events exceeding given thresholds (0.1, 0.5, 1, 5, 10 and 25 mm) for NGCD-1 (black boxes) and NGCD-2 (red boxes).

Appendix B - Example file structure

Example file structure can be seen directly from the web-browser at the following URLs.

References

Klein Tank, A. M., Wijngaard, J. B., Können, G. P., Böhm, R., Demarée, G., Gocheva, A., Mileta, M., Pashiardis, S., Hejkrlik, L., Kern-Hansen, C., Heino, R., Bessemoulin, P., Müller-Westermeier, G., Tzanakou, M., Szalai, S., Pálsdóttir, T., Fitzgerald, D., Rubin, S., Capaldo, M., Maugeri, M., Leitass, A., Bukantis, A., Aberfeld, R., van Engelen, A. F., Forland, E., Mietus, M., Coelho, F., Mares, C., Razuvaev, V., Nieplova, E., Cegnar, T., Antonio López, J., Dahlström, B., Moberg, A., Kirchhofer, W., Ceylan, A., Pachaliuk, O., Alexander, L. V. and Petrovic, P. (2002), Daily dataset of 20th-century surface air temperature and precipitation series for the European Climate Assessment. Int. J. Climatol., 22: 1441-1453. doi:10.1002/joc.773

This document has been produced in the context of the Copernicus Climate Change Service (C3S).

The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation agreement signed on 11/11/2014). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.

The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which is merely representing the author's view.
