Contributors:  Jacqueline Bannwart (University of Zurich), Inés Dussailant (University of Zurich), Frank Paul (University of Zurich), Michael Zemp (University of Zurich)

Issued by: UZH / Frank Paul

Date: 12/10/2023

Ref: C3S2_312a_Lot4.WP1-PDDP-GL-v2_202306_A_PQAD-v5_i1.1

Official reference number service contract: 2021/C3S2_312a_Lot4_EODC/SC1


History of modifications

| Iteration | Date | Description of modification | Chapters / Sections |
|---|---|---|---|
| i1.0 | 22/06/2023 | Update of document version 4 to version 5; the document now also covers RGI 7.0. No major changes to the text. | All |
| i1.1 | 12/10/2023 | Document amended in response to independent review | All |

List of datasets covered by this document


| Deliverable ID | Product title | Product type (CDR, ICDR) | CDS version number | Comment | Delivery date |
|---|---|---|---|---|---|
| WP2-FDDP-A-CDR-v4 | Glacier Area – CDR v4.0 | CDR | v6.0 | Brokered from RGI 6.0 | 31/12/2021 |
| WP2-FDDP-A-CDR-v4 | Glacier Area Raster – CDR v4.0 | CDR | v6.0 | Created from RGI 6.0 | 31/12/2022 |
| WP2-FDDP-A-CDR-v5 | Glacier Area – CDR v5.0 | CDR | v7.0 | Brokered from RGI 7.0 | 31/12/2023 |
| WP2-FDDP-A-CDR-v5 | Glacier Area Raster – CDR v5.0 | CDR | v7.0 | Created from RGI 7.0 | 31/12/2023 |

Related documents

| Reference ID | Document |
|---|---|
| [RD1] | Paul, F. et al. (2024) C3S Glacier Area Version 7.0: Product User Guide and Specification (PUGS). Document ref. C3S2_312a_Lot4.WP2-FDDP-GL-v2_202312_A_PUGS-v5_i1.2 |
| [RD2] | Paul, F. et al. (2024) C3S Glacier Area: Product Quality Assessment Report (PQAR). Document ref. C3S2_312a_Lot4.WP2-FDDP-GL-v2_202312_A_PQAR-v5_i1.2 |

Acronyms 

| Acronym | Definition |
|---|---|
| ASTER | Advanced Spaceborne Thermal Emission and Reflection Radiometer |
| C3S | Copernicus Climate Change Service |
| CDR | Climate Data Record |
| CDS | Climate Data Store |
| csv | Comma separated values |
| DEM | Digital Elevation Model |
| ECV | Essential Climate Variable |
| GCOS | Global Climate Observing System |
| GIS | Geographic Information System |
| GLIMS | Global Land Ice Measurements from Space initiative |
| ICDR | Interim Climate Data Record |
| PQAR | Product Quality Assessment Report |
| RGI | Randolph Glacier Inventory |
| SPOT | Satellites Pour l'Observation de la Terre |

General definitions 

Brokered dataset: A dataset that is made available in the Climate Data Store (CDS) but is also freely available (under given license conditions) from external sources. In the case of the glacier distribution service, the Randolph Glacier Inventory (RGI) is brokered for the CDS from https://glims.org/RGI under a CC-BY 4.0 license.

Debris-cover: Debris on a glacier is usually composed of unsorted rock fragments with highly variable grain size (from mm to several m). These might cover the ice in lines of variable width separating ice with origin in different accumulation regions of a glacier (so called medial moraines) up to a complete coverage of the ablation region. Automated mapping of glacier ice is only possible when the debris is not covering the ice completely when compared to the pixel size of the satellite image, i.e. some clean ice must be visible too.

Glacier area: The area (or size) of a glacier, usually given in the unit km². Also used by the Global Climate Observing System (GCOS) to name the related Essential Climate Variable (ECV) product.

Glacier outline: A vector dataset with polygon topology marking the boundary of a glacier.

Glacier inventory: A compilation of glacier outlines with associated attribute information.

Glacier complex: A contiguous ice mass that is the result of the binary (yes/no) glacier classification after conversion from raster to vector format. Usually, the glacier complexes are divided into individual glaciers by digital intersection with a vector layer containing hydrologic divides derived from watershed analysis of a digital elevation model (DEM).

Geographic Information System (GIS): A software to visualize, process and edit spatial datasets in vector and raster format.

Scope of the document

This document is the Product Quality Assurance Document (PQAD) for the Copernicus Climate Change Service (C3S) glacier distribution service. It describes the validated products (glacier outlines) and the datasets and methods used for validation and uncertainty assessment, and it provides an overview of the uncertainties derived for the dataset in the Climate Data Store (CDS), the Randolph Glacier Inventory (RGI).

Executive summary

The glacier distribution datasets (RGI v5.0, v6.0 and v7.0), as delivered to the CDS, consist of a global compilation of glacier outlines derived from airborne and spaceborne sensors or maps. Each dataset in this compilation has been quality checked and corrected by the analyst providing it, usually by on-screen digitizing to remove systematic errors of the classification. The classical concept of validation as applied to global satellite products (e.g. Müller 2014 and references therein, Wan et al. 2004) therefore cannot be applied. Instead, the uncertainty of the glacier outlines is determined from a range of methods that can be ranked by the workload required to apply them.

In this document we first provide an overview of the validated products and of what ‘validation’ means in this context (Section 1), before describing the characteristics of the datasets used for validation in Section 2. In Section 3, the main part, we introduce the various sources of uncertainty and describe the methods used to determine uncertainty. A collective summary of the uncertainty assessment for datasets in the RGI is provided in Section 4; the uncertainties for the datasets produced by C3S are provided in the PQAR [RD2].

1. Validated products

As a general remark, we have to distinguish (A) the product that is provided to the Climate Data Store (CDS) from (B) the products that are created by the C3S glacier distribution service. The dataset provided to the CDS is the latest version of the Randolph Glacier Inventory (RGI). For each of the roughly 215,000 glaciers globally, it contains exactly one outline and related attribute information (e.g. topographic data) in an open vector format (shapefile), along with information about glacier hypsometry (area-elevation distribution) in an additional csv file (see the Product User Guide and Specification [RD1] for details). The dataset is extracted from the Global Land Ice Measurements from Space (GLIMS) glacier database (https://glims.org), which is multi-temporal and might contain several outlines for the same glacier, usually from different points in time. The GLIMS database contains datasets provided by the glacier mapping community, which uses a range of methods and source data (e.g. satellite images, aerial photography, topographic maps) to derive them. These outlines are thus very diverse in quality and characteristics. The products created by the C3S glacier distribution service cover glacierized sub-regions (e.g. glaciers on Baffin Island) and are also submitted to GLIMS, from where they may be extracted for the RGI (https://glims.org/RGI, see footnote 1). We are thus a part of the glacier mapping community feeding the GLIMS database.

We also have to clarify that glacier outlines are in general not validated in a traditional sense. This is because appropriate validation data (i.e. higher resolution datasets from about the same date) are seldom available or too expensive. Hence, only measures determining uncertainties (random errors) are usually applied and reported (Section 3). On the other hand, all glacier outlines are quality checked and corrected against the satellite data from which they are derived. This correction of omission and commission errors removes the systematic errors of automated glacier mapping (this is not required for fully manual digitizing). For example, lakes may be misclassified as glaciers and have to be removed, whereas debris-covered glacier parts or regions in shadow might have been missed and have to be added. Hence, all glacier outlines (in the RGI and as provided to GLIMS) are validated and corrected against reference data, i.e. the images they are derived from. This ‘standard validation’ removes systematic errors, but does not yield a quantitative measure.

Validation against external datasets (e.g. higher resolution imagery) is applied at three levels:

(1) visual inspection to improve interpretation (for outline correction);
(2) if orthorectified, direct digitizing of outlines (for outline correction);
(3) independent digitizing to create a reference dataset for 'real' validation, i.e. accuracy assessment.

Points (1) and (2) are part of the standard validation and only (3) can be seen as a ‘real’ validation, at least technically (whether the datasets are really appropriate for validation is another issue). All glacier outlines are subject to a ‘standard validation’, but the effect is in general not quantified. On the other hand, a ‘real’ validation is only rarely performed and results might be very specific for the region. What remains to be reported for all datasets are uncertainties, e.g. those resulting from variability in digitizing when interpreting glacier features. A range of methods has been developed for this purpose (see Section 3) and results are usually reported in related publications. For the RGI as a whole we present generalized results of the uncertainty assessment in Section 4 and, for the datasets created by C3S, uncertainties are reported in the PQAR [RD2].

1 URL resources last viewed on 22nd June 2023.

2. Description of validating datasets

 As described in Section 1, the datasets used for the standard validation are:

(a) the satellite images used for creating the original glacier outlines (usually Landsat, Sentinel-2 or Advanced Spaceborne Thermal Emission and Reflection Radiometer (ASTER), sometimes also Satellites Pour l'Observation de la Terre (SPOT) and other multispectral optical sensors);
(b) higher resolution datasets available from Google Earth, Bing, and national mapping agencies (for improved visual interpretation);
(c) orthorectified and geocoded images provided by web-map services for direct digitizing.

The orthorectified images from (a) and (c) can be directly imported into a Geographic Information System (GIS) and used as a background for correcting the glacier outlines. The satellite images used to create glacier outlines and perform the standard validation differ in spatial resolution (in general 10 to 30 m), spectral bands and radiometric resolution. This means that the visibility of details and the possibilities for contrast enhancement vary from sensor to sensor. Moreover, mapping conditions (e.g. clouds, shadow, seasonal snow) change from image to image and the analysts performing the corrections have differing experience. As a result, the corrected / validated datasets also differ in quality. The quality can be slightly improved using visual inspection of higher resolution images (b), but only corrections using type (c) images can improve outline quality substantially, at least when the available images are of good quality (e.g. regarding snow and cloud conditions).

The major benefit of type (c) images is their much higher spatial resolution (up to 30 cm), which allows a much better interpretation of glaciological and geomorphological features, in particular of debris-covered glaciers. However, there are also disadvantages: the image is fixed and can have adverse snow conditions; other band combinations or contrast enhancement cannot be applied; and sometimes the geolocation is shifted or has artefacts. This might limit their applicability for individual glaciers or smaller regions. If, however, high quality datasets are available that match the acquisition date of the satellite images used to create the outlines, they can be used for direct digitizing of glacier extents and/or for creating an independent reference dataset that can then be used for a ‘real’ validation (e.g. Paul et al. 2013, Andreassen et al. 2022). For the datasets created by C3S, we will report details of the validation datasets used along with the results of the uncertainty assessment in the PQAR [RD2].

3. Description of product validation methodology

As mentioned in Section 1, glacier outlines are in general not validated (against a reference dataset), but instead an uncertainty assessment is performed. Uncertainty has three main sources: 

(a) the geo-location uncertainty,
(b) the digitizing uncertainty and
(c) the interpretation uncertainty.

The causes and consequences of these (and further) uncertainty sources are summarized in Table 1 and described in detail in this section. Figure 1 illustrates how the methods used to determine uncertainty are connected to the above sources of uncertainty, along with some further details.

Table 1: Overview of the uncertainty sources when digitizing glacier extents and their impact on glacier size.

| Uncertainty source | Examples | Consequences |
|---|---|---|
| Digitizing | Each digitization by the same analyst will place the outline at a different place | A 3-5% variability of the resulting glacier area can be expected |
| Interpretation | Different analysts will interpret the features to be included differently | A 5-10% variability in the resulting glacier area can be expected |
| Identification | Excluding debris-covered ice | If missed, glaciers can be too small by up to 50% |
| Image conditions | Snow cover or clouds might hide the glacier perimeter, ice in shadow might be difficult to identify | Snow cover could increase glacier size by 50% or more, clouds/shadow can hide glaciers completely |
| Methodological differences | Purpose dependent, e.g. rock glaciers, perennial snow fields, steep accumulation regions might be included or excluded | Highly variable, but could be 50% of the area |


As the geo-location uncertainty (a) has no direct impact on the derived glacier area (but large impacts when the outlines are used together with other geocoded datasets), we focus here on the digitizing (b) and interpretation (c) uncertainties. Both are calculated using a range of methods listed below under points (1) to (4). They are used by us in C3S as well as by the analysts that have provided glacier outlines to the RGI. We have ranked the available methods by the effort required to perform the assessment, from (1) low to (4) high. The low-effort methods are:

(1) using an uncertainty value derived by earlier, more detailed studies (e.g. Paul et al. 2013);
(2) the so-called 'buffer method'.


Method (1) typically assumes an overall uncertainty of the derived glacier areas of 3 to 5% and method (2) calculates a potential minimum and maximum area with a buffer of ±½ or ±1 image pixel from the existing outlines. This buffer value has been derived from more detailed investigations that placed outlines derived by various analysts on top of each other to reveal the variability in interpretation (Figure 2).
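The buffer calculation of method (2) can be sketched in a few lines. The following is a minimal illustration only, not the implementation used for the RGI or by C3S: it assumes outlines in projected coordinates (metres) and replaces the true geometric buffer by the first-order approximation "perimeter times buffer distance", which ignores corner effects. The function names and the example glacier are hypothetical.

```python
import math


def polygon_area_perimeter(coords):
    """Return (area, perimeter) of a simple polygon.

    coords: list of (x, y) vertices in metres, not closed
    (the last vertex is connected back to the first one).
    """
    twice_area = 0.0
    perimeter = 0.0
    n = len(coords)
    for i in range(n):
        x1, y1 = coords[i]
        x2, y2 = coords[(i + 1) % n]
        twice_area += x1 * y2 - x2 * y1            # shoelace formula
        perimeter += math.hypot(x2 - x1, y2 - y1)  # edge length
    return abs(twice_area) / 2.0, perimeter


def buffer_uncertainty(coords, buffer_m):
    """Approximate the buffer method: min/max area for a +/- buffer.

    Assumption: the area change is approximated as perimeter * buffer,
    which is only a first-order estimate of a true geometric buffer.
    """
    area, perimeter = polygon_area_perimeter(coords)
    delta = perimeter * buffer_m
    return {
        "area": area,
        "min_area": area - delta,
        "max_area": area + delta,
        "relative": delta / area,
    }


# Hypothetical glacier: a 1 km x 1 km square, buffer of half a 30 m pixel.
square_1km = [(0, 0), (1000, 0), (1000, 1000), (0, 1000)]
result = buffer_uncertainty(square_1km, buffer_m=15.0)
print(f"relative area uncertainty: {result['relative']:.1%}")  # 6.0%
```

In practice a GIS buffer operation on the real outlines would be used instead of this approximation; the sketch only shows how the minimum and maximum areas follow from the chosen buffer width.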

Whereas the buffer method works well on individual glaciers, it overestimates uncertainties for glacier complexes when these are already split into individual glaciers, as shared boundaries do not contribute to the uncertainty. Hence, internal ice divides should be removed before the buffer method is applied. Apart from this, the method shows a strongly increasing uncertainty towards smaller glaciers, as the fraction of perimeter pixels increases. As smaller glaciers produce larger uncertainties with this method, this aspect needs to be considered when assessing small numbers of glaciers together, as the higher uncertainty from the smaller glaciers can skew the mean uncertainty of the whole sample. The same applies when the sample is dominated by larger glaciers, but then in the other direction as larger glaciers produce smaller uncertainties. Hence, the buffer method will give a realistic mean value for a large sample, but for individual glaciers uncertainties can be smaller or larger.
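The size dependence described above can be made explicit with a simple idealisation. Assuming, purely for illustration, a circular glacier of radius r, area A and perimeter P, and a buffer distance d that is small compared to r, the relative area uncertainty from the buffer method is approximately:

```latex
\frac{\Delta A}{A} \approx \frac{P\,d}{A}
  = \frac{2\pi r\,d}{\pi r^{2}}
  = \frac{2d}{r}
  = 2d\sqrt{\frac{\pi}{A}}
```

Under this assumption the relative uncertainty scales with the inverse square root of the area. With d = 15 m (half of a 30 m pixel), the expression gives about 17% for A = 0.1 km², about 5% for A = 1 km² and about 1.7% for A = 10 km². Real glaciers are less compact than a circle, so their perimeter-to-area ratio and hence their buffer-based uncertainty will generally be higher than these idealised values.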

Two further methods are applied to determine uncertainties:

(3) independent multiple digitising of a small glacier sample, and
(4) comparison with glacier extents obtained from higher resolution datasets.


Whereas method (4) provides a measure of accuracy, method (3) provides the digitizing uncertainty (b) when performed by the same person and the interpretation uncertainty (c) when applied by different persons to the same sample of glaciers. Method (3) can be applied independently of validation data and is likely the most accurate method to estimate uncertainty as it considers the performance of the analyst(s) responsible for correcting the outlines and thus introducing the uncertainties. In Figure 2a we show an example of the digitizing uncertainty and in Figure 2b of the interpretation uncertainty where five analysts corrected the same sample of glaciers. As can be seen, the latter has a larger variability (and thus uncertainty) than the former. The digitizing was performed on the original 10 m resolution Sentinel-2 images used for an updated alpine-wide glacier inventory (Paul et al. 2020).
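How an uncertainty value can be derived from a multiple digitizing experiment (method (3)) is sketched below. This is a minimal example under stated assumptions: the standard deviation of the repeated area values, relative to their mean, is taken as the uncertainty measure (one common choice, not necessarily the one used by every contributor to the RGI), and the area values are hypothetical.

```python
from statistics import mean, stdev


def digitizing_uncertainty(areas_km2):
    """Return (mean area, standard deviation, relative standard deviation).

    areas_km2: areas of the same glacier from repeated digitizations,
    either by one analyst (digitizing uncertainty) or by several
    analysts (interpretation uncertainty).
    """
    mean_area = mean(areas_km2)
    std_area = stdev(areas_km2)  # sample standard deviation (n - 1)
    return mean_area, std_area, std_area / mean_area


# Hypothetical areas (km2) from five digitizing rounds of one glacier.
rounds = [2.00, 2.10, 1.90, 2.05, 1.95]
mean_area, std_area, relative = digitizing_uncertainty(rounds)
print(f"mean area: {mean_area:.2f} km2, uncertainty: {relative:.1%}")
```

Running the same function once on the rounds of a single analyst and once on the results of different analysts gives the digitizing and the interpretation uncertainty, respectively.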


Figure 1: Methods for uncertainty assessment and how they are connected to uncertainty sources.


Figure 2: Multiple digitizing experiment for a sample of glaciers in southern Switzerland by a) the same analyst (each coloured line represents one round of digitizing) and b) by five different analysts (all lines refer to their respective third digitizing). Image width is 12.2 km, north is up (Copernicus Sentinel data 2015).

Even when reference datasets are available for method (4), their manual digitising introduces uncertainty; the area value that is finally used to determine the accuracy is therefore the mean of at least three (better five) independent digitisations. Figure 3 shows an example of such a validation with a reference dataset. As this requires considerable extra work (in particular for a larger sample of glaciers), method (4) has only rarely been used for accuracy assessment (e.g. Andreassen et al. 2022, Fischer et al. 2014). On the other hand, method (3) is increasingly used by the community (Fischer et al. 2014, Guo et al. 2015, Paul et al. 2020). Which of the four methods presented here has been used for a given dataset in the RGI can be determined from the respective publications (but not all of them include uncertainty information). This effort has been made collectively for the paper describing the RGI (Pfeffer et al. 2014) and we report in Section 4 on the results of this overarching assessment.
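The accuracy calculation of method (4) can be illustrated as follows. This is a sketch under assumptions, not a prescribed procedure: the reference area is taken as the mean of at least three independent digitisations on the high-resolution image, and accuracy is expressed as the relative difference of the satellite-derived area to that mean. All numbers and names are hypothetical.

```python
from statistics import mean


def area_accuracy(mapped_area_km2, reference_areas_km2):
    """Relative difference of a mapped area to the mean reference area.

    mapped_area_km2: area derived from the satellite image.
    reference_areas_km2: independent digitisations of the same glacier
    on a higher resolution dataset (at least three are required here).
    """
    if len(reference_areas_km2) < 3:
        raise ValueError("at least three reference digitisations needed")
    reference_mean = mean(reference_areas_km2)
    return (mapped_area_km2 - reference_mean) / reference_mean


# Hypothetical values (km2): three reference digitisations, one mapped area.
reference = [0.50, 0.52, 0.48]
rel_diff = area_accuracy(0.51, reference)
print(f"relative area difference: {rel_diff:+.1%}")
```

A positive value indicates that the satellite-derived outline is larger than the reference, a negative value that it is smaller.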

 
Figure 3: Accuracy assessment for a small glacier in the Swiss Alps. The white outline is derived from 30 m Landsat TM data (this is the length of the shortest line segment), whereas the coloured outlines represent different digitisations of the glacier extent by different analysts using the high-resolution (50 cm) aerial image shown in the background (North is at top). Variability in the coloured outlines is, in general, not more than 1 Landsat pixel in width (image taken from Paul et al. 2013).

For individual glaciers, or even for entire regions, the RGI contains much larger uncertainties than those introduced by the 'standard validation' when correcting regions in shadow or ice under debris cover (see Table 1). Likely the largest one on a regional scale relates to image conditions: when scenes with adverse snow conditions are used for glacier mapping, snow rather than glacier ice is mapped and glacier extent is strongly overestimated (commission error). This is still a major issue for glacier outlines in the Andes (see Figure 4) and, to a lesser extent, in most other regions. On a somewhat smaller scale, local clouds might cover the real glacier extent in all available images, leading to a local underestimation (omission error) of glacier area if not corrected. Ice and snow in shadow can be either missed or wrongly added when image contrast is not sufficient. On a local scale (individual glaciers), missed debris cover (omission error) and inclusion of pro-glacial lakes (commission error) create the largest uncertainties in glacier area. Andreassen et al. (2022) present several examples of the impacts of clouds, snow, shadow and lakes on glacier classification. Apart from incorrectly mapped snow cover, the errors introduced by the other uncertainty sources might average out at a somewhat larger scale, so that the total glacier-covered area for a larger region might still be correct.

Figure 4: Wrong glacier areas in Bolivia due to adverse snow conditions in RGI6 (black lines). The red outlines mark the real (very small) glaciers. The background image is a subset from scene 3-71 acquired by Landsat 5 on 26.05.1998 in false colours. Image width is 18.5 km, north is at top. Image: earthexplorer.usgs.gov (URL resource last viewed 22nd June 2023).

A final class of uncertainties is related to methodological differences. These are difficult to quantify as they are differences of opinion rather than real errors or uncertainties, i.e. large differences in glacier area can occur without either result being wrong. Typical examples are the inclusion of rock glaciers (which are often difficult to discriminate from debris-covered glaciers) or the interpretation of perennial snowfields (with possible ice underneath) at high elevations in the Himalaya (near mountain crests) or at low elevations in polar regions (in topographically protected niches). Whether or not these features are considered can have a regional impact on the mapped total glacier area. In this regard, it is also important to consider that glacier inventories are often created for a specific purpose. Whereas rock glaciers and perennial snow fields might be considered for an overall hydrologic assessment (water resources), it would be better to exclude them when the purpose is detection of climate change impacts or a strict glaciological assessment (glaciers must flow by definition whereas rock glaciers creep). In view of the datasets provided to the CDS (the RGI 5.0 and 6.0), one has to be aware that the glacier outlines are a mix of all of the above. They include poorly mapped regions (missing debris-covered glaciers, including lakes), outlines including seasonal and perennial snow, as well as rock glaciers, missed regions in shadow or under clouds, and glaciers missing their accumulation regions. The work in C3S will improve on these issues.

4. Summary of validation results

In this section we provide results of an overarching accuracy assessment for the baseline dataset (the RGI) provided to the CDS. This assessment summarises the results from the individual studies that contributed datasets to the global product (Pfeffer et al. 2014). The results obtained for the datasets created by C3S2 will be presented in the PQAR [RD2].

For the mapped glacier areas in the RGI, the increase in uncertainty towards smaller glaciers is clearly visible in Figure 5. The graph has been created from published estimates of uncertainty for single glaciers and glacier complexes. As described above, most studies have used the buffer method for uncertainty assessment, which tends to give values that are too high for small glaciers. However, for glaciers larger than 1 km², uncertainties are in general <5%.

Figure 5: Published estimates of the uncertainty of area measurements of single glaciers (diamonds) and collections of glaciers (dots). Solid line: best-fitting relationship between measured area and its standard error. Dashed line: relationship adopted for estimation of RGI errors (from Pfeffer et al. 2014).

References

Andreassen, L.M., Nagy, T., Kjøllmoen, B., Leigh, J.R. (2022). An inventory of Norway's glaciers and ice-marginal lakes from 2018–19 Sentinel-2 data. Journal of Glaciology 1–22. https://doi.org/10.1017/jog.2022.20 (last accessed 22nd June 2023)

Fischer, M., Huss, M., Barboux, C. and Hoelzle, M. (2014). The new Swiss Glacier Inventory SGI2010: Relevance of using high-resolution source data in areas dominated by very small glaciers. Arctic, Antarctic and Alpine Research, 46, 935-947. DOI: 10.1657/1938-4246-46.4.933

Guo, W.Q., S.Y. Liu, J.L. Xu, L.Z. Wu, D.H. Shangguan, X.J. Yao, J.F. Wei, W.J. Bao, P.C. Yu, Q. Liu and Z.L. Jiang (2015). The second Chinese glacier inventory: data, methods and results. Journal of Glaciology, 61(226), 357-372. DOI: 10.3189/2015JoG14J209

Müller, R. (2014). Calibration and Verification of Remote Sensing Instruments and Observations. Remote Sensing, 6(6), 5692–5695. DOI: 10.3390/rs6065692

Paul, F., N. Barrand, E. Berthier, T. Bolch, K. Casey, H. Frey, S.P. Joshi, V. Konovalov, R. Le Bris, N. Mölg, G. Nosenko, C. Nuth, A. Pope, A. Racoviteanu, P. Rastner, B. Raup, K. Scharrer, S. Steffen and S. Winsvold (2013). On the accuracy of glacier outlines derived from remote sensing data. Annals of Glaciology, 54(63), 171-182. DOI: 10.3189/2013AoG63A296

Paul, F., Rastner, P., Azzoni, R.S., Diolaiuti, G., Fugazza, D., Le Bris, R., Nemec, J., Rabatel, A., Ramusovic, M., Schwaizer, G., and Smiraglia, C. (2020). Glacier shrinkage in the Alps continues unabated as revealed by a new glacier inventory from Sentinel-2. Earth System Science Data, 12(3), 1805-1821. DOI: 10.5194/essd-12-1805-2020

Pfeffer, W. T., Arendt, A. A., Bliss, A., Bolch, T., Cogley, J. G., Gardner, A.S., … Sharp, M. J. (2014). The Randolph Glacier Inventory: a globally complete inventory of glaciers. Journal of Glaciology, 60(221), 537-552. DOI: 10.3189/2014JoG13J176

RGI Consortium (2017). Randolph Glacier Inventory – A Dataset of Global Glacier Outlines: Version 6.0. GLIMS Technical Report, 71 pp., available at: http://glims.org/RGI/00_rgi60_TechnicalNote.pdf (last accessed 22nd June 2023)

Wan, Z., Zhang, Y., Zhang, Q. and Li, Z.-L. (2004). Quality assessment and validation of the MODIS global land surface temperature. International Journal of Remote Sensing, 25(1), 261-274. DOI: 10.1080/0143116031000116417 


This document has been produced in the context of the Copernicus Climate Change Service (C3S).

The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Contribution agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.

The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which merely represents the author's view.
