Contributors: L. Carrea (University of Reading), C.J. Merchant (University of Reading), B. Calmettes (CLS)

Issued by: L. Carrea, C.J. Merchant

Date: 21/04/2021

Ref: C3S_312b_Lot4_D2.LK.1-v3.0_LSWT_Product_Quality_Assurance_Document_i1.0

Official reference number service contract: 2018/C3S_312b_Lot4_EODC/SC2


History of modifications

| Issue | Date | Description of modification | Author |
| --- | --- | --- | --- |
| i0.1 | 28/01/2021 | The present document was modified based on the document with deliverable ID: C3S_312b_Lot4_D2.LK.1-v2.0_202001_Product_Quality_Assurance_Document_LSWT_v1.0. Updated the document to include LSWT v4.2 for the CDR v3.0, adding the extension with SLSTR on Sentinel3A and Sentinel3B LSWT data. | LC |
| i1.0 | 21/04/2021 | Finalised | RK |

List of datasets covered by this document

| Deliverable ID | Product title | Product type (CDR, ICDR) | C3S version number | Public version number | Delivery date |
| --- | --- | --- | --- | --- | --- |
| D3.LK.3-v3.0 | Lake Surface Water Temperature | CDR | V3.0 | LSWT-4.2 | 31/01/2021 |

Related documents

Acronyms

| Acronym | Definition |
| --- | --- |
| ATBD | Algorithm Theoretical Basis Document |
| ATSR | Along Track Scanning Radiometer |
| AATSR | Advanced Along Track Scanning Radiometer |
| AVHRR | Advanced Very High Resolution Radiometer |
| BLI | Balaton Limnological Institute |
| C3S | Copernicus Climate Change Service |
| CARRTEL | Centre Alpin de Recherche sur les Réseaux Trophiques des Ecosystèmes Limniques |
| CCI | Climate Change Initiative |
| CDR | Climate Data Record |
| CF | Climate and Forecast |
| CLS | Collecte Localisation Satellites |
| ECV | Essential Climate Variable |
| EODC | Earth Observation Data Centre |
| EPSCOR | Established Program to Stimulate Competitive Research |
| ERS | European Remote Sensing |
| ESA | European Space Agency |
| EU | European Union |
| EUSTACE | EU Surface Temperature for All Corners of Earth |
| FOC | Fisheries and Oceans Canada |
| GCOS | Global Climate Observing System |
| GHRSST | Group for High Resolution Sea Surface Temperature |
| GLEON | Global Lake Ecological Observatory Network |
| GLERL | Great Lakes Environmental Research Lab |
| ICDR | Interim Climate Data Record |
| KDKVI | Central Transdanubian (Regional) Inspectorate for Environmental Protection, Nature Conservation and Water Management |
| KU | Katholieke Universiteit |
| L3C | Level 3 Collated |
| L3S | Level 3 Super-collated |
| L3U | Level 3 Un-collated |
| LEGOS | Laboratoire d'Etudes en Géophysique et Océanographie Spatiales |
| LK | Lake |
| LSWT | Lake Surface Water Temperature |
| LTER | Long-Term Ecological Research |
| NDBC | National Data Buoy Centre |
| NERC | Natural Environment Research Council |
| NTL | North Temperate Lakes |
| PUGS | Product User Guide and Specifications |
| QL | Quality Level |
| RSD | Robust Standard Deviation |
| SD | Standard Deviation |
| SLSTR | Sea and Land Surface Temperature Radiometer |
| SLU | Swedish University of Agricultural Sciences |
| SYKE | Finnish Environment Institute |
| UGLOS | Upper Great Lakes Observing System |
| UMR | Unité Mixte de Recherche |

General definitions

L2P – Geophysical variables derived from Level 1 source data on the Level 1 grid (typically the satellite swath projection). Ancillary data and metadata added following GHRSST Data Specification.

L3U – Level 3 Un-collated data are L2 data granules remapped to a regular latitude/longitude grid without combining observations from multiple source files. L3U files will typically be "sparse", corresponding to a single satellite orbit.

L3C – Level 3 Collated data are observations from a single instrument combined into a space-time grid. A typical L3C file may contain all the observations from a single instrument in a 24-hour period.

L3S – Level 3 Super-collated data are observations from more than one satellite that have been gridded together into a single grid-cell estimate, for those periods where more than one satellite data stream delivering the geophysical quantity has been available.

Scope of the document

This document describes the strategies used for validation and characterization of the GloboLakes Lake Surface Water Temperature product (LSWT v4.0, brokered) and the C3S CDR extensions for LSWT (generated CDR). It also describes the strategies used for quality assurance of the LSWT product prior to production.
This version describes:

  • A summary of the activities carried out for the GloboLakes LSWT-4.0 data (L3S series data). The planning and execution of the validation were conducted as part of that project.
  • The activities carried out for C3S data production, which includes monitoring of statistics of the data used within the production scheme and validation against a reference dataset.

Executive summary

The C3S Lake production system (C3S ECV LK) provides an operational service, generating lake surface water temperature and lake water level climate datasets for a wide variety of users within the climate change community. The present document covers the lake surface water temperature component.
The GloboLakes LSWT v4.0 data (brokered) and their uncertainties were assessed by that project using an in situ reference dataset collected at the end of each year. Robust statistics (median and robust standard deviation) were used when assessing differences between the reference data and the GloboLakes LSWT data, so that the results are not dominated by outliers and bad data, which arise in both the satellite and the validation data.
This Product Quality Assurance Document defines and describes the datasets, validation methods and strategies used to validate and characterize the accuracy of the Lake Surface Water Temperature product. The stability of LSWT on multi-annual scales cannot currently be assessed because no reference data of quantified stability are available.
This document describes the methodology for the first three contractual versions of the C3S LSWT Climate Data Record extension: version 1, produced in January 2019; version 2, produced in January 2020; and version 3, produced in January 2021.

1. Validated products

1.1. Product Specifications

Presently, this section relies on statements for the Lake ECV from GCOS, published literature, experience from other CDR projects, and requirements emerging from the definition of the service. The requirements will be updated in future using requirements that emerge from users of the service and their feedback, and from any user requirements survey that is undertaken in a future ESA CCI project. The user requirements are indicated in Table 1.

Table 1: User Requirements for Lake Surface Water Temperature as described in GCOS

Content of the dataset

  • Content of the main file: the data file shall contain the following information on separate layers: Lake Surface Water Temperature; a measure of the uncertainty

Spatial and temporal features

  • Spatial coverage: the target lakes shall be distributed globally based on a harmonized identification of the products. The area of the lakes must be at least 1 km x 1 km
  • Temporal coverage: time series of 10 years minimum are required
  • Temporal resolution: a daily product shall be distributed; at least a weekly product shall be distributed

Data uncertainties

  • Threshold: 1 K
  • Target: 0.25 K

Format requirements

  • Format: NetCDF, CF Convention

A detailed description of the product generation is provided in the Algorithm Theoretical Basis Document (ATBD) [D3] with further information on the product given in the Product User Guide and Specifications (PUGS) [D3.LK.5].


1.2. Available products

This document describes the validation of the NERC GloboLakes LSWT v4.0 product, including LSWTs and their evaluated uncertainty, and of the C3S LSWT CDR extensions, which include the processing of AVHRR on MetOpB from January 2017 until the end of August 2019 for CDR v2.0 and the processing of SLSTR on Sentinel3A and Sentinel3B from September 2019.
The v4.0 (scientific version number) lake surface water temperature product (combining the brokered and extension CDRs) provides a long-term climate data record (CDR) covering 1995 to 2019 (24 years) and the v4.2 lake surface water temperature product provides an extension until Oct 2020 for a total of 25 years. The thermal satellite observations were provided by the following instruments:

  • ATSR2 on ERS-2 from 1995 to 2003
  • AATSR on Envisat from 2002 to 2012
  • AVHRR on MetOpA from 2007 to 2016
  • AVHRR on MetOpA from 2017 to Aug 2019 (C3S generated product only)
  • AVHRR on MetOpB from 2017 to Aug 2019 (C3S generated product only)
  • SLSTR on Sentinel3A from Sep 2019 to 2020 (C3S generated product only)
  • SLSTR on Sentinel3B from Sep 2019 to 2020 (C3S generated product only)

This current document is applicable to the Quality Assessment activities performed on the combined dataset dated in January 2020 (contractual v2). These activities will be reported in the Product Quality Assessment Report [D4].


1.3. Parameters and units

The LSWT product consists of global files in netCDF4 format containing:

  • The best estimate of the Lake Surface Water Temperature (LSWTskin), expressed in kelvin.
  • The associated uncertainty, expressed in kelvin, which summarises the radiometric noise and the uncertainty in the retrieval; it is defined in Sec. 3.1 of the ATBD.
  • The associated quality level, which captures the confidence in the retrieval; it is defined in Sec. 3.4 of the ATBD.

The output file also includes additional information on the lake identifiers, the instruments used for the observations, and whether an inter-sensor adjustment has been applied.

2. Description of validating datasets

A match-up dataset was constructed from the in situ temperature data collected through the ARCLake project, the GloboLakes project, the EU Surface Temperature for All Corners of Earth (EUSTACE) project and, for the extensions, C3S. Currently, this dataset consists of 112 observation locations covering 38 of the GloboLakes lakes. Details of the in situ data and their sources are given in Table 2, which reports all locations on the GloboLakes lakes where there are matches. New in situ observations will be collected before the release of the Product Quality Assessment Report [D4].

Table 2: List of the in situ measurements sources for the GloboLakes lakes


| Source | Lake name (number of locations) |
| --- | --- |
| NDBC - National Data Buoy Centre (USA) | Superior (3), Huron (2), Michigan (2), Erie (1), Ontario (1) |
| FOC - Fisheries and Oceans Canada (Canada) | Superior (1), Huron (4), Great Slave (2), Erie (2), Winnipeg (3), Ontario (4), Woods (1), Saint Clair (1), Nipissing (1), Simcoe (1) |
| Michigan Technological University (USA) | Superior (2), Michigan (1) |
| University of Minnesota (USA) | Superior (2) |
| Northern Michigan University (USA) | Superior (2) |
| Superior Watershed Partnership (USA) | Superior (1) |
| U.S. Army Corps of Engineers (USA) | Superior (1) |
| GLERL - Great Lakes Environmental Research Lab (USA) | Huron (2), Michigan (1) |
| University of Wisconsin-Milwaukee (USA) | Michigan (2) |
| Northwestern Michigan College (USA) | Michigan (1) |
| University of Michigan CIGLR (USA) | Michigan (2) |
| LimnoTech (USA) | Michigan (3), Erie (4) |
| Illinois-Indiana Sea Grant and Purdue Civil Engineering (USA) | Michigan (2) |
| Irkutsk State University (Russia) | Baikal (1) |
| King's College London (UK) | Malawi (1) |
| Regional Science Consortium (USA) | Erie (1) |
| UGLOS - Upper Great Lakes Observing System (USA) | Erie (2), Douglas (1) |
| LEGOS - Laboratoire d'Etudes en Géophysique et Océanographie Spatiales (France) | Issykkul (1) |
| SLU – Swedish University of Agricultural Sciences (Sweden) | Vanern (6), Vattern (2), Malaren (10), Hjalmaren (1), Siljan (1), Bolmen (2), Roxen (1) |
| Uppsala University (Sweden) | Vanern (1), Vattern (1) |
| KU Leuven (Belgium) | Kivu (1) |
| SYKE – Finnish Environment Institute (Finland) | Saimaa (1), Paijanne (1), Pielinen (1) |
| Vermont EPSCOR - Established Program to Stimulate Competitive Research (USA) | Champlain (1) |
| SUNY Plattsburgh Center for Earth and Environmental Science (USA) | Champlain (1) |
| NIWA (New Zealand) | Taupo (1) |
| GLEON - Global Lake Ecological Observatory Network | Balaton (1) |
| BLI – Balaton Limnological Institute (Hungary) | Balaton (6) |
| KDKVI - Central Transdanubian (Regional) Inspectorate for Environmental Protection, Nature Conservation and Water Management (Hungary) | Balaton (3) |
| UMR CARRTEL – Centre Alpin de Recherche sur les Réseaux Trophiques des Ecosystèmes Limniques (France) | Geneva (1) |
| UC-Davis Tahoe Environmental Research Center (USA) | Tahoe (1) |
| Estonian University of Life Sciences (Estonia) | Vortsjarv (1) |
| Martin Dokulil (Austria) | Neusiedl (1) |
| Israel Oceanographic and Limnological Research (Israel) | Sea of Galilee (1) |
| Universidad del Valle de Guatemala (Guatemala) | Atitlan (1) |
| Università degli Studi di Perugia (Italy) | Trasimeno (1) |
| Centre for Ecology and Hydrology - Edinburgh (UK) | Lomond (1), Leven (1) |
| NTL LTER - North Temperate Lakes Long-Term Ecological Research (USA) | Mendota (1) |

The plot of the geographical distribution of the 112 sites over 38 of the 1000 GloboLakes lakes is shown in Figure 1.

As the in situ data come from a variety of sources with different formats, considerable effort has been put into consolidating them into a standard format for use in GloboLakes and into applying a quality control procedure that was partly automated and partly based on inspection. The quality control procedure was initiated within the ARCLake project and updated within GloboLakes. Moreover, the data have a range of characteristics: the measurements have been taken at different depths of up to 1 m; the temporal sampling ranges from every 15 minutes to a few times a year; for some locations the measurements are averages, while for others they have been taken at the reported time. None of the collected in situ measurements is accompanied by an uncertainty estimate.

Thus, the reference data for satellite LSWT is in a relatively unsophisticated and un-coordinated state internationally, and, as far as we are aware, the ongoing efforts at collecting as much information for LSWT validation as possible within the C3S service will be internationally significant.


Figure 1: Geographical distribution of the in situ measurement locations for the GloboLakes lakes

3. Description of product validation methodology

3.1. Overall procedure

The validation exercise consists of validating the lake surface water temperature retrieved from satellite data against independent data. The independent data, described in Section 2, are a collection of quality-controlled in situ measurements from different institutions. The validation is performed in two phases. First, in situ measurements are matched with satellite observations at L2, where the coordinates are the satellite coordinates, and a first check of the agreement between in situ and per-sensor satellite observations is performed. Then, the L3 cell corresponding to each L2 match is identified and the final validation of the L3S product is performed using robust statistics.

3.2. Generation of L2 matchup database

A per-sensor matchup is created and it contains coincident satellite and in situ data. It also provides the reference and time of the in situ location and the associated LSWTs, quality level and uncertainty from the L2 LSWT product. The matchup is created for satellite observations based on the following criteria:

  • Spatially within 3 km of the location of the in situ measurement, and
  • Temporally within 3 hours of the in situ measurement where the measurement time was available; otherwise the day was matched
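The two matchup criteria can be illustrated with a minimal Python sketch. It is not the production matchup code: the record layout (dictionaries with `lat`, `lon`, `time` and an optional `has_time` flag), the function names and the spherical-Earth haversine distance are all assumptions made for illustration.

```python
import math
from datetime import datetime, timedelta

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; a spherical Earth is assumed


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def is_match(sat, insitu, max_km=3.0, max_dt=timedelta(hours=3)):
    """Return True if a satellite observation matches an in situ record.

    Both arguments are dicts with 'lat', 'lon' (degrees) and 'time' (datetime).
    The hypothetical flag insitu['has_time'] is False when only the day of the
    in situ measurement is known; the match is then made on the calendar day.
    """
    if haversine_km(sat["lat"], sat["lon"], insitu["lat"], insitu["lon"]) > max_km:
        return False
    if insitu.get("has_time", True):
        return abs(sat["time"] - insitu["time"]) <= max_dt
    return sat["time"].date() == insitu["time"].date()


# Usage: a satellite pixel about 1.1 km from a buoy, observed 1.5 hours later
sat = {"lat": 47.00, "lon": -87.0, "time": datetime(2016, 7, 1, 16, 0)}
buoy = {"lat": 47.01, "lon": -87.0, "time": datetime(2016, 7, 1, 14, 30)}
matched = is_match(sat, buoy)
```

Whether the limits are inclusive, and how the day is defined when no measurement time is available (for example UTC versus local time), are not stated in this document and are choices of the sketch.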

The differences between the satellite LSWT and reference in situ data are analysed using both standard and robust statistics i.e. statistics that are resistant to the presence of outliers in the distribution of differences (if any). Time series of the absolute temperatures together with their difference are generated differentiating the quality levels. Moreover, per-sensor box plots of the difference are produced for each quality level.
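As an illustration of the standard and robust statistics reported per sensor and quality level, the following Python sketch computes N, median, RSD, mean and SD for a list of differences. The RSD is taken here as 1.4826 times the median absolute deviation from the median, a common definition that is an assumption of this sketch, since the present document does not define the RSD explicitly.

```python
import statistics

MAD_TO_SD = 1.4826  # scales the MAD to the SD of a normal distribution (assumed RSD definition)


def difference_stats(differences):
    """Standard and robust statistics of satellite and in situ differences (K)."""
    diffs = list(differences)
    median = statistics.median(diffs)
    mad = statistics.median([abs(d - median) for d in diffs])
    return {
        "n": len(diffs),
        "median": median,
        "rsd": MAD_TO_SD * mad,
        "mean": statistics.mean(diffs),
        "sd": statistics.stdev(diffs),  # sample SD, needs at least two values
    }


# Usage: one gross outlier barely moves the median and RSD but inflates mean and SD
stats = difference_stats([-0.3, -0.2, -0.1, 0.0, 10.0])
```

The example shows why robust statistics are preferred here: a single bad match dominates the mean and SD, while the median and RSD still describe the bulk of the distribution.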

3.3. Validation of the L3S GloboLakes LSWT v4.0 and C3S LSWT v4.2

The validation of the L3S GloboLakes LSWT v4.0 and C3S LSWT v4.2 product is carried out on the L3 cell corresponding to the L2 pixel where a match has been found. The differences between the satellite LSWT and reference in situ data are analysed using both standard and robust statistics i.e. statistics that are resistant to the presence of outliers in the distribution of differences (if any). Time series of the absolute temperatures together with their difference are generated differentiating the quality levels. Moreover, per-sensor box plots of the difference are produced for each quality level.
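Locating the L3 cell that contains an L2 match can be sketched as a simple index calculation on a regular latitude/longitude grid. The 1/120 degree cell size is the one mentioned for the L3S product in Section 4.2; the grid origin (90°S, 180°W) and the row and column ordering below are assumptions for illustration and may differ from the actual product files.

```python
import math

CELLS_PER_DEGREE = 120  # 1/120 degree grid


def l3_cell_index(lat, lon, cells_per_degree=CELLS_PER_DEGREE):
    """Return (row, col) of the grid cell containing the point (lat, lon) in degrees.

    Assumed convention: row 0 starts at 90S and col 0 starts at 180W.
    """
    if not -90.0 <= lat <= 90.0:
        raise ValueError("latitude out of range")
    lon = (lon + 180.0) % 360.0 - 180.0  # wrap longitude to [-180, 180)
    n_rows = 180 * cells_per_degree
    row = min(int(math.floor((lat + 90.0) * cells_per_degree)), n_rows - 1)
    col = int(math.floor((lon + 180.0) * cells_per_degree))
    return row, col


def l3_cell_centre(row, col, cells_per_degree=CELLS_PER_DEGREE):
    """Return the (lat, lon) of the centre of a grid cell under the same convention."""
    lat = -90.0 + (row + 0.5) / cells_per_degree
    lon = -180.0 + (col + 0.5) / cells_per_degree
    return lat, lon


# Usage: cell containing the point (0N, 0E) and the centre of that cell
row, col = l3_cell_index(0.0, 0.0)
centre_lat, centre_lon = l3_cell_centre(row, col)
```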

3.4. Validation of the uncertainty

The uncertainty of the LSWT is validated by comparing the robust standard deviation of the differences between the LSWT and the reference data with the combination of the in situ data uncertainty and the uncertainties provided with the LSWT product. Since no information about the uncertainty of the in situ measurements was provided when they were collected, the in situ (reference) uncertainty is assumed to be 0.2 K, based on experience with water temperature measurements in the context of sea surface temperature. Statistics are generated for different levels of uncertainty ascribed to the LSWT, in order to determine whether the uncertainties are valid across the full range of possible uncertainties.
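The comparison can be sketched in Python as follows. The sketch assumes that the satellite and in situ uncertainties are independent and therefore combine in quadrature, that the RSD is 1.4826 times the median absolute deviation, and that matches are grouped into bins of product uncertainty; the bin edges and function names are hypothetical.

```python
import math
import statistics

U_REF = 0.2  # K, in situ uncertainty assumed in the text


def expected_spread(u_sat, u_ref=U_REF):
    """Expected SD of the differences if both uncertainties are independent (assumption)."""
    return math.sqrt(u_sat ** 2 + u_ref ** 2)


def validate_uncertainty(differences, uncertainties, bin_edges):
    """Compare the observed RSD with the expected spread per uncertainty bin [lo, hi)."""
    rows = []
    for lo, hi in zip(bin_edges[:-1], bin_edges[1:]):
        pairs = [(d, u) for d, u in zip(differences, uncertainties) if lo <= u < hi]
        if len(pairs) < 2:
            continue  # too few matches for a meaningful spread
        diffs = [d for d, _ in pairs]
        median = statistics.median(diffs)
        rsd = 1.4826 * statistics.median([abs(d - median) for d in diffs])
        u_typical = statistics.median([u for _, u in pairs])
        rows.append({"bin": (lo, hi), "n": len(pairs), "rsd": rsd,
                     "expected": expected_spread(u_typical)})
    return rows


# Usage with made-up differences (K) and product uncertainties (K)
rows = validate_uncertainty(
    differences=[-0.2, -0.1, 0.0, 0.1, 0.2, -1.0, 0.0, 1.0],
    uncertainties=[0.1] * 5 + [0.6] * 3,
    bin_edges=[0.0, 0.3, 1.0],
)
```

Under these assumptions, product uncertainties are consistent with the matches when `rsd` is close to `expected` in every bin; an `rsd` well above `expected` would suggest underestimated uncertainties, and one well below would suggest overestimated uncertainties.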

3.5. Assessment of stability

An assessment of stability (i.e. the degree of change of the statistical distributions of error with time) is currently not possible because of the lack of any reference data whose continuity and long-term stability are well understood.

4. Summary of validation results

This section provides some key results of the LSWT product validation only for the GloboLakes part of the CDR. The detailed results for the complete CDR will be included in the Product Quality Assessment Report [D4], together with the validation of the LSWT uncertainty.


4.1. Generation of L2 matchup database and validation of the L2 GloboLakes LSWT v4.0 data

The matchup is carried out per sensor, and the checks on the L2 LSWT data are reported in this section. Table 3 reports the robust and traditional statistics per quality level and per sensor for the matches across all the lakes. The AVHRR observations show the best agreement with the in situ measurements. This may be due to the much higher number of matches resulting from its larger swath compared with the ATSRs (the ATSR swath is 500 km and the AVHRR swath is ~2900 km). However, restricting the AVHRR data to a swath similar to that of the ATSRs still confirms that the AVHRR observations agree better with the in situ measurements.
The agreement varies with quality level as expected. The best agreement is for quality levels 4 and 5, which reflect a higher degree of confidence in the validity of the uncertainty estimate. Quality level 3 data show acceptable agreement with the in situ data; however, they have to be used with care.

Table 3: Global validation statistics from comparing L2 GloboLakes LSWT with in situ measurements

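| Sensor | QL | N | Median | RSD | Mean | SD |
| --- | --- | --- | --- | --- | --- | --- |
| ATSR2 | 5 | 1428 | -0.220 | 0.489 | -0.288 | 1.522 |
| ATSR2 | 4 | 849 | -0.370 | 0.712 | -0.573 | 1.485 |
| ATSR2 | 3 | 446 | -0.610 | 1.082 | -0.897 | 1.686 |
| ATSR2 | 2 | 131 | -1.230 | 2.179 | -1.520 | 2.045 |
| ATSR2 | 1 | 742 | -3.0 | 4.114 | -3.862 | 4.341 |
| AATSR | 5 | 2758 | -0.310 | 0.445 | -0.429 | 0.987 |
| AATSR | 4 | 1828 | -0.460 | 0.667 | -0.615 | 1.288 |
| AATSR | 3 | 824 | -0.820 | 1.141 | -1.016 | 1.670 |
| AATSR | 2 | 290 | -1.765 | 1.660 | -1.882 | 1.906 |
| AATSR | 1 | 1492 | -3.5 | 4.159 | -4.298 | 4.363 |
| AVHRR-A | 5 | 9558 | -0.12 | 0.504 | -0.246 | 1.117 |
| AVHRR-A | 4 | 6024 | -0.28 | 0.801 | -0.427 | 1.36 |
| AVHRR-A | 3 | 6992 | -0.31 | 1.008 | -0.559 | 1.577 |
| AVHRR-A | 2 | 13621 | -0.28 | 1.038 | -0.552 | 1.640 |
| AVHRR-A | 1 | 8206 | -3.68 | 5.174 | -4.61 | 5.358 |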

Figure 2 shows plots of the observations for 1999 for ATSR2, 2006 for AATSR and 2016 for AVHRR on MetOpA for Lake Superior, together with the climatology as a reference. The difference between the satellite and in situ measurements is also displayed: the green line shows the difference for quality levels 3, 4 and 5, while the red line shows it for all quality levels. The plots show that quality level 3 data may be useful if treated with care by users, although in general the use of quality levels 4 and 5 is recommended.

Such plots are generated for all locations and years with in situ matches as part of validation for product quality assurance.

Figure 2: Yearly plots for site 01 on Lake Superior for the three sensors, where the colour of the dots represents the quality level.

4.2. Validation of the L3S brokered LSWT v4.0 product

Given the matchup database created at L2, the validation of the final GloboLakes LSWT product is carried out on the corresponding L3 cell. Figure 3 reports a box plot of the difference between the L3S satellite and in situ data per quality level, showing consistently good agreement for the higher quality levels. Quality level 1 and 2 data are reported in the dataset but their use is not recommended. Quality level 3 data may be usable with care (specific inspection by users), while we recommend using LSWT data with quality levels 4 and 5.
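The recommendation on quality levels translates into a simple filter on the data. The sketch below works on plain Python lists; the function name and the example values are hypothetical, and in practice the LSWT and quality level would be read from the corresponding layers of the netCDF files.

```python
def select_by_quality(lswt, quality_level, min_ql=4):
    """Keep only the LSWT values (K) whose quality level is at least min_ql."""
    return [t for t, ql in zip(lswt, quality_level) if ql >= min_ql]


# Usage: recommended selection (quality levels 4 and 5), and a looser one that
# also keeps quality level 3, which should then be inspected by the user
lswt = [280.1, 281.0, 279.5, 283.2]
quality = [5, 2, 3, 4]
recommended = select_by_quality(lswt, quality)
with_care = select_by_quality(lswt, quality, min_ql=3)
```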

Figure 3: Box plot of the difference (K) between satellite and in situ LSWT per quality level.


Figure 4: Yearly plot of L3S LSWT for one site on Lake Douglas in the US for 2012, where the colour of the dots in the right-hand plot represents the quality level.

Figure 4 shows the location of the in situ measurement site on Lake Douglas in the US, one of the smallest lakes in the GloboLakes selection. The blue dots represent the centres of the pixels in a 1/120 deg grid. The lake centre (defined in Carrea et al. (2015)) is at a maximum distance to land of 1.5 km. The L3S LSWT observations, shown according to their quality levels, are plotted together with the in situ measurements on the right-hand side of Figure 4. Their difference is also reported, together with the climatology as a reference.

5. Summary of quality assurance prior to data release (C3S extension LSWT v4.0 and LSWT v4.2)


The quality checks performed after the LSWT product is generated consist of the following steps:

  1. Time series of the observation-minus-climatology statistics are plotted: the global number of observations per day, together with the mean and standard deviation of the difference.
  2. The maximum and minimum of the LSWT and of its uncertainty are checked.
  3. Spatial plots of 5 lakes at different latitudes and of different sizes are examined.
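The first two checks can be sketched in Python as below. The record layout (day, LSWT, climatology), the function names and the plausibility limits passed to the range check are hypothetical; the document does not state the limits used in production.

```python
import statistics
from collections import defaultdict


def daily_anomaly_stats(records):
    """Per-day count, mean and SD of observation minus climatology (K).

    records is an iterable of (day, lswt, climatology) tuples.
    """
    by_day = defaultdict(list)
    for day, lswt, climatology in records:
        by_day[day].append(lswt - climatology)
    result = {}
    for day, diffs in sorted(by_day.items()):
        sd = statistics.stdev(diffs) if len(diffs) > 1 else float("nan")
        result[day] = (len(diffs), statistics.mean(diffs), sd)
    return result


def range_check(values, lower, upper):
    """Return (minimum, maximum, ok), where ok means all values lie in [lower, upper]."""
    lo, hi = min(values), max(values)
    return lo, hi, (lower <= lo and hi <= upper)


# Usage with made-up values; the limits 270 K and 315 K are illustrative only
records = [
    ("2020-01-01", 280.0, 279.0),
    ("2020-01-01", 282.0, 279.0),
    ("2020-01-02", 275.0, 276.0),
]
daily = daily_anomaly_stats(records)
lswt_range = range_check([r[1] for r in records], lower=270.0, upper=315.0)
```

The per-day tuples can be plotted as time series, so that a sudden change in the number of observations or in the mean difference from climatology stands out before the data are released.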

Newly generated data that have validation properties similar to those found for the GloboLakes LSWT v4.0 (illustrated in Section 4) are deemed to have satisfied the quality assurance and will be reported on in the Product Quality Assessment Report (PQAR).


References

Carrea, L., Embury, O. and Merchant, C. J. (2015) Datasets related to in-land water for limnology and remote sensing applications: distance-to-land, distance-to-water, water-body identifiers and lake-centre co-ordinates. Geoscience Data Journal, 2(2). pp. 83-97. ISSN 2049-6060 doi:10.1002/gdj3.32


This document has been produced in the context of the Copernicus Climate Change Service (C3S).

The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.

The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which is merely representing the author's view.
