Contributors: W. Dorigo (WD, TU Wien), T. Scanlon (TS, TU Wien), W. Preimesberger (WP, TU Wien), R. Kidd (RK, EODC)

Issued by: EODC/R. Kidd

Date: 30/04/2020

Ref: C3S_312b_Lot4.D2.SM.1_20202504_PQAR_v1.0.docx

Official reference number service contract: 2018/C3S_312b_Lot4_EODC/SC2

History of modification

Version | Date | Description of modification | Editor
1.0 | 09/05/2019 | Updated for v201812 product | TS, WP
2.0 | 30/04/2020 | Updated for v201912 product | WP, TS

Related documents 

Reference ID | Document
D1 | W. Dorigo, T. Scanlon, P. Buttinger, A. Pasik, C. Paulik, R. Kidd, 2020. C3S D312b Lot 4.D3.SM.5 Product User Guide and Specification (PUG): Soil Moisture.
D2 | W. Dorigo, T. Scanlon, W. Preimesberger, P. Buttinger, A. Pasik, R. Kidd, 2020. C3S Product Quality Assurance Document (PQAD): Soil Moisture (v201912).
D3 | R. de Jeu, R. van der Schalie, C. Paulik, W. Dorigo, T. Scanlon, A. Pasik, R. Kidd, C. Reimer, 2020. C3S Algorithm Theoretical Basis Document (ATBD): Soil Moisture (v201912).

Acronyms 

Acronym | Definition
ABS | Scaled Absolute Values
AMI-WS | Active Microwave Instrument - Wind Scatterometer (ERS-1 & 2)
AMSR2 | Advanced Microwave Scanning Radiometer 2
AMSR-E | Advanced Microwave Scanning Radiometer-Earth Observing System
ASCAT | Advanced Scatterometer (Metop)
ATBD | Algorithm Theoretical Basis Document
AWST | Angewandte Wissenschaft Software und Technologie GmbH
C3S | Copernicus Climate Change Service
CCI | Climate Change Initiative
CDF | Cumulative Distribution Function
CDR | Climate Data Record
CDS | Climate Data Store
CF | Climate and Forecast
EC | European Commission
ECV | Essential Climate Variable
ECMWF | European Centre for Medium-Range Weather Forecasts
EODC | Earth Observation Data Centre for Water Resources Monitoring
ERA | ECMWF Reanalysis
ESA | European Space Agency
FK | Fligner-Killeen
GCOS | Global Climate Observing System
GLDAS | Global Land Data Assimilation System
GPI | Grid point index (identifier for unique lon / lat combination)
HSAF | Satellite Application Facility on Support to Operational Hydrology and Water Management
HWSD | Harmonised World Soil Database
ICDR | Intermediate Climate Data Record
ISMN | International Soil Moisture Network
IQR | Interquartile Range
KPI | Key Performance Indicator
KS | Kolmogorov-Smirnov
LSM | Land Surface Model
LOWESS | Locally Weighted Scatterplot Smoothing
LPRM | Land Parameter Retrieval Model
MERRA | Modern Era Retrospective-analysis for Research and Applications
NetCDF | Network Common Data Form
NRT | Near Real Time
PQAD | Product Quality Assurance Document
PQAR | Product Quality Assessment Report
PUG | Product User Guide
QA | Quality Assurance
QA4SM | Quality Assurance for Soil Moisture
RFI | Radio Frequency Interference
SMMR | Scanning Multichannel Microwave Radiometer
SMAP | Soil Moisture Active Passive
SMOS | Soil Moisture and Ocean Salinity
SSM/I | Special Sensor Microwave Imager
TMI | TRMM Microwave Imager
TU Wien | Vienna University of Technology
ubRMSD | unbiased Root Mean Square Difference
UNFCCC | United Nations Framework Convention on Climate Change
VOD | Vegetation Optical Depth
WGS | World Geodetic System
WindSat | WindSat Spaceborne Polarimetric Microwave Radiometer

General definitions 

Accuracy: The closeness of agreement between a measured quantity value and a true quantity value of a measurand (JCGM 2008). The metrics used here to represent accuracy are correlation and unbiased Root Mean Square Difference (ubRMSD). These metrics are commonly used throughout the scientific community as measures of accuracy (Entekhabi et al. 2010).

Bias: Estimate of a systematic measurement error (JCGM 2008).

Error: Measured quantity value minus a reference quantity value (JCGM 2008).

Precision: Closeness of agreement between indications or measured quantity values obtained by replicate measurements on the same or similar objects under specified conditions (JCGM 2008).

Quality Assurance: Part of quality management focused on providing confidence that quality requirements will be fulfilled (BSI 2015).

Stability: Property of a measuring instrument whereby its metrological properties remain constant in time (JCGM 2008). Note that for earth observation activities, "measuring instrument" can be translated as "retrieved variable" and "metrological properties" can be interpreted as "variability of retrieved variable". Alternatively, the Global Climate Observing System (GCOS) (WMO 2016) defines stability as the extent to which the uncertainty of measurement remains constant in time; the GCOS requirements are stated as the maximum acceptable change in systematic error (usually per decade).

Scope of the document

The purpose of this document is to describe the results of the Quality Assurance (QA) for the soil moisture product developed by TU Wien, EODC and VanderSat for the Copernicus Climate Change Service (C3S). The product version assessed in this report is v201912.0.0, which was produced in February 2020.

The C3S soil moisture product suite provides PASSIVE, ACTIVE and COMBINED (passive + active) microwave soil moisture products on a daily, dekadal (10-day) and monthly basis. The data is provided on a regular 0.25 degree grid based on the World Geodetic System 1984 (WGS 84) reference system. The product is available globally from November 1978 to the present day (for PASSIVE and COMBINED) and from 1991 to the present day (for ACTIVE). For details about the products, we refer to the Product User Guide (PUG) (Dorigo et al. 2020a).

This document presents the results of QA activities that have been undertaken for the current Climate Data Record (CDR) dataset (v201912.0.0). The Intermediate Climate Data Record (ICDR) datasets are not currently assessed. However, note that, to achieve maximum consistency between CDR and ICDR, both products use the same Level 2 products (based on Near Real Time (NRT) data streams) and merging algorithms and thus have very similar quality characteristics.

A brief summary of the methodology used, which is described fully in the Product Quality Assurance Document (PQAD) (Dorigo et al. 2020b), is provided. The results described here are primarily for the COMBINED daily product; however, the ACTIVE and PASSIVE products were also assessed and the main findings of these assessments are presented in the report.

Executive summary

The purpose of the Product Quality Assessment Report (PQAR) is to describe the product QA results for the soil moisture product developed by TU Wien, EODC and VanderSat for the C3S service. The production of the product has been funded by C3S, a service which is managed by the European Centre for Medium-Range Weather Forecasts (ECMWF) on behalf of the European Commission (EC). The product version assessed in this report is v201912.0.0, which was produced in February 2020.

The document presents the results of the quality assessments undertaken for the product, including the accuracy and stability assessments. This document is applicable to the QA activities performed on CDR version v201912.0.0. Currently, the assessment does not cover ICDRs; however, due to the high consistency between CDR and ICDR (both products use the same Level 2 products, based on NRT data streams, and merging algorithms), the quality assessment of the last years of the CDR can be readily transferred to the ICDR.

The QA results broadly include the following parts: accuracy assessment, stability assessment, demonstration of uncertainty estimates, comparison to previous versions of the product, and a completeness / consistency assessment. The first two sections focus on demonstrating that the Key Performance Indicators (KPIs) set for the product are met. Note these KPIs take into account GCOS and user requirements for the product. In addition, in this version of the document, there is also a detailed assessment of the product against the previous version. 

Accuracy Assessment: The correlation between the C3S and in-situ datasets varies with the conditions and the locations of the in-situ stations used, ranging from about 0.4 to above 0.8. The ubRMSD, which can be directly taken as a measure of accuracy, is demonstrated to be below 0.10 m³ / m³ for all of the different conditions analysed. Therefore, the KPIs for accuracy have been met for in-situ observations. The global comparison against the Global Land Data Assimilation System (GLDAS) Noah v2.1 has shown expected results, with the KPI threshold of 0.1 m³ / m³ met in most regions (the exception being areas with high topographic complexity). However, for the ECMWF Reanalysis (ERA) 5 and ERA5-Land comparison there are some unexpected spatial patterns and in some (northern) areas the KPI threshold target is not met.

Stability Assessment: The stability of the C3S product has been assessed in terms of the change in accuracy over time (when compared to ISMN measurements). The accuracy (ubRMSD) has been calculated per year, and trends in the median yearly accuracy have been derived. The KPI threshold for stability of 0.05 m³ / m³ / y is met for all tested locations when assessed using this method.

No algorithmic changes were introduced between v201812 and v201912 of C3S SM. The results for the two versions are therefore expected to be similar, and any differences found are due to issues in the processing chain, potential changes in input data streams or the temporal extension of the CDR.

A comparison to previous products has been provided. The assessment demonstrates that the correlation between the in-situ and satellite-derived products is mostly unchanged. However, some differences are found in the data coverage of the COMBINED product, which also affect the validation metrics.

The spatial and temporal coverage of the product has been presented in terms of the number of valid (unflagged) observations available. It is shown that the coverage is better in Europe, Southern Africa and the contiguous United States (US) than in some other parts of the world. In addition, an increase in data coverage in the COMBINED product compared to v201812 is found in some areas, which indicates some minor data loss in the production of v201812 or a merging issue in the current version. 

Further, detailed assessment of the product has been undertaken, in particular for the ACTIVE and PASSIVE products as well as a detailed comparison against the previous product version. These assessments revealed that there are the following potential issues with the dataset:

  • Data increase in COMBINED product: More new observations than expected from the temporal extension of the data set are found in the COMBINED product when compared to v201812. This could indicate either undetected data loss in the previous version, or a potential (flagging) issue in the current version. Even though the first case seems more likely, it should be investigated further.
  • Missing uncertainty values in COMBINED and ACTIVE: The current version does not contain values in the "sm_uncertainty" variable before the year 2003 in the COMBINED and ACTIVE product. The missing values should be added to the data set before publishing.
  • PASSIVE time series drop: As already discovered for the previous version, there seems to be a significant drop in soil moisture values after 2011 for several land cover types. This is due to the scaling performed in the PASSIVE product, for which no overlap between AMSR-E (the PASSIVE scaling reference) and AMSR2 is available. This issue has been addressed in later versions of ESA CCI SM (v5) by using data from the last 3 years of AMSR-E and the first 3 years of AMSR2 as input for scaling, which corrects the break. Potential issues with this approach could arise from extreme events in either 3-year subset or from trends during that time, which might be lost during the scaling process.
  • ACTIVE wetting trends: Artificial wetting trends in HSAF ASCAT SSM also affect the ACTIVE product of C3S SM after 2007 (and to some extent the COMBINED product). These trends are most likely caused by radio frequency interference (RFI) and appear especially in densely populated areas such as Europe, East Asia and parts of the United States. The trends, which are already present in the ASCAT observed backscatter time series, are currently corrected in an experimental version of ASCAT SSM. Once an official version of HSAF ASCAT SSM including the backscatter correction is released and used within the European Space Agency (ESA) Climate Change Initiative (CCI) SM dataset (currently expected for v6), subsequent C3S versions will also include the backscatter trend corrected active SSM measurements.
  • ERA5 assessment: The assessment against the ERA5 dataset shows, similar to the results found for v201812, ubRMSD values higher than the KPI threshold in some areas. Compared to the PQAR for v201812, the flagging of frozen soil conditions during validation has been improved, which slightly reduced the affected areas. The issue remains, especially in some northern areas.

1. Product validation methodology

1.1. Validated products

C3S soil moisture provides passive (named PASSIVE), active (ACTIVE) and passive + active (COMBINED) soil moisture products on a daily, dekadal (10-day) and monthly basis. The time periods over which each sensor is used are provided in Table 1. The data is provided on a regular 0.25 degree grid based on the WGS 84 reference system. The product is available globally from November 1978 to the present day (for PASSIVE and COMBINED) and from 1991 to the present day (for ACTIVE). The product has been produced by TU Wien, VanderSat and EODC. The ASCAT SSM product mentioned in some parts of this report is developed and provided by H-SAF.

Table 1: Blending periods for the soil moisture product (ACTIVE, PASSIVE and COMBINED)

Sensor | Time Period

ACTIVE PRODUCT
AMI-WS | 1991-08-05 to 2006-12-31
ASCAT-A | 2007-01-01 to 2012-11-05
ASCAT-A & ASCAT-B | 2012-11-06 to 2019-12-31

PASSIVE PRODUCT
SMMR | 1978-11-01 to 1987-07-08
SSM/I | 1987-07-09 to 1997-12-31
[SSM/I, TMI, SSM/I]* | 1998-01-01 to 2002-06-18
AMSR-E | 2002-06-19 to 2007-09-30
AMSR-E & WindSat | 2007-10-01 to 2010-01-14
AMSR-E & WindSat & SMOS | 2010-01-15 to 2011-10-04
WindSat & SMOS | 2011-10-05 to 2012-06-30
SMOS & AMSR2 | 2012-07-01 to 2019-12-31

COMBINED PRODUCT
SMMR | 1978-11-01 to 1987-07-08
SSM/I | 1987-07-09 to 1991-08-04
AMI-WS & SSM/I | 1991-08-05 to 1997-12-31
AMI-WS & [SSM/I, TMI, SSM/I]* | 1998-01-01 to 2002-06-18
AMI-WS & AMSR-E | 2002-06-19 to 2006-12-31
ASCAT-A & AMSR-E | 2007-01-01 to 2007-09-30
ASCAT-A & AMSR-E & WindSat | 2007-10-01 to 2010-01-14
ASCAT-A & AMSR-E & WindSat & SMOS | 2010-01-15 to 2011-10-04
ASCAT-A & WindSat & SMOS | 2011-10-05 to 2012-06-30
ASCAT-A & ASCAT-B & SMOS & AMSR2 | 2012-07-01 to 2019-12-31

*The [SSM/I, TMI, SSM/I] period is latitudinally divided into [90S, 37S] and [90N, 37N] for SSM/I, and the region in between for TMI.
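As an illustration of how Table 1 can be read, the following minimal sketch looks up the sensors blended into the COMBINED product for a given date and latitude. The periods are transcribed from the COMBINED rows of Table 1; the names `COMBINED_PERIODS`, `SSMI_OR_TMI` and `combined_sensors` are hypothetical and are not part of the C3S processing software. The latitude rule follows the table footnote, assuming that exactly 37 degrees belongs to the SSM/I bands.

```python
from datetime import date

# Placeholder for the latitude-dependent [SSM/I, TMI, SSM/I] period (hypothetical name).
SSMI_OR_TMI = "SSM/I or TMI"

# (start, end, sensors), transcribed from the COMBINED rows of Table 1.
COMBINED_PERIODS = [
    (date(1978, 11, 1), date(1987, 7, 8), ("SMMR",)),
    (date(1987, 7, 9), date(1991, 8, 4), ("SSM/I",)),
    (date(1991, 8, 5), date(1997, 12, 31), ("AMI-WS", "SSM/I")),
    (date(1998, 1, 1), date(2002, 6, 18), ("AMI-WS", SSMI_OR_TMI)),
    (date(2002, 6, 19), date(2006, 12, 31), ("AMI-WS", "AMSR-E")),
    (date(2007, 1, 1), date(2007, 9, 30), ("ASCAT-A", "AMSR-E")),
    (date(2007, 10, 1), date(2010, 1, 14), ("ASCAT-A", "AMSR-E", "WindSat")),
    (date(2010, 1, 15), date(2011, 10, 4), ("ASCAT-A", "AMSR-E", "WindSat", "SMOS")),
    (date(2011, 10, 5), date(2012, 6, 30), ("ASCAT-A", "WindSat", "SMOS")),
    (date(2012, 7, 1), date(2019, 12, 31), ("ASCAT-A", "ASCAT-B", "SMOS", "AMSR2")),
]


def combined_sensors(day, lat):
    """Return the sensors used in the COMBINED product on `day` at latitude `lat`.

    An empty tuple is returned for dates outside the CDR period.
    """
    for start, end, sensors in COMBINED_PERIODS:
        if start <= day <= end:
            # Footnote of Table 1: TMI between 37S and 37N, SSM/I elsewhere.
            passive = "TMI" if abs(lat) < 37.0 else "SSM/I"
            return tuple(passive if s == SSMI_OR_TMI else s for s in sensors)
    return ()
```

For example, `combined_sensors(date(2000, 1, 1), 10.0)` resolves the latitude-dependent period to AMI-WS and TMI, whereas the same date at 50 degrees north resolves to AMI-WS and SSM/I.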

The C3S soil moisture product is generated from a set of passive microwave radiometers and active microwave scatterometers. Radiometers include the Scanning Multichannel Microwave Radiometer (SMMR), Special Sensor Microwave Imager (SSM/I), TRMM Microwave Imager (TMI), Advanced Microwave Scanning Radiometer-Earth Observing System (AMSR-E), WindSat Spaceborne Polarimetric Microwave Radiometer (WindSat), AMSR2 and Soil Moisture and Ocean Salinity (SMOS). Scatterometer observations are collected by the Active Microwave Instrument - Wind Scatterometer (AMI-WS) and ASCAT (Metop-A and Metop-B) sensors. The “ACTIVE product” and the “PASSIVE product” are created by fusing scatterometer and radiometer Level 2 soil moisture products, respectively; the “COMBINED product” is created by fusing Level 2 soil moisture products from both sensor types. Data files are provided in NetCDF-4 classic format as daily, dekadal and monthly images and comply with CF-1.8 conventions¹.

A detailed description of the product generation of C3S v201912.0.0 is provided in the Algorithm Theoretical Basis Document (ATBD) (de Jeu et al. 2020) with further information on the product given in the PUG (Dorigo et al. 2020a). The underlying algorithm is based on that used in the generation of the publicly released ESA CCI version 4.5 product, which is described in Plummer et al. (2017), Wagner et al. (2012), Liu et al. (2012), and Dorigo et al. (2017a). In addition, detailed provenance traceability information can be found in the metadata of the product.

The C3S SM product comprises a long-term Climate Data Record (CDR) which runs from 1978 (PASSIVE and COMBINED) or 1991 (ACTIVE) to December 2019. This CDR is updated every dekad (approximately every 10 days) in an appended dataset called an Intermediate Climate Data Record (ICDR). The theoretical algorithm and the processing implemented in the CDRs and ICDRs are exactly the same and the data provided is consistent between them.
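The dekadal time step mentioned above can be sketched as follows. This is a minimal illustration that assumes dekads start on the 1st, 11th and 21st of each month (so the third dekad runs to the end of the month); this convention and the function names `dekad_start` and `dekadal_means` are assumptions for illustration and are not taken from the C3S processing chain.

```python
from datetime import date


def dekad_start(day):
    """Return the first day of the dekad containing `day`.

    Assumed convention: dekads start on the 1st, 11th and 21st of a month.
    """
    if day.day <= 10:
        start_day = 1
    elif day.day <= 20:
        start_day = 11
    else:
        start_day = 21
    return day.replace(day=start_day)


def dekadal_means(daily):
    """Average daily values per dekad.

    `daily` maps a date to a soil moisture value, or to None where no valid
    observation is available; missing days are skipped.
    """
    sums = {}
    for day, value in daily.items():
        if value is None:
            continue
        key = dekad_start(day)
        total, count = sums.get(key, (0.0, 0))
        sums[key] = (total + value, count + 1)
    return {key: total / count for key, (total, count) in sorted(sums.items())}
```

Under this assumption, the last dekad of December 2019 starts on 2019-12-21 and covers 11 days.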

This current document is applicable to the QA activities performed on the version of the CDR v201912.0.0 (produced in February 2020). The COMBINED daily product is the focus of the validation activities presented in this report. However, the ACTIVE and PASSIVE daily products have also been validated and a summary of the outcomes is provided in Annex A. Annex B contains validation results for the dekadal and monthly products.

1 CF conventions: www.cfconventions.org

1.2. Description of validating datasets

A combination of in-situ and global reference datasets is used to assess the quality of the C3S soil moisture product. A list of the datasets used is provided in Table 2, with further details provided in the PQAD (Dorigo et al. 2020b).

Table 2: Datasets utilised in the assessment of the data product

Dataset Name | Description
International Soil Moisture Network (ISMN)² | A centrally hosted database where globally available in-situ soil moisture measurements from operational networks and validation campaigns are collected, harmonised, and made available to users (Dorigo et al. 2011). The data available within the ISMN is subject to quality controls (detailed in Dorigo et al. 2013) and provided with quality flags. The quality controls include an assessment against a possible range of important meteorological variables and are applied equally to all datasets.
ERA5 | The ERA5 dataset³ produced by ECMWF is available from 1979 to within 3 months of real time. The data provided includes surface soil moisture at up to hourly intervals at a resolution of 30 km.
ERA5-Land | ERA5-Land provides land variables with an increased spatial resolution compared to ERA5. Soil moisture in ERA5-Land is available at 6-hourly intervals at a resolution of ~9 km.
Modern Era Retrospective-analysis for Research and Applications 2 (MERRA-2)⁴ | MERRA-2 (Gelaro et al. 2017) is a replacement for MERRA (Rienecker et al. 2011). It provides data starting in 1980 with a spatial resolution of about 50 km. The MERRA-2 dataset differs from the original MERRA dataset as it incorporates advances made in the assimilation system enabling the assimilation of hyperspectral radiances and microwave observations.
ESA CCI Soil Moisture | The CCI project was initiated in 2009 by ESA in response to the United Nations Framework Convention on Climate Change (UNFCCC) and GCOS needs for Essential Climate Variable (ECV) databases (Plummer et al. 2017). In 2012, ESA released the first multi-decadal, global satellite-observed soil moisture dataset, named ESA CCI SM, combining various single-sensor active and passive microwave soil moisture products (Dorigo et al. 2017a). The C3S product is based (scientifically, algorithmically and programmatically) on version v4.5 of the ESA CCI SM product.

1.3. Description of product validation methodology

1.3.1. Method Overview

The methodology used in the assessment of the soil moisture product is described in Dorigo et al. (2017b). The methodology, including details of the validated products and the validating datasets, is described briefly here in the relevant sections. A summary of the validation results for the previous dataset version (v201812) may be found in Dorigo et al. (2019).

The quality assessment includes the following:

  1. Assessment against in-situ observations from the ISMN (Section 2.1),
  2. Assessment against Land Surface Models (LSMs) including GLDAS, ERA5 and ERA5-Land (Section 2.2),
  3. Stability analysis through monitoring of accuracy trends and dataset statistics (Section 2.3),
  4. Stability analysis through breakpoint detection (Section 2.4),
  5. Assessment of the spatial and temporal completeness and consistency of the product (Section 2.5),
  6. Analysis of time series at selected locations (Section 2.6), and,
  7. Analysis of the uncertainty information provided with the dataset (Section 2.7).


Additional information is also provided in the Annexes of the report:

  1. Annex A: The main body of the report considers mainly the COMBINED CDR, therefore, a separate Annex provides information on the validation results of the ACTIVE and PASSIVE products; summarising the key findings from these activities.
  2. Annex B: The main body of the report provides some comparison between the current C3S version (v201912) and the previous version (v201812). Annex B provides a more complete analysis to demonstrate that the data has been made correctly and show any differences between the products.

1.3.2. Quality Assurance for Soil Moisture

The analysis presented in this report has been undertaken using software specifically designed for the C3S soil moisture project. However, in parallel, an online validation service - Quality Assurance for Soil Moisture (QA4SM) - has been developed which undertakes similar tasks.

The QA4SM service is an online validation service, which allows the traceable validation of satellite-derived soil moisture products. Currently supported datasets include C3S SM, SMAP L3 36 km SM, the H-SAF ASCAT SSM, the ESA CCI SM product and the SMOS IC product. The comparisons can be carried out against ISMN, GLDAS, ERA5 and ERA5-Land reference data. The service provides different options for the filtering of datasets and for scaling the datasets to each other. CDR v201912 of C3S SM will be added to the service when officially published.

The QA4SM service is continuously under development and does not currently have all of the features required to undertake a full validation of the C3S data as required for the validation activities presented in the current document. As the QA4SM service develops, the analysis presented in this report will be updated with results from this system at the generation of new product versions.

1.3.3. Planned enhancements to the method

This section briefly details enhancements to the methods envisaged for the future. They will potentially be implemented at the generation of new product versions.

Committed area mask
In future versions of this assessment, it is proposed that a committed area mask be used, akin to that used in the validation of the H-SAF soil moisture project (see Figure 1). Such a mask helps to better understand the performance of the product globally. The H-SAF mask is based on several criteria and thresholds, each of which will be considered in the generation of a C3S-specific committed area mask.

Figure 1: Committed area used by the H-SAF soil moisture project. The mask is based on a combination of vegetation optical depth thresholds, snow / ice masking and land cover types.

Consistent masking of the validation datasets
Currently, the masking of the datasets used in the validation depends on the flags available within the separate data products. For example, the GLDAS data is masked using its snow cover information, so that only snow-free observations are used. However, there is no consistency between the different validations in which locations / timestamps are masked. The introduction of consistent masking would help improve the inter-comparison of the validation results using data from different sources.

Assessment of the discontinuities in ERA5
It is known that there are discontinuities in the ERA5 soil moisture variables due to the splitting of the processing into chunks. The assessment presented here suggests that these discontinuities do not affect the uppermost soil layer and that the assessment of the C3S product against this data is therefore appropriate. However, this assertion is based solely on a visual interpretation of the ERA5 time series, and further analysis should be carried out to determine whether this is the case.

2. Validation results

2.1. Accuracy – Comparison against ISMN

2.1.1. Introduction

The C3S dataset has been compared to the ISMN dataset and the correlations and ubRMSD between the datasets calculated.
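For reference, the two accuracy metrics can be sketched as follows. This is a minimal illustration based on the standard definitions of Pearson's correlation and of the ubRMSD (the root mean square difference after removing the mean of each series); it is not the validation software used for this report, and the function names `pearson_r` and `ubrmsd` are hypothetical. Both functions expect two equally long sequences of already matched observations.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two matched series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def ubrmsd(x, y):
    """Unbiased root mean square difference between two matched series.

    The mean of each series is removed first, so a constant offset (bias)
    between the series does not contribute to the result.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    squared = sum(((a - mean_x) - (b - mean_y)) ** 2 for a, b in zip(x, y))
    return math.sqrt(squared / n)
```

Because the means are removed, two series that differ only by a constant offset have an ubRMSD of zero and a correlation of one.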

In addition to an overall comparison processed using the QA4SM service (Section 2.1.3), the comparison is also undertaken for different attributes of the soil moisture data (provided as metadata within the ISMN dataset). These are: sensor depth (Section 2.1.4), soil texture (Section 2.1.5), Köppen-Geiger climate classes (Section 2.1.6) and land cover (Section 2.1.7). For further information on the origin of these attributes, see Dorigo et al. (2011). 

The aim of assessing the accuracy for different attributes is to demonstrate the performance of the product under different conditions and to demonstrate that KPIs DX.1 (where X is equal to 1 – 6) are met. See the PQAD (Dorigo et al. 2020b), Table 2, for the list of KPIs, which is repeated in this document in Table 6.

2.1.2. ISMN data and comparison settings

The ISMN data used in the assessments presented in this document was downloaded from the ISMN data portal⁵ on 2019-12-11 (hence this is referred to as v20191211). The full list of the networks used in the assessment can be found in the PQAD (Dorigo et al. 2020b). The dataset consists of up to 571 stations within 25 networks as shown in Figure 2.

 

Figure 2: Potential ISMN networks and stations used for validation. 571 stations in up to 25 networks are considered in the validation process (0-5 cm depth). Note that the station selection for validation may vary depending on the availability of C3S SM, the validation time period and the measurement time of each ISMN station.

Where there is more than one station available within a grid cell, a simple average of the station data is taken prior to comparison to the C3S time series for all assessments except the overall comparison. In the overall comparison (processed through the QA4SM service) no averaging of the ISMN time series is undertaken. Instead, each ISMN station is compared against its nearest-neighbour C3S GPI; hence, in these results some GPIs are over-represented.
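The averaging of stations within a grid cell can be sketched as follows. This is a simplified illustration: it assumes a regular 0.25 degree grid whose cells are aligned to whole multiples of 0.25 degrees, identifies a cell by a hypothetical (row, column) pair rather than by the official C3S GPI numbering, and ignores the land mask. The function names `grid_cell` and `average_stations_per_cell` are likewise hypothetical.

```python
import math
from collections import defaultdict

CELL_SIZE = 0.25  # grid spacing in degrees
N_ROWS, N_COLS = 720, 1440  # number of cells in latitude and longitude


def grid_cell(lon, lat):
    """Return the (row, col) of the 0.25 degree cell containing a point.

    Hypothetical index for illustration, not the official C3S GPI.
    """
    col = min(math.floor((lon + 180.0) / CELL_SIZE), N_COLS - 1)
    row = min(math.floor((lat + 90.0) / CELL_SIZE), N_ROWS - 1)
    return row, col


def average_stations_per_cell(stations):
    """Average station time series that fall into the same grid cell.

    `stations` is a list of (lon, lat, series) tuples, where `series` maps a
    timestamp to a soil moisture value. For each cell and timestamp, the
    unweighted mean of all available station values is returned.
    """
    collected = defaultdict(lambda: defaultdict(list))
    for lon, lat, series in stations:
        cell = grid_cell(lon, lat)
        for timestamp, value in series.items():
            collected[cell][timestamp].append(value)
    return {
        cell: {ts: sum(vals) / len(vals) for ts, vals in by_ts.items()}
        for cell, by_ts in collected.items()
    }
```

With this approach each grid cell contributes a single reference time series, which avoids the over-representation of densely instrumented cells noted above for the overall comparison.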

To calculate the metrics for each assessment, the settings summarised in Table 3 are used. In summary, the datasets are spatially and temporally matched, filtered for high quality observations and the C3S data is scaled to the ISMN data using mean – standard deviation scaling. The metrics are then calculated using observations in the period 1978-11-01 to 2019-12-31.  
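Mean – standard deviation scaling can be sketched as follows: the source series is standardised with its own mean and standard deviation and then rescaled with those of the reference. This is a minimal illustration of the general technique, not the project code; the function name `mean_std_scale` is hypothetical, and the use of the population standard deviation is an assumption.

```python
import statistics


def mean_std_scale(src, ref):
    """Scale `src` so that its mean and standard deviation match `ref`."""
    mean_src, mean_ref = statistics.fmean(src), statistics.fmean(ref)
    std_src, std_ref = statistics.pstdev(src), statistics.pstdev(ref)
    if std_src == 0:
        raise ValueError("source series has zero variance and cannot be scaled")
    return [(value - mean_src) / std_src * std_ref + mean_ref for value in src]
```

Because the transformation is linear, it does not change the correlation between the two series; it only removes differences in mean and dynamic range before the ubRMSD is calculated.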

Table 3: Settings used in the assessment of the C3S soil moisture product against the ISMN

Setting | Details
Temporal Matching | A temporal window of 12 hours is used to find matches between the C3S and ISMN datasets, i.e. ISMN observations taken between 18:00 UTC and 06:00 UTC are matched to the midnight (00:00 UTC) timestamp of each C3S daily image.
Spatial Matching | The nearest land grid point index of the C3S data grid is found using the lon/lat of the ISMN station metadata. Where a single C3S GPI is associated with more than one ISMN station, an unweighted average of the ISMN data is taken (i.e. average for each timestamp) prior to calculation of the metrics.
Scaling | The C3S data is scaled to the reference data (ISMN) using mean – standard deviation scaling.
Filters | The ISMN data has been filtered on the "soil moisture_flag" column such that only observations marked "G" are utilised⁶ (Dorigo et al. 2013). The depth of the ISMN sensors used is usually 0 – 5 cm (with the exception of the depth analysis presented in Section 2.1.4). The C3S data has been filtered on the "flag" column such that only observations flagged with "0" are utilised. These are observations which are considered good, i.e. no other processing flags have been raised on these observations.
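The filter and temporal matching settings of Table 3 can be sketched as follows. This is a simplified illustration: the flag values "G" and "0" and the 12-hour window are taken from the table, whereas the choice of the single nearest ISMN observation within the window, and the names `keep_flagged` and `match_to_daily`, are assumptions made for this example.

```python
from datetime import datetime, timedelta


def keep_flagged(observations, good_flag):
    """Keep (timestamp, value) pairs whose quality flag equals `good_flag`.

    `observations` is a list of (timestamp, value, flag) tuples.
    """
    return [(t, v) for t, v, flag in observations if flag == good_flag]


def match_to_daily(c3s_timestamp, ismn_obs, half_window=timedelta(hours=6)):
    """Return the ISMN observation closest to a C3S daily timestamp.

    Only observations within +/- `half_window` of the C3S timestamp (a
    12-hour window in total) are considered; None is returned if there is
    no such observation.
    """
    candidates = [
        (abs(t - c3s_timestamp), t, v)
        for t, v in ismn_obs
        if abs(t - c3s_timestamp) <= half_window
    ]
    if not candidates:
        return None
    _, timestamp, value = min(candidates)
    return timestamp, value
```

In use, the ISMN series would first be reduced with `keep_flagged(obs, "G")` and the C3S series with `keep_flagged(obs, "0")`, after which each C3S timestamp (midnight UTC) is matched against the remaining ISMN observations.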

5 ISMN data portal: http://www.geo.tuwien.ac.at/insitu/data_viewer/ISMN.php

6 More information on the ISMN quality flags can be found here: https://ismn.geo.tuwien.ac.at/en/data-access/quality-flags/

2.1.3. Overall Comparison

The comparison of C3S v201912 has been processed using the QA4SM method against ISMN v20191211. The global map (Figure 3) shows the ubRMSD for each ISMN station to the nearest C3S grid cell; correlation (Pearson's) is shown in Figure 4. These figures show the expected spatial patterns, with high correlations and low ubRMSD seen at most ISMN locations. 

Figure 3: ubRMSD between C3S v201912 COMBINED and ISMN v20191211 for soil depths of 0 – 5 cm (left) and comparison with C3S v201812 COMBINED (right).

Figure 4: Correlation (Pearson's) between C3S v201912 and ISMN v20191211 for soil depths of 0 – 5 cm (left) and comparison with C3S v201812 COMBINED (right).

2.1.4. Soil depth

The correlation and ubRMSD between the C3S dataset and the in-situ datasets are presented in Figure 5 for two different sensor depths (0-5 cm and 5-10 cm).

The deeper sensors have a lower correlation and higher ubRMSD than the shallower sensors. This is expected, as the product represents approximately the first few centimetres of the soil. However, it may also be attributed to the decreasing errors associated with the ISMN data at these greater depths, noting that the relation seems to break at depths > 1.00 m (Gruber et al. 2013).

 

Figure 5: Correlation (left) and ubRMSD (right) between C3S v201912 and ISMN for soil depths of 0 – 5 cm and 5 – 10 cm. The boxplots show the median and interquartile range.

2.1.5. Soil texture

The correlation and ubRMSD between the C3S dataset and the in-situ datasets are presented in Figure 7 for the different soil textures (fine, medium and coarse). The stratification is provided by the ISMN dataset (Dorigo et al. 2011) and shown in Figure 6.

The product appears to perform best for medium texture soils in terms of correlation but worse in terms of ubRMSD for the upper layer (0-5 cm). The coarse texture soils have very few observations available, therefore the results for this soil texture are not considered reliable.

Figure 6: Soil texture classification used for the ISMN comparison. The percentage clay, silt and sand are taken from the ISMN metadata which in turn is retrieved from the HWSD.


Figure 7: Pearson's R (left) and ubRMSD (right) between C3S v201912 and ISMN for different soil texture classes at 0-5 cm (top) and 5-10 cm (bottom) depth. The boxplots show the median and interquartile range.

2.1.6. Köppen-Geiger climate classes

The correlation and ubRMSD between the C3S dataset and the in-situ datasets are presented in Figure 8 for different Köppen-Geiger classes (BSx⁷, Csx / Dsx⁸ and Cfx / Dfx⁹), a global map of which is shown in Figure 9. The stratification is provided by the ISMN metadata (Dorigo et al. 2011).

The correlation between the in-situ measurements and the C3S product is similar across all Köppen-Geiger classes, with Csx / Dsx performing the best in terms of correlation and second best in terms of ubRMSD. The ubRMSD can be taken as a measure of accuracy. The graphs show that the mean value for each of the climate classes is under the 0.1 m³ / m³ threshold set out in the KPIs, so the KPIs are achieved.
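A per-class check of this kind can be sketched as follows. The threshold of 0.1 m³ / m³ is taken from the KPIs; comparing the class median against the threshold (as shown in the boxplots), and the names `KPI_UBRMSD_THRESHOLD` and `summarise_by_class`, are assumptions for illustration. Any values passed to the function in an example are placeholders and not results of this assessment.

```python
import statistics

KPI_UBRMSD_THRESHOLD = 0.1  # m³ / m³, accuracy threshold stated in the KPIs


def summarise_by_class(results, threshold=KPI_UBRMSD_THRESHOLD):
    """Summarise ubRMSD values per class and compare them to the threshold.

    `results` is a list of (class_name, ubrmsd) pairs, one per validated
    location. For each class, the median, the first and third quartiles and
    a flag stating whether the median is below the threshold are returned.
    """
    by_class = {}
    for name, value in results:
        by_class.setdefault(name, []).append(value)

    summary = {}
    for name, values in by_class.items():
        median = statistics.median(values)
        if len(values) >= 2:
            q1, _, q3 = statistics.quantiles(values, n=4)
        else:
            # Quartiles are undefined for a single value.
            q1 = q3 = median
        summary[name] = {
            "median": median,
            "q1": q1,
            "q3": q3,
            "meets_kpi": median < threshold,
        }
    return summary
```

The same function could be applied to other stratifications (soil texture, land cover), since it only depends on the class label attached to each location.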


Figure 8: Correlation (Pearson's, left) and ubRMSD (right) between C3S v201912 and ISMN for different climate classes at 0-5 cm (top) and 5-10 cm (bottom) depth. The boxplots show the median and interquartile range.

Figure 9: Köppen-Geiger classes. The classes used in this assessment are BSx, Csx / Dsx and Cfx / Dfx. The Figure is taken from http://hanschen.org/koppen.

7 BSx (Arid–Steppe)

8 Csx / Dsx (Temperate-Dry Summer/Cold-Dry Summer)

9 Cfx / Dfx (Temperate-Without Dry Season/Cold Without Dry Season)

2.1.7. Landcover classes

The correlation and ubRMSD between the C3S dataset and the in-situ datasets are presented in Figure 10 for five different land cover classes (cropland, grassland, tree cover, urban areas, other); stratification provided from the ISMN dataset (Dorigo et al. 2011).

The ubRMSD can be taken as a measure of accuracy, and the KPIs specifically state that the acceptable accuracy level of the product is between 0.01 and 0.1 m3 / m3 for different land cover types. The worst-case ubRMSD shown here, for tree cover, is under 0.1 m3 / m3 (median and IQR). Therefore, overall, the KPIs are met.


Figure 10: Correlation (Pearson's R) and ubRMSD between C3S v201912 COMBINED and ISMN for different land cover classes at 0-5 cm and 5-10 cm depth. The boxplots show the median and interquartile range.
 

2.1.8. Comparison to previous versions

A comparison of the correlation between satellite SM and in-situ observations is provided in Figure 11 for different versions of the CCI product (including v04 upon which the C3S product v201912 is based). It can be seen that on average the correlation between the in-situ and satellite derived soil moisture datasets improves with the development of the product.

In the comparison, the correlations are shown for different periods. Figure 11 shows that the correlation between the products and the ISMN data is better for the later periods. This is likely due to the increased number of observations available as input to the CCI products within these periods.

Further information on the assessment of previous CCI versions against reference data is available. Dorigo and Gruber (2015) provide an extensive evaluation of v0.1, employing all usable observations from the ISMN (Dorigo et al. 2017a). Another assessment (Albergel et al. 2013) also considered v0.1, but in relation to the ERA-Interim / Land dataset. Version 2.2 has also been subject to validation against in-situ observations (Fang et al. 2016).

In general, the products were deemed to agree well with in-situ observations, but lag behind the performance obtained for LSM simulations that integrate observed precipitation, such as ERA5-Land. This may be due to the discrepancy between the installation depth of the in-situ probes (typically 5 cm) and the typical depth of ~2 cm represented by the satellite-derived datasets (Dorigo et al. 2017a).

 

Figure 11: Correlation between the CCI product (different versions) and ISMN data. Includes v04 of ESA CCI SM upon which C3S v201912 is based. The results are shown for all seasons, for different three-year periods between 1997 and 2016 (first five panels) and in the final panel for the entire time period of each product.

To demonstrate the differences between the previous C3S version (v201812) and the current dataset (v201912), a summary of the comparisons of the dataset to the ISMN data is shown in Table 4. 

The summary of the results shows that the overall mean metrics from the comparison to ISMN (not split by land cover or climate classes) are very similar for the two dataset versions, both for the latest merging period and for the complete data period.

Table 4: Results of comparison against ISMN (0-5 cm) for different C3S dataset versions.

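| Metric | Time period | ACTIVE v201812 | ACTIVE v201912 | PASSIVE v201812 | PASSIVE v201912 | COMBINED v201812 | COMBINED v201912 |
|---|---|---|---|---|---|---|---|
| Correlation (Pearson's) | Latest merging period (2) | 0.478 | 0.475 | 0.544 | 0.54 | 0.562 | 0.56 |
| Correlation (Pearson's) | Complete period (3) | 0.478 | 0.475 | 0.532 | 0.529 | 0.552 | 0.545 |
| ubRMSD | Latest merging period (2) | 0.091 | 0.091 | 0.059 | 0.060 | 0.088 | 0.086 |
| ubRMSD | Complete period (3) | 0.091 | 0.091 | 0.061 | 0.062 | 0.086 | 0.094 |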

(1) For all comparisons, v20191211 of the ISMN dataset is used (see Section 2.1.2 for further details).
(2) For ACTIVE this is the ASCAT period (from 2007-01-01 onward); for PASSIVE this is the SMOS / AMSR2 period (from 2012-07-01 onward); for COMBINED this is the ASCAT / SMOS / AMSR2 period (from 2012-07-01 onward).
 (3) For ACTIVE this is from 1991-01-01 onward and for PASSIVE and COMBINED after 1978-11-01.

Figure 12: Correlation (Pearson's) for C3S v201812 (left) and C3S v201912 (right) with ISMN (0 – 5 cm sensor depth) for the COMBINED product split by land cover classes. Comparison is for the time period after 2012-07-01 (ASCAT, SMOS and AMSR2 period). The boxplots show the median value and interquartile range.

Figure 13: ubRMSD for C3S v201812 (left) and C3S v201912 (right) with ISMN (0 – 5 cm sensor depth) for the COMBINED product split by land cover classes. Comparison is for the time period after 2012-07-01 (ASCAT, SMOS and AMSR2 period). The boxplots show the median value and interquartile range.

2.2. Accuracy – Comparison against Land Surface Models

2.2.1. GLDAS v2.1

C3S v201912 has been compared against GLDAS v2.1 (from 2000-01-01 to 2019-12-31, the end of the C3S v201912 product). The spatial distribution of the correlation (Pearson's) and the ubRMSD are shown in Figure 14 and Figure 15 respectively. This comparison has been generated using the QA4SM service. Note that the GLDAS data is masked in this case for snow-covered conditions. Figure 14 shows that the correlation between GLDAS v2.1 and the COMBINED product follows expected spatial patterns. There is a high positive correlation between the products in most temperate zones, with low and near-negative correlations being most prevalent in the northern, boreal regions. The deserts show little to no positive correlation, or even negative correlation.

Figure 14: Correlation (Pearson's) of the C3S v201912 COMBINED product with GLDAS v2.1 (covering the time period after 2000-01-01) and comparison with the previous C3S version (right).

Figure 15 shows that the ubRMSD between GLDAS v2.1 and the COMBINED product is low in most regions, with a significant portion being below the threshold required in the KPIs (see Section 4). Again, there appears to be lower agreement between the products in the boreal regions, as is also apparent in the correlation map.

 Figure 15: ubRMSD of the C3S v201912 COMBINED product with GLDAS v2.1 (covering the time period after 2000-01-01). The threshold for the KPI (0.1 m3 / m3) is shown in purple (left). Comparison with the previous C3S version (right).

To compare the results for the current and previous version spatially (v201912 vs. v201812) a comparison of the results for correlation (Figure 16) and the ubRMSD (Figure 17) with GLDAS has been undertaken. 

Correlation improvements are shown in blue. In general, only very small differences between the two versions are found, as expected (no algorithmic changes). A slight degradation in R is found in parts of the Sahara, which might be related to more observations being present in the current version of the COMBINED product (compare to Section 2.5).

For ubRMSD improvements are also shown in blue. There are virtually no differences compared to the previous dataset version in terms of ubRMSD. Further analysis against GLDAS has been undertaken in the study for the ACTIVE and PASSIVE products in Section 5. 

Figure 16: Difference between the correlation for C3S v201912 and GLDAS v2.1 and C3S v201812 and GLDAS v2.1. (v201912 correlation minus v201812 correlation; blue represents an improvement in the correlation).

Figure 17: Difference between the ubRMSD for C3S v201912 and GLDAS v2.1 and C3S v201812 and GLDAS v2.1. (v201912 ubRMSD minus v201812 ubRMSD; blue represents an improvement in the ubRMSD).

2.2.2. ERA5

The C3S SM v201912 product has been compared to the ERA5 product (see footnote 10). ERA5 covers the period 1979-01-01 to within 3 months of present day and provides soil moisture data for different depth layers. Here "swvl1" is used, which represents the first 7 cm of the soil. The correlation (Pearson's) (Figure 18) and ubRMSD (Figure 19) between C3S v201912 and ERA5 are presented. The boxplots show a comparison with the results for the previous version of C3S SM.

10 Note: Previously, the ERA-Interim / Land product has been used in the validation of the C3S SM product. This assessment is no longer presented here and has been replaced with that for ERA5.

The correlation map (Figure 18) shows expected patterns, which are similar to those shown for GLDAS v2.1 (see Figure 14). The exceptions are the larger areas in the North East, North West and parts of the Sahara, where in some locations correlation coefficients close to zero or even negative are found. Apart from that, high correlations are found in most regions, resulting in a global median of around 0.5.

Figure 18: Correlation (Pearson's) of the C3S v201912 COMBINED product with ERA5 (left, covering the time period after 1979-01-01). Comparison with v201812 COMBINED (right).

The ubRMSD (Figure 19) with ERA5 as the reference shows higher values than the comparison with GLDAS in some areas, with the KPI threshold of 0.1 m3 / m3 being breached in the North and in some areas of high topographic complexity. The global median ubRMSD is 0.059 m3 / m3, and the IQR of all observations is still below the threshold of 0.1 m3 / m3. Considering the good performance against GLDAS v2.1 in terms of this metric, a clear conclusion about the C3S SM products in these areas cannot be drawn. GLDAS is used for scaling in the COMBINED product (the one being assessed here); hence some of the characteristics of GLDAS are transferred to the C3S product and might cause a higher correspondence in the validation process.

Figure 19: ubRMSD of the C3S v201912 COMBINED product with ERA5 (left), covering the time period from 1979-01-01 onward and comparison with the previous version (right).

To assess the differences between the current (v201912) and previous version (v201812) of the dataset spatially, the results of the comparison against ERA5 for both products are compared for the correlation (Figure 20) and ubRMSD. 

In Figure 20, the improvements in the correlation between the products are shown in blue, with worse performance shown in red. As for the comparison to GLDAS, only small differences are found for the COMBINED products, which are probably related to slightly more observations being available in some areas in the new version. Virtually no differences are found in terms of ubRMSD compared to the previous version; therefore the ubRMSD difference map is omitted here.

 
Figure 20: Difference between Pearson's R for C3S v201912 and ERA5 and C3S v201812 and ERA5 (blue represents an improvement in the correlation). 

It is known that there are discontinuities in the ERA5 product (discussed at https://confluence.ecmwf.int/pages/viewpage.action?pageId=100045763), which can be clearly seen for the lower depths in the example time series presented in Figure 21. For example, in the top time series, a break in layer 4 can be clearly seen in 2015. There is no discernible break in the time series for the uppermost soil layer (swvl1); however, this assertion is made solely from visual interpretation of the time series. Further assessment should be undertaken to determine if this assertion holds true throughout the data product.

 

Figure 21: Example time series for volumetric soil water content in the ERA5 product with different depth layers shown in different colours. Two example locations are shown: (top) a desert site in Algeria and a desert site in Mauritania (bottom).

2.2.3. ERA5-Land

In addition to ERA5, global validation was also performed using the new ERA5-Land dataset (in the period after 2001-01-01). Results are similar to those obtained from ERA5. Figure 22 shows the corresponding correlation map and box plots.

Figure 22: Correlation (Pearson's) of the C3S v201912 COMBINED product with ERA5-Land (left, covering the time period after 2001-01-01). Comparison with v201812 COMBINED (right).

Figure 23 shows the analysis for ubRMSD. The observations that were made for ERA5 are also found here. Overall, a slightly higher correlation is found with ERA5-Land compared to ERA5, caused by the higher spatial resolution of ERA5-Land and/or the shorter validation period. 

Figure 23: ubRMSD for C3S v201912 COMBINED product with ERA5-Land (left, covering the time period after 2001-01-01). Comparison with v201812 COMBINED (right).
 

2.3. Stability – Trend monitoring

Methods for monitoring the stability of the dataset are still under development. Here, preliminary results are presented to demonstrate the methods developed thus far along with a discussion of how they will be developed for future versions of the product assessment.

2.3.1. Accuracy evolution

To assess the evolution of the C3S SM data set quality over time, a preliminary accuracy evolution analysis against ISMN reference measurements covering up to 19 years is performed. Figure 24 shows the evolution of Pearson's R based on the land cover classifications for the ISMN stations used in the evaluation procedure. It shows that the quality of C3S SM (COMBINED) varies slightly, depending on land cover. Another important factor to consider in this comparison is the number of ISMN stations available in each year, which is given by the small numbers below the boxes. Notably, more ISMN stations become available over time. It can be seen that the product is most stable for "Grassland", while for "Tree Cover" there is visible variation in the product before the introduction of ASCAT (2007). "Cropland" shows a slight decline in R over time and a more stable product in the merging period after 2012 compared to the years before. A drop in R is found for 2019, which might be due to fewer stations being used in the assessment for that year.

Similar observations can be made in terms of ubRMSD. Here the expected wide spread of error for the "Urban areas" class is also visible. This is probably caused by the spatial resolution of C3S SM, where soil moisture networks close to densely populated areas are less representative of the whole C3S SM cell, as satellite SM in these areas can be affected by landcover changes (city growth) and RFI.
The COMBINED product of C3S SM is below the ubRMSD KPI threshold of 0.1 m3/m3 in terms of median and IQR. For the land cover classes "Grassland" and "Tree Cover", a few points are found outside of this threshold.



Figure 24: Accuracy evolution of C3S v201912 COMBINED between 2000 and 2020 in terms of Pearson's R (top) and ubRMSD (bottom); based on land cover classes (Cropland, Grassland, Tree Cover, Urban areas). Numbers at the bottom indicate the number of ISMN stations used in the comparison.

A similar analysis was performed based on four groups of climate classes. Figure 25 shows that the performance is most stable for the "BS" (Arid-Steppe) and "Cf / Df" (Temperate-Without Dry Season/Cold Without Dry Season) classes. Larger variation is found for the "Cs / Ds" (Temperate-Dry Summer/Cold-Dry Summer) and the remaining classes ("Other").

The COMBINED product of C3S SM is below the ubRMSD KPI threshold of 0.1 m3/m3 in terms of median and IQR. For climate classes "C/D" (and their subclasses) single points are found outside of this threshold. 


Figure 25: Accuracy evolution of C3S v201912 COMBINED between 2000 and 2020 in terms of Pearson's R and ubRMSD; based on climate classes (BSh/k, Cfa/b/c & Dfa/b/c, Csa/b/c & Dsa/b/c, Other). Numbers at the bottom indicate the number of ISMN stations used in the comparison.

The presented accuracy evolution results are the baseline for the estimation of stability trends in the data set. Theil-Sen slopes were fitted to the median annual ubRMSD estimates. The resulting slope is assumed to be representative of the overall stability of the product. Note that this approach is currently under development and might change for future validation activities. Figure 26 shows the distribution of trends for the different classes. It can be seen that, at the tested locations, the change in accuracy metrics over time is very small for all land cover and climate classes, which indicates a stable SM product.
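As an illustration of this trend estimate, the following minimal sketch fits a Theil-Sen slope to a series of annual median ubRMSD values using `scipy.stats.theilslopes` and compares it with the stability KPI of 0.05 m3 / m3 / y. The annual values are synthetic and the function name is hypothetical; this is not the implementation used for Figure 26.

```python
import numpy as np
from scipy.stats import theilslopes

STABILITY_KPI = 0.05  # m3 / m3 / y, stability threshold from the KPI table


def ubrmsd_trend(years, annual_median_ubrmsd):
    """Theil-Sen slope (per year) of annual median ubRMSD, with its confidence bounds."""
    slope, _intercept, low, high = theilslopes(annual_median_ubrmsd, years)
    return slope, (low, high)


# Synthetic example: ubRMSD drifting by 0.0005 m3 / m3 per year
years = np.arange(2000, 2020)
annual_ubrmsd = 0.06 + 0.0005 * (years - 2000)

slope, bounds = ubrmsd_trend(years, annual_ubrmsd)
meets_kpi = abs(slope) <= STABILITY_KPI
print(f"slope = {slope:.4f} m3/m3/y, within stability KPI: {meets_kpi}")
```

The Theil-Sen estimator is the median of all pairwise slopes, which makes it less sensitive to single outlying years than an ordinary least-squares fit.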

Figure 26: Distribution of trends in ubRMSD in C3S SM v201912 COMBINED for different landcover (top) and climate (bottom) classes, tested against ISMN stations where at least 3 years of accuracy evolution assessment was possible.

2.4. Stability – Breakpoint detection

A procedure has been developed at TU Wien to test for potential inhomogeneities in the SM CDRs (Preimesberger et al.). Breaks may occur as a result of merging different sensor combinations over time, as shown in Figure 27. Such breakpoints may therefore appear between periods with different input sensors. Structural inhomogeneities may affect statistics such as trends and changes in extreme values (percentiles) and therefore should not only be detected but also corrected. 

Figure 27: Potential break times in the ESA CCI SM product (version 04.5) corresponding to changeovers in the sensors.

Work to optimise the break-point detection has been ongoing; the use of non-parametric statistical tests (Fligner-Killeen test for homogeneity of variances and Wilcoxon rank-sums test for shifts in population mean ranks) against reference datasets has proved successful. The methods have been demonstrated using both in-situ measurements from the ISMN as well as from surface model simulations from MERRA2 SM (Su et al. 2016) as a reference. 
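A minimal sketch of such a test at a single candidate break time is given below, using `scipy.stats.fligner` and `scipy.stats.ranksums`. Applying the tests to the difference between the candidate series and the reference, the significance level of 0.01 and the function name are assumptions made for illustration; the operational procedure may differ in these details.

```python
import numpy as np
from scipy.stats import fligner, ranksums


def detect_break(candidate, reference, break_index, alpha=0.01):
    """Test for a mean and/or variance break at a known candidate break time."""
    # Assumption: tests are applied to candidate-minus-reference differences
    diff = np.asarray(candidate, dtype=float) - np.asarray(reference, dtype=float)
    before = diff[:break_index]
    after = diff[break_index:]
    before = before[np.isfinite(before)]
    after = after[np.isfinite(after)]
    _, p_var = fligner(before, after)    # homogeneity of variances
    _, p_mean = ranksums(before, after)  # shift in mean ranks
    return {"mean_break": bool(p_mean < alpha), "variance_break": bool(p_var < alpha)}


# Synthetic example: a 0.05 m3 / m3 jump introduced halfway through the series
rng = np.random.default_rng(42)
reference = 0.25 + 0.05 * rng.standard_normal(400)
candidate = reference + 0.02 * rng.standard_normal(400)
candidate[200:] += 0.05

result = detect_break(candidate, reference, break_index=200)
print(result)
```

Both tests are non-parametric, so they do not rely on the differences being normally distributed.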

Due to their similarities, algorithms developed for ESA CCI SM are directly applicable for C3S SM products. Break detection and correction will be included in future versions of ESA CCI SM and subsequently also in C3S SM. 

At tested sensor transitions, there are significantly more mean breaks detected than variance breaks. The spatial patterns of the breaks vary between the break times, with Africa showing the most consistent set of locations where a break is detected. 

The highest number of breaks appears to occur in 2007, which coincides with the introduction of the ASCAT-A sensor into the time series. However, the introduction of AMSR-E in 2002 also seems to have a significant impact, especially in Central Asia.

The observed breaks are likely to be caused by one of two elements of the processing algorithm:

  • The scaling of the datasets to a common reference can result in unrealistic jumps in the time series.
  • The error characterisation used to generate the merging weights for the product is fixed per location for each sensor period and does not take into account changes in the sensors over these periods. This is a product of how the triple collocation used to characterise the errors works.

To address the first point, an enhanced scaling method, which considers the upper and lower percentiles in a better way, will be implemented. To address the second point, in future versions of the algorithm, a temporally evolving error characterisation will be investigated.

No adjustment of the breaks is undertaken for the current product. However, in the future, this will be implemented as part of the algorithm. To adjust detected breaks in the data set, three methods have been investigated:

  1. LMP – Linear Regression Model Pair Matching: Differences in the parameters of two linear regression models of the data (one before and one after the break) are used to derive corrections for SM observations before the break.
  2. HOM – Higher Order Moments: A higher order polynomial regression model fitted to observations after a break is used to create homogeneous predictions before the break. Corrections are derived from Locally Weighted Scatterplot Smoothing (LOWESS) fitted differences in quantiles of the observed and predicted ESA CCI SM values before the break. Quantiles are derived using L-moments statistics and KS testing.
  3. QCM – Quantile Category Matching: Spline-fitted differences in the empirical CDFs of the ESA CCI SM and reference SM values (between quantile categories, i.e. average SM within a number of quantile ranges) before and after a break are used to find corrections for quantiles of ESA CCI SM before the break.
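To give an idea of how the simplest of these methods could work, the sketch below implements one possible reading of LMP. It assumes that the two linear models regress the candidate SM on the reference SM, and that values before the break are mapped onto the relationship found after the break. The function name and these modelling choices are assumptions made for illustration, not the adjustment algorithm itself.

```python
import numpy as np


def lmp_adjust(candidate, reference, break_index):
    """Adjust values before a break so they follow the post-break linear model."""
    can = np.asarray(candidate, dtype=float)
    ref = np.asarray(reference, dtype=float)

    # Linear models candidate = intercept + slope * reference, before and after the break
    slope_b, icpt_b = np.polyfit(ref[:break_index], can[:break_index], 1)
    slope_a, icpt_a = np.polyfit(ref[break_index:], can[break_index:], 1)

    adjusted = can.copy()
    # Remove the pre-break model, then apply the post-break model parameters
    adjusted[:break_index] = icpt_a + (slope_a / slope_b) * (can[:break_index] - icpt_b)
    return adjusted


# Synthetic example: different gain and offset relative to the reference before the break
rng = np.random.default_rng(1)
reference = rng.uniform(0.1, 0.4, 300)
candidate = np.where(np.arange(300) < 150, 0.10 + 0.5 * reference, 0.02 + 1.0 * reference)

adjusted = lmp_adjust(candidate, reference, break_index=150)
print(np.abs(adjusted[:150] - (0.02 + reference[:150])).max())
```

In this noise-free example the adjusted values before the break follow the same linear relationship to the reference as the values after it.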


Figure 28: Breaks detected in the ESA CCI SM v4.5 COMBINED product at 2012-07-01 (top). Detected breaks at the same time after correction with QCM (bottom). Bright green shows variance breaks, whilst mean breaks are shown in red and a combination of the two are shown in blue.

Adjustment is performed iteratively, with the goal that across each detected break, changes in means and variances are matched to follow changes within the reference data set (relative bias correction) and homogenised observation series (with respect to the reference data set) are derived. Correcting breaks should increase the lengths of homogeneous periods in the product as shown in Figure 29. 

Figure 29: Increase in length of the longest homogeneous period in ESA CCI SM v4.5 (COMBINED) after correction with QCM.

2.5. Spatial and temporal completeness

It is important to consider the spatial and temporal completeness and consistency of the product as these can be a key deciding factor for the users in terms of whether or not the product is suitable for their application.

The spatial and temporal coverage of the product is presented in Figure 30 and Figure 31 in terms of the number of valid (unflagged) observations available. The figures show that the coverage is better in Europe, South Africa and the contiguous US than in some other parts of the world. They also show the improvement in the availability of data post-2007 as new sensors became available (see Figure 27 above for further details of sensor periods). This is as expected for the product due to the orbital paths of the satellites resulting in higher coverage in equatorial regions. The reduced coverage in boreal and tropical regions is as expected due to the high Vegetation Optical Depth (VOD) in these areas. In addition, the coverage of the northernmost latitudes by snow and ice for long periods also reduces the data coverage in these areas.

 

Figure 30: Fractional coverage of the C3S SM v201912 COMBINED product for the ASCAT / SMOS / AMSR2 period (2012-07-01 to 2019-12-31). Expressed as the total number of daily observations per time period divided by the number of days spanning that time period.

Figure 31: Fraction of days per month with valid (i.e. unflagged) observations of SM for each latitude and time period for the v201912 COMBINED product.
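The fractional coverage described in the caption of Figure 30 can be sketched as follows. The sketch assumes a stack of daily images with dimensions (time, lat, lon) in which flagged or missing observations are set to NaN; the array layout and function name are assumptions for illustration.

```python
import numpy as np


def fractional_coverage(sm):
    """Number of valid daily observations per pixel divided by the number of days."""
    sm = np.asarray(sm, dtype=float)
    return np.isfinite(sm).sum(axis=0) / sm.shape[0]


# Synthetic example: 4 days, 1 latitude row, 2 pixels
sm = np.array([
    [[0.20, 0.31]],
    [[0.22, np.nan]],
    [[0.19, np.nan]],
    [[0.25, np.nan]],
])

coverage = fractional_coverage(sm)
print(coverage)
```

Here the first pixel has valid data on all four days and the second on one day only.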

2.6. Time series analysis

Analysis of time series from a small number of locations provides an insight into the behaviour of the product for different climate and land cover types. Five points have been chosen for which ISMN in-situ data is available (and were used in the above assessment). Details of the points are provided in Table 5 and are shown on a global map in Figure 32.

Table 5: Details of locations chosen for time series analysis.

| # | Climate class | Land cover class | Country | C3S GPI | C3S Lat | C3S Lon | ISMN station Lat | ISMN station Lon |
|---|---|---|---|---|---|---|---|---|
| 1 | Dsc | Sparse vegetation | USA | 890047 | 64.625 | -148.125 | 64.7232 | -148.151 |
| 2 | Cfa | Cropland | Australia | 316669 | -35.125 | 147.375 | -35.1249 | 147.4974 |
| 3 | BSk | Cropland | Spain | 756697 | 41.375 | -5.625 | 41.2747 | -5.5919 |
| 4 | Cfb | Grassland | Germany | 810025 | 50.625 | 6.375 | 50.5149 | 6.3756 |
| 5 | Cfa | Broadleaf forest | USA | 733335 | 37.375 | -86.125 | 37.2504 | -86.2325 |

Note: all are classified as having 'medium' soil texture.

 Figure 32: Locations of the points used in the time series comparison (points are given at the C3S GPI location).

The time series (temporally aggregated per month) for the individual locations for the ACTIVE, PASSIVE and COMBINED products are given in Figure 33. In general, the time series appear to follow expected seasonal cycles at each location, i.e. winters are wetter and summers drier and, in the case of GPI 890047 (which is located in Alaska), there are gaps in the data where the location is covered by snow each winter.

Of note for these time series is the sudden drop in the values of the PASSIVE product around 2011, which can be seen in all but the "Cropland" land cover type. Whilst this is not so pronounced for the "Grassland" point, it is clear for both the sparse vegetation and broadleaf forest points. This drop coincides with the drop-out of AMSR-E and is due to there being no overlap with the AMSR2 sensor (utilised in the period after 2012-07) for scaling, which causes a break in the PASSIVE data set. It is noted that the COMBINED product does not suffer such a marked change despite including the same sensors; this difference is due to the scaling of the data to GLDAS v2.1 and to the inclusion of the active sensors in this time period in the COMBINED data. Further information on this aspect is provided in Annex A. The issue could be resolved by adding a bridging data set to C3S SM in future versions (currently FengYun SM (Yang et al. 2011) is considered). Until then, the issue can be reduced by matching the non-overlapping periods of AMSR-E and AMSR2, although this approach probably dampens trends in the affected period and is therefore only considered an intermediate solution.

Figure 33: Time series comparison for the COMBINED, ACTIVE and PASSIVE products of C3S v201912 for the GPIs and land cover types stated for each plot (compare with Figure 32). Note: here, the ACTIVE product is divided by 100 to allow it to be plotted on the same axis as the other products.

2.7. Uncertainty analysis

The algorithm used to develop the C3S soil moisture product utilises triple collocation analysis to generate weightings for the combination of different soil moisture observations (Gruber et al. 2017). The SNR calculated as part of the triple collocation process is used to weight the sensors within that merging period. In combination with error propagation techniques, a per-pixel uncertainty is provided within the C3S soil moisture product in the "sm_uncertainty" field.
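The principle can be sketched with the covariance formulation of triple collocation, followed by inverse error variance weighting. The sketch assumes three collocated data sets with mutually independent errors that have already been scaled to a common data space; the function names, the synthetic data and the exact form of the weights are assumptions for illustration and do not reproduce the merging algorithm described in the ATBD.

```python
import numpy as np


def tc_error_variances(x, y, z):
    """Random error variances of three collocated data sets (covariance notation)."""
    c = np.cov(np.vstack([x, y, z]))
    err_x = c[0, 0] - c[0, 1] * c[0, 2] / c[1, 2]
    err_y = c[1, 1] - c[0, 1] * c[1, 2] / c[0, 2]
    err_z = c[2, 2] - c[0, 2] * c[1, 2] / c[0, 1]
    return np.array([err_x, err_y, err_z])


def merging_weights(error_variances):
    """Weights proportional to the inverse error variance, normalised to sum to 1."""
    inverse = 1.0 / np.asarray(error_variances, dtype=float)
    return inverse / inverse.sum()


# Synthetic example: one true signal observed with three different noise levels
rng = np.random.default_rng(7)
truth = 0.25 + 0.05 * rng.standard_normal(50_000)
x = truth + 0.01 * rng.standard_normal(truth.size)
y = truth + 0.02 * rng.standard_normal(truth.size)
z = truth + 0.04 * rng.standard_normal(truth.size)

weights = merging_weights(tc_error_variances(x, y, z))
print(weights)
```

The data set with the lowest estimated error variance (highest SNR) receives the largest weight.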

The weights used within the ASCAT / SMOS / AMSR2 period are shown in Figure 34. These weightings show that ASCAT performs best in highly vegetated areas such as the boreal forests in the north, whereas AMSR2 performs best in dry regions.

Figure 34: Weightings used for merging ASCAT (red) / SMOS (blue) / AMSR2 (green) within the C3S v201912 product. These maps are spatially gap filled by regression analysis between the signal-to-noise ratio used in the generation of the weights and vegetation optical depth (VOD).

The evolution of the 'sm_uncertainty' field per latitude over the duration of the C3S v201912 product is shown in Figure 35. It is obvious that uncertainty values before the year 2003 are missing in the current version (for the COMBINED and ACTIVE product). This issue should be resolved before the data is published. It is expected that the uncertainty associated with the product reduces over time. This indicates that in sensor periods closer to the present, the original products are much closer together in absolute values than in the sensor periods nearer the start of the C3S product. Throughout the product time period, the uncertainties are always higher at latitudes where there is higher vegetation cover, for example at 10 degrees south. This is as expected as soil moisture is harder to retrieve in these areas and there is higher variance in the product where this is the case. 

Figure 35: Monthly averages of the uncertainty variable associated with the C3S SM v201912 COMBINED product per latitude over time. Values before 2003 are missing and should be added to the final COMBINED and ACTIVE (not shown) product!

3. Application(s) specific assessments 

Currently, no application(s) specific assessments have been undertaken for v201912 of the C3S soil moisture dataset. However, to demonstrate the usefulness of the product, we present here a couple of examples of the previous product (v201812) capturing particular events within the past year.

3.1. European State of the Climate 2019

The C3S data has been used in the 2019 European State of the Climate report produced by ECMWF (see footnote 12). In the report, the C3S SM v201812 PASSIVE SM anomaly data is compared against ERA5 SM (Figure 36). The anomalies shown match well with ERA5.

Figure 36: Annual soil moisture anomaly for 2019. From the European State of the Climate report.

12 Relevant sections of the report can be found here: https://climate.copernicus.eu/ESOTC/2019/european-wet-and-dry-conditions

3.2. Australia Drought

Australia experienced both its driest and warmest year since records began: it received only 60% of its usual rainfall while the average annual temperature was 1.5 °C above normal (footnote 13). These conditions resulted in strong negative soil moisture anomalies throughout the continent and primed the land for catastrophic wildfires, which began in June 2019. C3S v201812 COMBINED (dekadal) SM anomalies for December 2019 are shown in Figure 37 and indicate the peak of the Australian 2019 drought.


Figure 37: Soil Moisture anomalies for December 2019 in the C3S SM v201812 COMBINED product (available via https://dataviewer.geo.tuwien.ac.at/).
 

4. Compliance with user requirements

The requirements for the C3S soil moisture product are a set of KPIs defined from consideration of user and GCOS requirements. The KPIs are shown in Table 6.

Table 6: Key Performance Indicators (KPIs) for the C3S Soil Moisture Product

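| KPI # | KPI Title | Performance Target and Unit of Measure |
|---|---|---|
| Accuracy KPIs | | |
| KPI.D1.1 | CDR Radiometer with a daily resolution in latest quarter | Variable (0.01-0.10 m³ / m³), depending on land cover and climate |
| KPI.D2.1 | CDR Scatterometer with a daily resolution in latest quarter | Variable (0.01-0.10 m³ / m³), depending on land cover and climate |
| KPI.D3.1 | CDR Combined with a daily resolution in latest quarter | Variable (0.01-0.10 m³ / m³), depending on land cover and climate |
| KPI.D4.1 | ICDR Radiometer with a daily resolution in latest quarter | Variable (0.01-0.10 m³ / m³), depending on land cover and climate |
| KPI.D5.1 | ICDR Scatterometer with a daily resolution in latest quarter | Variable (0.01-0.10 m³ / m³), depending on land cover and climate |
| KPI.D6.1 | ICDR Combined with a daily resolution in latest quarter | Variable (0.01-0.10 m³ / m³), depending on land cover and climate |
| Stability KPIs | | |
| KPI.D1.2 | CDR Radiometer with a daily resolution in latest quarter | 0.05 m³ / m³ / y |
| KPI.D2.2 | CDR Scatterometer with a daily resolution in latest quarter | 0.05 m³ / m³ / y |
| KPI.D3.2 | CDR Combined with a daily resolution in latest quarter | 0.05 m³ / m³ / y |
| KPI.D4.2 | ICDR Radiometer with a daily resolution in latest quarter | 0.05 m³ / m³ / y |
| KPI.D5.2 | ICDR Scatterometer with a daily resolution in latest quarter | 0.05 m³ / m³ / y |
| KPI.D6.2 | ICDR Combined with a daily resolution in latest quarter | 0.05 m³ / m³ / y |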

An independent accuracy assessment has been conducted comparing the product against in-situ observations with different conditions taken into account, i.e. soil depth, land cover etc. In general, there is variability in the correlation between the datasets, with correlations ranging from 0.4 to above 0.8 depending on the conditions and the locations of the in-situ stations used. The ubRMSD, which can be directly taken as a measure of accuracy, is demonstrated to be below 0.10 m3 / m3 for all of the different conditions analysed, including land cover type (currently cropland, grassland and tree cover are considered). Therefore, the minimum threshold KPI for the C3S product for accuracy has been met globally. However, the GCOS target requirement of 0.04 m3/m3 is only met under certain land cover (grassland) and climate (semi-arid) conditions. Note, however, that the use of in-situ data for satellite validation is complicated by the presence of representativeness errors, which inflate the actual errors. The effect of representativeness will be addressed in future quality assurance activities, as will a finer categorisation of the product skill according to land cover type or vegetation cover.

A comparison against the LSMs GLDAS v2.1, ERA5 and ERA5-Land has also been undertaken to provide a wider global view of the product. The comparison against GLDAS v2.1 has shown expected results, with the KPI threshold of 0.1 m3 / m3 met in almost all regions (the exception being a few areas with high topographic complexity). However, for the ERA5 and ERA5-Land comparison there are several areas (especially in the North) where the KPI threshold is not met. Further investigation is needed to ascertain the source of these discrepancies; it is possible that these are a result of insufficient masking in the SM product itself, or of low data coverage in combination with the high uncertainties that are generally present in satellite SM retrievals in subarctic areas.

The stability of the C3S product has been assessed in terms of the change in yearly accuracy relative to ISMN stations for all years after 2000. The accuracy (ubRMSD) has been calculated per year, and the trends in the accuracy have also been analysed. The KPI threshold for stability of 0.05 m³ / m³ / y is met for all tested stations when assessed using this method. Further methods to assess SM stability are currently under investigation and will be presented in future evaluation studies.
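As a rough illustration of the stability assessment described above, the sketch below computes the ubRMSD per calendar year and then the trend of those yearly values. The text does not state how the trend is estimated, so an ordinary least-squares slope is assumed here; the function names and sample records are hypothetical.

```python
import math
from collections import defaultdict

def ubrmsd(x, y):
    """Unbiased RMSD between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    return math.sqrt(sum(((a - mx) - (b - my)) ** 2 for a, b in zip(x, y)) / n)

def yearly_ubrmsd(records):
    """records: iterable of (year, satellite_value, in_situ_value) tuples."""
    by_year = defaultdict(lambda: ([], []))
    for year, sat, ref in records:
        by_year[year][0].append(sat)
        by_year[year][1].append(ref)
    return {yr: ubrmsd(s, r) for yr, (s, r) in sorted(by_year.items())}

def ols_slope(xs, ys):
    """Least-squares slope of ys against xs (assumed trend estimator)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return num / sum((x - mx) ** 2 for x in xs)

# Hypothetical records for one station (values in m3/m3)
records = [
    (2001, 0.2, 0.19), (2001, 0.3, 0.31),
    (2002, 0.2, 0.18), (2002, 0.3, 0.32),
    (2003, 0.2, 0.17), (2003, 0.3, 0.33),
]

yearly = yearly_ubrmsd(records)
trend = ols_slope(list(yearly), list(yearly.values()))  # m3/m3 per year
meets_stability_kpi = abs(trend) <= 0.05  # threshold quoted in the text
```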

5. Annex A: Outcomes of the ACTIVE and PASSIVE quality control

5.1. Introduction

The ACTIVE and PASSIVE products of C3S SM v201912 were also compared to the reference data sets described in Section 2. As expected, the COMBINED product outperforms the PASSIVE and ACTIVE products in most locations, as shown in Figure 38 for the comparison to GLDAS v2.1.

Figure 38: Intercomparison of the COMBINED, ACTIVE, PASSIVE product of C3S v201912, with GLDAS v2.1 as the (scaling) reference – plots shown are for Pearson's R (left) and ubRMSD (right). Only common locations and timestamps in all intercompared products are considered.

Two outcomes of the validation that were already found for the previous version are also briefly discussed, as they are most likely affected by changes planned for the next versions of C3S SM.

  1. The aforementioned negative break in the PASSIVE product, which is due to there being no overlapping period between AMSR-E and AMSR2.
  2. The H SAF ASCAT SSM CDR (and therefore the ACTIVE product of C3S after 2007) is subject to wetting trends over (densely populated) RFI-affected areas (Europe in particular). These trends are not present in the PASSIVE product and are weaker in the COMBINED product.

Issue 1 will most likely be addressed in the next release as part of the algorithm development in the CCI+ project. Issue 2 is subject to further development of H SAF ASCAT SSM.

5.2. Strong negative trends in the PASSIVE product

Quality control on ESA CCI SM v04 (equivalent to C3S v201912) revealed that in some locations, the PASSIVE product shows strong negative trends which are not apparent in the COMBINED or ACTIVE products (see Figure 39). The time series show that after approximately 2011, there is a strong dip in values in the PASSIVE product.  

Figure 39: Hovmoeller diagram of ESA CCI SM v4.7 PASSIVE SM anomalies (same algorithm used in C3S v201912), which shows the drop in SM after 2011 (climatology period 1991-2010).
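The anomalies shown here are relative to a 1991-2010 climatology. The exact procedure is not described in this section, so the sketch below only illustrates one common approach under stated assumptions: a mean is computed per calendar month over the baseline period and subtracted from every value of that month. The function name and the sample series are hypothetical.

```python
from collections import defaultdict

def monthly_anomalies(series, clim_start=1991, clim_end=2010):
    """series: dict mapping (year, month) -> soil moisture value.

    Returns the value minus the mean of the same calendar month over the
    climatology period (assumed definition of the anomaly).
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for (year, month), value in series.items():
        if clim_start <= year <= clim_end:
            sums[month] += value
            counts[month] += 1
    climatology = {m: sums[m] / counts[m] for m in sums}
    # Months without any baseline data are skipped rather than guessed
    return {key: value - climatology[key[1]]
            for key, value in series.items() if key[1] in climatology}

# Hypothetical July values (m3/m3) illustrating a drop after 2011
series = {(1991, 7): 0.20, (2010, 7): 0.30, (2012, 7): 0.15}
anomalies = monthly_anomalies(series)
```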

To identify the extent to which this seemingly unrealistic negative trend occurs within the product, the trends in the COMBINED and PASSIVE products have been compared (see Figure 40) for the period 2007-01-01 to 2019-12-31. The PASSIVE dataset (bottom panel of Figure 40) shows several areas where the negative trend is particularly strong (Brazil, Europe and China). This break in the PASSIVE product is corrected in the upcoming version of ESA CCI SM (v5) and will therefore be reduced in the next version of the C3S SM CDR.

Figure 40: Comparison of the Theil-Sen trends (median slope) in the C3S v201912 COMBINED (top left), ACTIVE (top right) and PASSIVE (bottom) products for the period 2007-01-01 to 2019-12-31. Note: different scales are used on these maps so that the colours shown in each case are similar to one another; this is a result of the different absolute values provided in each data product.
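The Theil-Sen trend named in the caption is the median of the slopes between all pairs of observations, which makes it robust against individual outliers. A minimal sketch is given below; the function name and the sample values are hypothetical and do not come from the product.

```python
from itertools import combinations
from statistics import median

def theil_sen_slope(t, y):
    """Median of the slopes between all pairs of points with distinct times."""
    slopes = [(y[j] - y[i]) / (t[j] - t[i])
              for i, j in combinations(range(len(t)), 2)
              if t[j] != t[i]]
    return median(slopes)

# Hypothetical yearly means (m3/m3) with a single outlier in 2011
years = [2007, 2008, 2009, 2010, 2011, 2012]
sm = [0.30, 0.29, 0.28, 0.27, 0.10, 0.25]

slope = theil_sen_slope(years, sm)  # m3/m3 per year
```

In this example ten of the fifteen pairwise slopes are unaffected by the outlier, so the median slope stays at the underlying trend of about -0.01 m3 / m3 per year.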

5.3. Wetting trends in the ASCAT data

The wetting trend in ASCAT described in Section 5.1 also affects anomalies derived from the C3S SM products. Figure 41 shows anomalies over Europe for the year 2019 in the ESA CCI SM v4 data set (same algorithm as C3S v201912, left column). The right column shows the same anomalies when an experimental version of ASCAT SSM, with trend correction in the backscatter dry and wet reference, is used in ESA CCI SM. Artificial trends in the radar-observed backscatter signal are probably due to land cover changes and/or RFI (Ticconi et al. 2017). Cities in particular stand out in the SM anomaly maps of the uncorrected ACTIVE product. The issue was first detected over Europe, but also affects other regions worldwide.

It is also noted that the issue seems most prominent in spring months and therefore may be (partly) related to the vegetation correction used in the ASCAT product.


[Figure 41 panel layout: columns are "Without ASCAT backscatter trend correction" (left) and "With ASCAT backscatter trend correction" (right); rows are COMBINED, ACTIVE and PASSIVE.]

Figure 41: Change in ESA CCI SM v4 annual anomalies (for the year 2019) due to backscatter trend correction in ASCAT SSM. The climatological period for all plots is 1991-2010. Note the different scales for each of the three products.

If both identified issues, the wetting trend in the ACTIVE product and the (negative) break in the PASSIVE product, are resolved, it is expected that better agreement between the three individual products will be achieved in terms of SM anomalies and long-term trends. The trend-corrected ASCAT SSM product will be used in ESA CCI SM and C3S SM if it is officially released and provided by H SAF.

6. Annex B: Detailed comparison of C3S v201912 against C3S v201812

6.1. Introduction

This Annex provides a detailed comparison of the newest dataset version (v201912) against the previous version (v201812). The aim is both to determine that the dataset has been made to specification and to highlight differences between the products that may be of interest to users when using the products in their own applications.

6.2. Comparison of data coverage

The number of valid observations available in each product version has been compared for the ACTIVE, PASSIVE and COMBINED products. Both the PASSIVE and the ACTIVE products show very little difference in terms of coverage; the most notable difference is the increase in data in the COMBINED dataset in earlier periods, as shown in Figure 42. This unexpected increase in the number of observations is also shown in Figure 43.

 

Figure 42: Difference between the data coverage (fraction of valid observations) for C3S v201812 and C3S v201912 for the COMBINED product for the entire time series. Positive values indicate more data is available within the v201912 product. 
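The coverage difference mapped in Figure 42 can be sketched for a single grid point as follows. This is an illustration only: it assumes that missing observations are stored as NaN (or None) and that both versions are compared over the same time steps; the function name and sample values are hypothetical.

```python
import math

def coverage(values):
    """Fraction of time steps holding a valid (non-NaN, non-None) observation."""
    valid = sum(1 for v in values if v is not None and not math.isnan(v))
    return valid / len(values)

nan = float("nan")

# Hypothetical daily values for one grid point in both versions
v201812 = [0.21, nan, nan, 0.25, nan, 0.22, nan, 0.20]
v201912 = [0.21, 0.23, nan, 0.25, 0.24, 0.22, nan, 0.20]

# Positive values indicate that more data is available in v201912
diff = coverage(v201912) - coverage(v201812)
```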


Figure 43: Increase in number of observations in C3S SM v201912 compared to v201812 in the period after 2000-01-01 for the COMBINED (top), ACTIVE and PASSIVE (bottom) products. Transition from green to red at 365 observations (expected temporal extension).

The potential issue is also demonstrated in Figure 44 for three selected time series in different regions. It should be noted that this might actually be an issue in the previous version rather than the current one, as the data gaps do not seem to be related to data flagging.




Figure 44: Comparison of time series at locations where an unexpected increase in data coverage was found between v201812 and v201912 (COMBINED). Time series are aggregated to 10-daily averages. GPI locations are shown on the map.

6.3. Comparison of time series

The locations compared here for the different product versions are the same as those shown in Figure 32 in Section 2.6. Overall, the products appear to be similar at these locations. For one point (GPI 810025), more data are found in the new version in the overlapping time period (as already shown in Figure 44); the temporal extension of the product is also clearly visible.

 

Figure 45: Time series for the different land cover classes considered (GPI locations shown in Figure 32). The data shown are for the COMBINED product from v201812 and v201912, aggregated to 10-daily time steps.

6.4. Comparison of daily images

Daily images for 2018-07-01 of the ACTIVE, PASSIVE and COMBINED products have been compared between C3S v201812 and v201912 by computing the difference between the two versions. Only the COMBINED product is shown in Figure 46 (the ACTIVE product shows no difference at all, and the PASSIVE product shows even smaller differences than the COMBINED). Figure 46 shows that there are some small differences between the versions, which might be due to changes in the input data streams. However, the reason for this should be investigated together with the other potential issues highlighted in this report.


Figure 46: Difference between C3S SM v201912 and v201812 for the daily COMBINED product on 2018-07-01

6.5. Comparison of global statistics

To demonstrate the differences between the previous C3S version (v201812) and the current dataset (v201912), global statistics have been computed for each dataset version and are provided in Table 7. These are for the latest merging period of each product; however, only the observations up to 2018-12-31 have been used (to ensure a reliable comparison of the statistics).

The statistics for the latest merging period show very little difference between the data product versions and indicate that there are only small differences in the COMBINED product, even smaller differences in PASSIVE and no changes in ACTIVE. 

Table 7: Dataset statistics for the CDRs of the different C3S versions, for the latest merging period of each product and the common time period. The numbers given are the mean values across all GPIs in the dataset, i.e. the mean of the time series for one GPI is calculated and then the mean is taken over all GPIs.

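| Metric | COMBINED v201812 | COMBINED v201912 | ACTIVE v201812 | ACTIVE v201912 | PASSIVE v201812 | PASSIVE v201912 |
|---|---|---|---|---|---|---|
| Mean | 0.21 | 0.21 | 45.17 | 45.17 | 0.32 | 0.32 |
| Median | 0.21 | 0.20 | 44.02 | 44.02 | 0.31 | 0.31 |
| Std. dev. | 0.05 | 0.05 | 16.13 | 16.13 | 0.09 | 0.09 |
| Max | 0.35 | 0.36 | 89.51 | 89.51 | 0.62 | 0.62 |
| Min | 0.10 | 0.10 | 6.07 | 6.07 | 0.09 | 0.09 |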
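The two-step aggregation described in the caption of Table 7 can be sketched as follows. This reflects one reading of the caption, namely that each metric is first computed per GPI time series and then averaged over all GPIs; the use of the population standard deviation, the function name and the GPI numbers and values are assumptions made for illustration.

```python
from statistics import mean, median, pstdev

# Metric applied to each GPI time series before averaging over GPIs
METRICS = {
    "Mean": mean,
    "Median": median,
    "Std. dev.": pstdev,  # population std. dev. is an assumption
    "Max": max,
    "Min": min,
}

def global_statistics(series_by_gpi):
    """Compute each metric per GPI, then take the mean across all GPIs."""
    return {name: mean([func(ts) for ts in series_by_gpi.values()])
            for name, func in METRICS.items()}

# Hypothetical GPIs with time series of different lengths (m3/m3)
series_by_gpi = {
    100001: [0.1, 0.2, 0.3],
    100002: [0.3, 0.5],
}

stats = global_statistics(series_by_gpi)
```

With series of unequal length, the mean of the per-GPI means (0.30 here) differs from the mean of all pooled observations (0.28), so each GPI contributes equally regardless of how many observations it has.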

References

Joint Committee for Guides in Metrology (JCGM) (2008). International vocabulary of metrology - Basic and general concepts and associated terms (VIM). VIM3: International Vocabulary of Metrology, 3, 104

Albergel, C., Dorigo, W., Reichle, R.H., Balsamo, G., de Rosnay, P., Muñoz-Sabater, J., Isaksen, L., de Jeu, R., & Wagner, W. (2013). Skill and Global Trend Analysis of Soil Moisture from Reanalyses and Microwave Remote Sensing. Journal of Hydrometeorology, 14, 1259-1277

BSI (2015). BSI Standards Publication BS EN ISO 9000:2015 Quality management systems Fundamentals and vocabulary. 

de Jeu, R., van der Schalie, R., Paulik, C., Dorigo, W., Pasik, A., Scanlon, T., Kidd, R., & Reimer, C. (2020). C3S Algorithm Theoretical Basis Document (ATBD): Soil Moisture (v201912).

Dorigo, W., Scanlon, T., Buttinger, P., Pasik, A., Paulik, C., & Kidd, R. (2020a). C3S Product User Guide (PUG) and Specification.

Dorigo, W., Scanlon, T., Preimesberger, W., Buttinger, P., & Kidd, R. (2019). C3S Product Quality Assurance Document (PQAD): Soil Moisture (v201812).

Dorigo, W., Scanlon, T., Preimesberger, W., Pasik, A., Buttinger, P., & Kidd, R. (2020b). C3S Product Quality Assurance Document (PQAD): Soil Moisture (v201912).

Dorigo, W., Van Oevelen, P., Wagner, W., Drusch, M., Mecklenburg, S., Robock, A., & Jackson, T. (2011). A new international network for in situ soil moisture data. Eos, 92, 141-142

Dorigo, W., Wagner, W., Albergel, C., Albrecht, F., Balsamo, G., Brocca, L., Chung, D., Ertl, M., Forkel, M., Gruber, A., Haas, E., Hamer, P.D., Hirschi, M., Ikonen, J., de Jeu, R., Kidd, R., Lahoz, W., Liu, Y.Y., Miralles, D., Mistelbauer, T., Nicolai-Shaw, N., Parinussa, R., Pratola, C., Reimer, C., van der Schalie, R., Seneviratne, S.I., Smolander, T., & Lecomte, P. (2017a). ESA CCI Soil Moisture for improved Earth system understanding: State-of-the art and future directions. Remote Sensing of Environment

Dorigo, W.A., & Gruber, A. (2015). Evaluation of the ESA CCI soil moisture product using ground-based observations. Remote Sensing of Environment, 162, 380-395

Dorigo, W.A., Scanlon, T., & Chung, D. (2017b). C3S Product Quality Assurance Document (PQAD): Soil Moisture.

Dorigo, W.A., Xaver, A., Vreugdenhil, M., Gruber, A., Hegyiová, A., Sanchis-Dufau, A.D., Zamojski, D., Cordes, C., Wagner, W., & Drusch, M. (2013). Global Automated Quality Control of In Situ Soil Moisture Data from the International Soil Moisture Network. Vadose Zone Journal, 12, 0

Entekhabi, D., Reichle, R.H., Koster, R.D., & Crow, W.T. (2010). Performance Metrics for Soil Moisture Retrievals and Application Requirements. Journal of Hydrometeorology, 11, 832-840

Fang, L., Hain, C.R., Zhan, X., & Anderson, M.C. (2016). An inter-comparison of soil moisture data products from satellite remote sensing and a land surface model. International Journal of Applied Earth Observation and Geoinformation, 48, 37-50

Gelaro, R., McCarty, W., Suárez, M.J., Todling, R., Molod, A., Takacs, L., Randles, C.A., Darmenov, A., Bosilovich, M.G., Reichle, R., Wargan, K., Coy, L., Cullather, R., Draper, C., Akella, S., Buchard, V., Conaty, A., da Silva, A.M., Gu, W., Kim, G.-K., Koster, R., Lucchesi, R., Merkova, D., Nielsen, J.E., Partyka, G., Pawson, S., Putman, W., Rienecker, M., Schubert, S.D., Sienkiewicz, M., & Zhao, B. (2017). The Modern-Era Retrospective Analysis for Research and Applications, Version 2 (MERRA-2). Journal of Climate, 30, 5419-5454

Gruber, A., Dorigo, W.A., Crow, W., & Wagner, W. (2017). Triple Collocation-Based Merging of Satellite Soil Moisture Retrievals. IEEE Transactions on Geoscience and Remote Sensing, 1-13

Gruber, A., Dorigo, W.A., Zwieback, S., Xaver, A., & Wagner, W. (2013). Characterizing Coarse-Scale Representativeness of in situ Soil Moisture Measurements from the International Soil Moisture Network. Vadose Zone Journal, 12

Liu, Y.Y., Dorigo, W.A., Parinussa, R.M., De Jeu, R.A.M., Wagner, W., McCabe, M.F., Evans, J.P., & Van Dijk, A.I.J.M. (2012). Trend-preserving blending of passive and active microwave soil moisture retrievals. Remote Sensing of Environment, 123, 280-297

Plummer, S., Lecomte, P., & Doherty, M. (2017). The ESA Climate Change Initiative (CCI): A European contribution to the generation of the Global Climate Observing System. Remote Sensing of Environment

Preimesberger, W., Scanlon, T., Su, C.H., Gruber, A., & Dorigo, W. (In prep.). Homogenization of structural breaks in a global multi-satellite soil moisture climate data record

Rienecker, M.M., Suarez, M.J., Gelaro, R., Todling, R., Bacmeister, J., Liu, E., Bosilovich, M.G., Schubert, S.D., Takacs, L., Kim, G.-K., Bloom, S., Chen, J., Collins, D., Conaty, A., da Silva, A., Gu, W., Joiner, J., Koster, R.D., Lucchesi, R., Molod, A., Owens, T., Pawson, S., Pegion, P., Redder, C.R., Reichle, R., Robertson, F.R., Ruddick, A.G., Sienkiewicz, M., & Woollen, J. (2011). MERRA: NASA's Modern-Era Retrospective Analysis for Research and Applications. Journal of Climate, 24, 3624-3648

Su, C.H., Ryu, D., Dorigo, W., Zwieback, S., Gruber, A., Albergel, C., Reichle, R.H., & Wagner, W. (2016). Homogeneity of a global multisatellite soil moisture climate data record. Geophysical Research Letters, 43, 11,245-11,252

Ticconi, F., Anderson, C., Figa-Saldana, J., Wilson, J.J.W., & Bauch, H. (2017). Analysis of Radio Frequency Interference in Metop ASCAT Backscatter Measurements. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 10, 2360-2371

Wagner, W., Dorigo, W., de Jeu, R., Fernandez, D., Benveniste, J., Haas, E., & Ertl, M. (2012). Fusion of Active and Passive Microwave Observations To Create an Essential Climate Variable Data Record on Soil Moisture. ISPRS Annals of Photogrammetry, Remote Sensing and Spatial Information Sciences, I-7, 315-321

WMO (2016). The Global Observing System for Climate (GCOS): Implementation Needs. (p. 325)

Yang, H., Weng, F., Lv, L., Lu, N., Liu, G., Bai, M., Qian, Q., He, J., & Xu, H. (2011). The FengYun-3 Microwave Radiation Imager On-Orbit Verification. IEEE Transactions on Geoscience and Remote Sensing, 49, 4552-4560


This document has been produced in the context of the Copernicus Climate Change Service (C3S).

The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Contribution agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.

The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which is merely representing the author's view.
