Contributors: J. Drücke (Deutscher Wetterdienst (DWD))

Issued by: J. Drücke (DWD)

Date: 18/09/2023

Ref: C3S2_D312a_Lot1.1.5.3_202309_PQAD_ECV_WaterVapour_COMBI_v1.2

Official reference number service contract: 2021/C3S2_312a_Lot1_DWD/SC1


History of modifications

| Version | Date | Description of modification | Chapters/Sections |
|---|---|---|---|
| V1.0 | 29/12/2022 | First version | All |
| V1.1 | 02/05/2023 | All reviewer comments addressed | All |
| V1.2 | 18/09/2023 | Additional reviewer comments addressed | Chapter 1 |

List of datasets covered by this document

| Deliverable ID | Product title | Product type (CDR, ICDR) | Version number | Delivery date |
|---|---|---|---|---|
| D2.5.9 | Water Vapour TCWV WV_cci/CMSAF TCDR v1.0 | CDR | V1.0 | 29.12.2022 |

Related documents

Reference ID

Document

D1

Validation Report,  Microwave and near-infrared imager TCDR - Combined high resolution global TCWV from microwave and near infrared imagers (COMBI)

Ref: SAF/CM/DWD/VAL/COMBI; Issue 1.0

https://www.cmsaf.eu/SharedDocs/Literatur/document/2022/saf_cm_dwd_val_combi_tcdr_v1_0_pdf.pdf

Last accessed on 25.10.2023

D2

Product Validation Plan (PVP) v3.2 – Water Vapour CCI (to be published)

D3

Data Access Requirement Document (DARD), ESA Water_Vapour_cci, version v3.2, 2021

Ref: CCIWV.REP.003

https://climate.esa.int/media/documents/Water_Vapour_cci_D1.3_DARD_v3.2.pdf

Last accessed on 25.10.2023

D4

Algorithm Theoretical Basis Document - Microwave and near-infrared imager TCDR - Combined high resolution global TCWV from microwave and near infrared imagers (COMBI)

Ref: SAF/CM/DWD/ATBD/COMBI/; Issue 1.0

https://www.cmsaf.eu/SharedDocs/Literatur/document/2022/saf_cm_dwd_atbd_combi_tcdr_v1_0_pdf.pdf

Last accessed on 25.10.2023

D5

User Requirements Document (URD), ESA Water Vapour Climate Change Initiative (WV_cci), version 3.0, 2021

Ref: CCIWV.REP.001

https://climate.esa.int/media/documents/Water_Vapour_cci_D1.1_URD_v3.0.pdf

Last accessed on 25.10.2023

Acronyms

Acronym

Definition

AIRS

Atmospheric Infrared Sounder project

AMSU

Advanced Microwave Sounding Unit

ARSA

Analyzed RadioSoundings Archive

C3S

Copernicus Climate Change Service

CCI

Climate Change Initiative

CDR

Climate Data Record

CM SAF

EUMETSAT Satellite Application Facility on Climate Monitoring

COMBI

Combined global near-infrared (NIR) and microwave imager (MW) Total Column Water Vapour (TCWV) data record

cRMSD

bias corrected root-mean-square deviation

ECMWF

European Centre for Medium-Range Weather Forecasts

EDA

ERA5 – reduced resolution ten member ensemble

ERA5

ECMWF Re-Analysis 5

ESA

European Space Agency

EUMETSAT

European Organisation for the Exploitation of Meteorological Satellites

GCOS

Global Climate Observing System

GOME

Global Ozone Monitoring Experiment

GRUAN

GCOS Reference Upper-Air Network

HOAPS

Hamburg Ocean Atmosphere Parameters and Fluxes from Satellite Data

HRES

ERA5 – High Resolution

ICDR

Interim Climate Data Record

IMS

Infrared Microwave Sounding

ITCZ

Inter-Tropical Convergence Zone

MERIS

Medium Resolution Imaging Spectrometer Instrument

MMW

Merged Microwave

MODIS

Moderate Resolution Imaging Spectroradiometer

MWI

Microwave Imager

NASA

National Aeronautics and Space Administration

NIR

Near Infra Red

OLCI

Ocean and Land Colour Instrument

PMF

Penalised Maximal F

PVP

Product Validation Plan

REMSS

Remote Sensing System project

RSD

Robust Standard Deviation

SNH

Standard Normal Homogeneity

SuomiNet

Global ground based GPS network (named after Verner Suomi)

TCDR

Thematic Climate Data Record

TCWV

Total Column Water Vapour

WV_cci

Water Vapour climate change initiative

List of tables

Table 1-1: List of the satellite instruments, which are used for generating the CDR-2 dataset.

Table 2-1: List of all datasets used for validation or comparison

Table 3-1: Product requirements for the Water Vapour TCWV TCDR as defined by CM SAF

Table 4-1: Stability estimates for CDR-1 and CDR-2

Table 4-2: Summary of statistics and uncertainty assessment for CDR-1 and CDR-2

List of figures

Figure 1-1: Instruments used for TCWV WV_cci/CM SAF (COMBI) product.

Figure 4-1: Spatial maps of bias (left) and cRMSD (right)

Figure 4-2: Timeseries of bias (top) and cRMSD (bottom) over global ice-free ocean surfaces

Figure 4-3: Timeseries of bias (top) and cRMSD (bottom) over global surfaces for different reference and comparison datasets

Figure 4-4: Timeseries of bias (top) and cRMSD (bottom) over global land and ocean surfaces without coast, sea ice and inland water bodies

Figure 4-5: Scatter plots of daily TCWV against TCWV of Merged Microwave over ice-free ocean (top) and against AIRS over global surfaces (bottom)

Figure 4-6: Timeseries of bias for CDR-2 against Merged Microwave over global ice-free ocean surfaces

Figure 4-7: Timeseries of bias for CDR-2 against AIRS (top) and ERA5 (bottom) over global surfaces

Figure 4-8: Immler inequation plotted for consistency analysis between CDR-2 and Merged Microwave over ice-free ocean

General Definitions

Table 1: Definition of statistical metrics

Statistical metrics

Definition

Bias

An estimate of the systematic error/uncertainty arising from systematic effects. Here, the bias is typically estimated as the systematic (mean) difference to a consensus reference.

cRMSD

Bias corrected root-mean-square difference – an estimate of the standard uncertainty. It is noted that the RMSD also includes the uncertainty of the reference observations. Only if the reference uncertainty is well defined, and of an order of magnitude, or more, smaller than the uncertainty of the retrieved or gridded values, can it be considered as an estimate of the standard uncertainty.

Immler Inequation

Immler inequation assesses the consistency of a CDR and a reference dataset. It is analysed by comparing the difference of CDR and reference dataset to their associated uncertainty estimates (Immler et al., 2010).

Optimal requirement

This is a level where a product is considered to perform much better than expected given the current knowledge.

PMF

The Penalized Maximal F (PMF) test detects undocumented mean shifts that are not accompanied by any sudden change in the linear trend of a time series (Wang, 2008a, b).

RSD

The robust standard deviation (RSD) is a measure of spread which, unlike the standard deviation, is not strongly affected by extremely high or low values.

Round Robin

A round robin is an arrangement in which all the items in a group are selected evenly in a rational order, usually from the beginning to the end of a list and then again at the beginning of the list, and so on. In this specific case the validation process follows a defined flow with e.g. looking for overlap periods/matchups, storage of matches in a database and calculation of differences for matches in space and time.

SNH

The standard normal homogeneity test is a method to identify inhomogeneities in a time series by comparing it with a homogeneous reference dataset.

Stability

Stability is the time rate at which systematic errors in the CDR change (Merchant et al., 2017). Here, the stability is assessed as the change of the bias over time, usually per decade.

Target requirement

This is the main quality goal for a product. It should reach this level based on the current knowledge on what is reasonable to achieve.

Threshold requirement

A product should at least fulfil this level to be considered useful at all. Sometimes the term “Breakthrough” is used instead.

Uncertainty due to representativeness

Part of the consistency equation between the CDR and the reference data, in which the representativeness uncertainty σ is one of several terms. The term is usually not known but is assumed to be small. Because gridded data records are validated, the representativeness error is neglected. Using station-based data instead of gridded data records is not recommended, as it could lead to an underestimation of the total uncertainty.

Table 2: Definition of variables and CDRs

Variables and CDR

Definition

TCWV

Total Column Water Vapour is the integrated mass of gaseous water in the total column of the atmosphere over an area of 1 m²

CDR-1

Contains daily and monthly TCWV for the global land areas and is a combination of MERIS, MODIS and HOAPS available from 2002 – 2017 (equals CDR-2 land only).

CDR-2

Combination of WV_cci CDR-1 with EUMETSAT CM SAF HOAPS (microwave imager based) data records (global coverage) available from 2002 – 2017.

Table 3: Definition of further terms

Term

Definition

Brokered product

The C3S Climate Data Store (CDS) provides both data produced specifically for C3S and so-called brokered products. The latter are existing products produced under an independent programme or project which are made available through the CDS.

Level-2

Retrieved TCWV variable at full input data resolution, thus with the same resolution and location as the sensor measurements.

Level-3

TCWV of Level-2 orbits of one single sensor combined (averaged) on a global spatial grid.

Scope of the document

This document is the Product Quality Assurance Document (PQAD) for the combined global near-infrared (NIR) and microwave imager (MW) Total Column Water Vapour (TCWV) data record (COMBI). It provides a description of the product validation methodology and presents the validated products and the datasets used during the process.

This Climate Data Record (CDR) is a product brokered from the EUMETSAT Satellite Application Facility on Climate Monitoring (EUMETSAT CM SAF) service and corresponds to their CDR-2 dataset. This document refers extensively to the original Product Validation Plan (PVP v3.2) [D2] created within the Climate Change Initiative (CCI) Water Vapour project, as well as to the CM SAF validation report, “CM SAF Validation Report Combined high resolution global TCWV from microwave and near infrared imagers (COMBI)” [D1], which can be found on the CM SAF website (https://www.cmsaf.eu/EN/Home/home_node.html).

Executive summary

The project C3S2_312a_Lot1 includes brokering of a combined high-resolution global Total Column Water Vapour data record from microwave and near-infrared imagers (COMBI) from the EUMETSAT CM SAF.

The COMBI product combines near-infrared (NIR)-based retrievals over land, coasts and sea-ice with CM SAF HOAPS data over the open ocean. The NIR algorithm and the combination of NIR- and HOAPS-based products have been (further) developed and implemented as part of the ESA CCI Water Vapour project (WV_cci). The COMBI product corresponds to the Climate Data Record CDR-2 from the WV_cci project.

The CDR-2 covers the time period July 2002 to December 2017.

In the scope of the validation activity, the accuracy and stability of the product have been assessed with the same method as within the CM SAF validation activity for total column water vapour (2002-2017) [D1]. A wide range of reference and comparison datasets (e.g. satellite, reanalysis) as well as different approaches have been used to assess the quality of the TCWV product, subdivided into different surface types. Section 3 gives details on the validation methods.

The general picture of the validation (Section 4) presented in this document shows that CDR-2 fulfils the threshold product requirements and frequently meets the target product requirements (based on Table 3-1). The quality is reduced over inland water bodies, sea-ice and coastal regions.

1. Validated products

The validation includes the Climate Data Record CDR-2 from the ESA CCI Water Vapour project (WV_cci) datasets, which contains gridded monthly and daily time series of TCWV in units of kg/m² each with a spatial resolution of 0.05° and 0.5° that cover the global land and ocean areas. The COMBI product covers the period July 2002 to December 2017, and combines WV_cci CDR-1 with EUMETSAT CM SAF HOAPS (microwave imager based) data records (see Table 1-1 and Figure 1-1 for summary). This combination corresponds to the CDR-2 dataset which is released by EUMETSAT CM SAF.

Table 1-1: List of the satellite instruments which are used for generating the CDR-2 dataset. The data from MERIS/MODIS/OLCI result in the dataset CDR-1. The combination with EUMETSAT CM SAF HOAPS leads to the TCWV COMBI product.

| Instrument name | Spatial resolution | Temporal resolution | Temporal coverage | Sensors | Spatial coverage | CDR |
|---|---|---|---|---|---|---|
| MERIS | 1200 m | Global coverage in 3 days | 2002 - 2012 | NIR | Land, coastal areas and sea-ice | WV_cci (CDR-1) |
| MODIS | 1 km | Daily global coverage | 2012 - 2016 | NIR | Land, coastal areas and sea-ice | WV_cci (CDR-1) |
| OLCI | 300 m, 1200 m | Global coverage in 2 days | 2016 onwards | NIR | Land, coastal areas and sea-ice | WV_cci (CDR-1) |
| SSM/I | 25 km | Daily global coverage | 1987 - 2003 | MWI | Global ice-free ocean | CM SAF HOAPS |
| SSMIS | 25 km | Daily global coverage | 2003 - 2016 | MWI | Global ice-free ocean | CM SAF HOAPS |



Figure 1-1: Instruments used for TCWV WV_cci/CM SAF (COMBI) product. Bars indicate contribution to the product CDR-2. MERIS, MODIS, and OLCI provide data for land and sea ice, while HOAPS provides data for the global ice-free ocean. Note: Slightly changed from [D5, Figure 5-1].

Please note that this document only covers the metrics and validation of CDR-2, i.e. the combined product of WV_cci CDR-1 and CM SAF HOAPS (microwave imager based).

2. Description of validating datasets

The evaluation of TCWV data products and their related uncertainties has been carried out based on reference and intercomparison datasets as described in the Product Validation Plan (PVP v3.2) [D2].

TCWV measurements from SuomiNet stations are used for comparison with TCWV from MERIS, MODIS and OLCI over global land surfaces. The high number of stations (ca. 150), the availability of the time series since 2005 and the high temporal resolution of the station data make this a good reference dataset. The data is also used for a round robin (more information in PVP v3.2, Chapter 4.1 [D2]).

The data of 12 Global Climate Observing System (GCOS) Reference Upper-Air Network (GRUAN) radiosonde sites are used for validation of the TCWV data over global land. The data is widely used and has been available since June 2005 and is therefore appropriate for validation.

The European Centre for Medium-Range Weather Forecasts (ECMWF) Re-Analysis 5 (ERA5) is a global reanalysis and is used for global validation, including land surfaces and ice-free ocean. The data is available from 1940 onwards with an hourly resolution and a spatial resolution of 0.25°/0.5° (HRES/EDA).

The Atmospheric Infrared Sounder (AIRS) is located on NASA’s Aqua satellite and its data is also used for global validation. The dataset is available from 2002 – 2016 and provides monthly and daily data. The spatial resolution is 1°.

The HOAPS and MERIS dataset is available from 2003 to 2012 and is used for global comparison with its successor product. The dataset provides monthly data and the spatial resolution is 0.5°.

The merged microwave climate product is created within Remote Sensing Systems (REMSS) and provides total column water vapour data over the ocean. It includes the time series from 1988 to present on a monthly basis and a spatial resolution of 1°. The data is used for comparison over global ice-free ocean.

The Global Ozone Monitoring Experiment (GOME) instrument measured total column water vapour and is used for validation over global and over ice-free ocean areas. The data is available from 1995 – 2015 on a monthly basis with a 1° resolution.

The Infrared Microwave Sounding (IMS) dataset provides vertical profiles of water vapour and is available from 2007 – 2016. It is used for comparison with TCWV data globally and over ice-free ocean areas, and it is also used for the round robin effort.

The validation was carried out against the reference and comparison data records as listed in Table 2-1. In addition, the following datasets are used for the spatial assessment of biases and cRMSD:

  • Merged Microwave by Remote Sensing System project (REMSS MMW),

  • ERA5, AIRS (including AMSU) version 6,

  • the predecessor version to the WV_cci TCWV CDR-2 by C3S, GOME Evolution (GOME), and

  • IMS.

Table 2-1: List of all datasets used for validation or comparison, where orange highlighted datasets were used for global land surfaces (CDR-1), red were used for global coverage (CDR-1 and -2), whereas blue datasets were used for global and global ice-free ocean areas.

| Name | Temporal coverage / resolution | Spatial coverage / resolution | Area of validation |
|---|---|---|---|
| GRUAN | 2005/06 – present / twice daily | 12 stations | Global land surfaces (CDR-1) |
| SuomiNet | 2005 – present / up to half-hourly | Up to 150 stations | Global land surfaces (CDR-1) |
| ERA5 | 1940 – present / 8-24x daily | Global, 0.25° / 0.5° (HRES/EDA) | Global (all) (CDR-2) |
| AIRS | 2002 – 2016 / daily, monthly | 95% global / 1° | Global (all) (CDR-2) |
| HOAPS + MERIS | 2003 – 2012 / monthly | Global / 0.05°, 0.5° | Global (all) (CDR-2) |
| Merged Microwave | 1988 – present / monthly | Global ice-free ocean / 1° | Global coverage and global ice-free ocean areas (CDR-2) |
| GOME Evolution Climate | 1995 – 2015 / monthly | Global / 1° | Global coverage and global ice-free ocean areas (CDR-2) |
| IMS | 2007/06 - 2016 / daily, monthly | Global / 0.25° | Global coverage and global ice-free ocean areas (CDR-2) |


3. Description of product validation methodology

In this section an overview of the product validation methodology is given. A detailed explanation can be found in Chapter 3 in PVP v3.2 [D2].

3.1 Round Robin

For the validation of Level 2 TCWV data a round robin was applied to decide which of the instrument data records are used for the generation of the TCWV climate data record. The round robin is further described in Chapter 4.1.1 in PVP v3.2 [D2].

3.2 Comparisons to reference

This section provides an overview of the approaches, methods and metrics used for the comparisons to reference data, such as Merged Microwave (from REMSS), AIRS, ERA5 and GOME Evolution Climate.

One part of the validation is the generation of climatological spatial maps and spatially averaged time series of bias (Equation 3.1) and cRMSD (Equation 3.2). The bias is estimated as:

\[ b(x,y) = \frac{1}{N} \sum\limits_{i=1}^{N} (x_i - y_i) \; (Eq. 3.1) \]

where N is the number of valid collocations, \( x_i \) represents the WV_cci TCWV data and \( y_i \) the corresponding reference data, with i=1,...,N.

The bias corrected root-mean-square deviation cRMSD is calculated as: 

\[ cRMSD(x,y) = \sqrt{\frac{1}{N} \sum\limits_{i=1}^{N} (x_i - y_i -b)^2} \; (Eq. 3.2) \]
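As an illustration of Equations 3.1 and 3.2, the following minimal Python sketch computes the bias and the cRMSD from two lists of collocated values. The function names and the sample values are hypothetical and are not taken from the validation software; collocation and screening of invalid values are assumed to have been done beforehand.

```python
import math


def bias(x, y):
    """Mean difference between CDR values x and reference values y (Eq. 3.1)."""
    if len(x) != len(y) or not x:
        raise ValueError("x and y must be non-empty and of equal length")
    return sum(xi - yi for xi, yi in zip(x, y)) / len(x)


def crmsd(x, y):
    """Bias-corrected root-mean-square deviation (Eq. 3.2)."""
    b = bias(x, y)
    return math.sqrt(sum((xi - yi - b) ** 2 for xi, yi in zip(x, y)) / len(x))


# Hypothetical collocated TCWV values in kg/m^2 (illustrative only).
cdr = [20.0, 31.0, 45.0, 12.0]
ref = [19.0, 30.0, 43.0, 12.0]

print("bias  =", bias(cdr, ref))
print("cRMSD =", crmsd(cdr, ref))
```

Because the bias is subtracted before squaring, the cRMSD measures only the scatter of the differences around their mean.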

The analysis of homogeneity in CDR-2 includes the results from the Penalised Maximal F (PMF, Wang, 2008a, b) and the Standard Normal Homogeneity (SNH, Reeves et al., 2007) tests. The annual cycle has been removed from the monthly anomaly CDR and comparison datasets, and the difference of both anomalies is used as input to the test. Potential break points might be visible in the anomaly difference time series (further information in Chapter 5 [D1]).
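The preparation of the test input described above can be sketched as follows. This is a simplified illustration, not the procedure used in [D1]: it assumes gap-free monthly series, removes the annual cycle by subtracting the mean of each calendar month, and does not implement the PMF or SNH tests themselves. All names and the synthetic data are hypothetical.

```python
import math


def monthly_anomalies(values, start_month=1):
    """Remove the mean annual cycle from a gap-free monthly series.

    values: monthly means; start_month: calendar month (1-12) of values[0].
    """
    sums = [0.0] * 12
    counts = [0] * 12
    for i, v in enumerate(values):
        m = (start_month - 1 + i) % 12
        sums[m] += v
        counts[m] += 1
    climatology = [s / c if c else 0.0 for s, c in zip(sums, counts)]
    return [v - climatology[(start_month - 1 + i) % 12]
            for i, v in enumerate(values)]


def anomaly_difference(cdr, ref, start_month=1):
    """Difference of CDR and reference anomalies (input to a homogeneity test)."""
    if len(cdr) != len(ref):
        raise ValueError("series must have equal length")
    a_cdr = monthly_anomalies(cdr, start_month)
    a_ref = monthly_anomalies(ref, start_month)
    return [a - b for a, b in zip(a_cdr, a_ref)]


# Synthetic two-year example: both series share a seasonal cycle, and the
# "CDR" has a hypothetical 1 kg/m^2 step at the start of the second year.
season = [25.0 + 5.0 * math.sin(2.0 * math.pi * m / 12.0) for m in range(12)]
ref = season + season
cdr = season + [s + 1.0 for s in season]

diff = anomaly_difference(cdr, ref)
print([round(d, 2) for d in diff])
```

In this synthetic case the seasonal cycle cancels and the step remains visible as a jump in the anomaly difference, which is the kind of signal a break-point test looks for. A series starting in July would use `start_month=7`.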

The temporal stability β (Equation 3.3) is defined as the change of the bias over time. It can be estimated by linear regression analysis of the time series of differences between de-seasonalised datasets. It is typically defined as change in the monthly bias per decade:

\[ \beta = \frac{d}{dt} b \; (Eq. 3.3) \]
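A minimal sketch of such a stability estimate is given below. It assumes a gap-free, evenly spaced monthly bias series that has already been de-seasonalised, and uses an ordinary least-squares slope; the uncertainty of the slope is not computed. The function name and the synthetic data are hypothetical.

```python
def stability_per_decade(monthly_bias):
    """Ordinary least-squares slope of a monthly bias series, per decade."""
    n = len(monthly_bias)
    if n < 2:
        raise ValueError("at least two monthly values are required")
    t = [i / 120.0 for i in range(n)]  # time in decades (120 months each)
    t_mean = sum(t) / n
    b_mean = sum(monthly_bias) / n
    sxx = sum((ti - t_mean) ** 2 for ti in t)
    sxy = sum((ti - t_mean) * (bi - b_mean) for ti, bi in zip(t, monthly_bias))
    return sxy / sxx


# Synthetic 15-year bias series (kg/m^2) with a known drift of 0.2 per decade.
series = [0.1 + 0.2 * (i / 120.0) for i in range(180)]
print("stability [kg/m^2/decade]:", round(stability_per_decade(series), 6))
```

Expressing time in decades directly gives the slope in the units used for the stability requirements (kg/m²/decade).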

The consistency (Equation 3.4) of a CDR and a reference dataset is analysed by comparing the difference between the CDR and the reference to their associated uncertainty estimates (Immler et al., 2010) (see Figure 4-6, Figure 4-7 and Table 4-2). The consistency of the uncertainties is assessed considering all valid (daily) collocations on a monthly basis. Denoting the standard uncertainties (𝑢) in the CDR and reference data by \( u_{x_i} \) and \( u_{y_i} \), respectively, and the uncertainty due to representativeness by σ:

\[ |x_i - y_i| < k\sqrt{ \sigma^2 + u_{x_i}^2 + u_{y_i}^2 } \; (Eq. 3.4) \]

where k is the so-called coverage factor, with 𝑘=1, 2 or 3 corresponding to significance levels of 68%, 95% and 99%, respectively. If the condition expressed in Equation 3.4 is not satisfied, this is an indication that the total uncertainty is underestimated or overestimated. If Equation 3.4 is valid for a single pair of values \( (x_i - y_i) \) with 𝑘≤1, then the CDR and the reference are consistent at a significance level of 32%. They are also considered to be consistent if Equation 3.4 is valid in 68% of the cases of multiple value pairs \( (x_i - y_i) \) with 𝑘≤1. In practice, the reference measurement uncertainty and the uncertainty from representativeness σ are not always known. Because gridded data records are validated, the representativeness error is assumed to be small, i.e. σ=0.
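The consistency check of Equation 3.4 can be sketched in a few lines of Python. The snippet below evaluates the inequality for each collocation and returns the fraction of pairs that satisfy it for a given coverage factor k, with σ=0 as the default. Function names and sample values are hypothetical and serve only to illustrate the calculation.

```python
import math


def is_consistent(x, y, u_x, u_y, k=1.0, sigma=0.0):
    """True if |x - y| < k * sqrt(sigma^2 + u_x^2 + u_y^2) (Eq. 3.4)."""
    return abs(x - y) < k * math.sqrt(sigma ** 2 + u_x ** 2 + u_y ** 2)


def consistent_fraction(xs, ys, u_xs, u_ys, k=1.0, sigma=0.0):
    """Fraction of collocations that satisfy Eq. 3.4 for coverage factor k."""
    flags = [is_consistent(x, y, ux, uy, k, sigma)
             for x, y, ux, uy in zip(xs, ys, u_xs, u_ys)]
    return sum(flags) / len(flags)


# Hypothetical collocations and standard uncertainties in kg/m^2.
xs = [20.0, 30.0, 40.0, 10.0]
ys = [20.5, 28.5, 43.0, 10.2]
u_xs = [0.6, 0.8, 1.0, 0.3]
u_ys = [0.4, 0.6, 1.0, 0.2]

for k in (1, 2, 3):
    print("k =", k, "fraction consistent:",
          consistent_fraction(xs, ys, u_xs, u_ys, k))
```

For well-estimated uncertainties the fraction at k=1 would be expected to be close to 68%; markedly higher values suggest overestimated uncertainties and markedly lower values suggest underestimated ones.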

Propagation of the uncertainties is carried out according to Equation 3.5 in Stengel et al. (2017):

\[ \sigma_{\langle x \rangle}^2 = \frac{1}{N} \sigma_{true}^2 + c \langle \sigma_i \rangle^2 + (1-c)\frac{1}{N} \langle \sigma_i^2 \rangle \; (Eq. 3.5) \]

where N is the number of valid values, \( \sigma_{true} \) is the natural variability of the observed geophysical variable, \( \langle \sigma_i \rangle \) is the mean and \( \langle \sigma_{i}^2 \rangle \) the mean of squares of the uncertainty. The factor c is the uncertainty correlation. The CDRs contain the standard deviation, \( \langle \sigma_i \rangle \), \( \langle \sigma_{i}^2 \rangle \), the number of valid observations, the number of valid hours (over ocean only) and the number of valid days; these are utilised during validation. The scaling of the uncertainties with N is discussed further within the CM SAF Validation Report [D1] by using the number of valid observations and the number of valid hours/days. Then, an uncertainty correlation of c=0 is applied. This is the current best estimate of the propagated uncertainty. The impact of different values of c (see Equation 3.5) will be demonstrated as well within the consistency analysis.
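The effect of the correlation factor c in Equation 3.5 can be illustrated with the short sketch below. It assumes that the last term uses the mean of squares of the individual uncertainties, in line with the definition of \( \langle \sigma_{i}^2 \rangle \) given in the text. The function name and the numbers are hypothetical.

```python
import math


def propagated_uncertainty(sigmas, sigma_true, c=0.0):
    """Uncertainty of an average of N values following the form of Eq. 3.5.

    sigmas: individual uncertainties; sigma_true: natural variability;
    c: uncertainty correlation (0 = uncorrelated, 1 = fully correlated).
    """
    n = len(sigmas)
    mean_sigma = sum(sigmas) / n                   # <sigma_i>
    mean_sigma_sq = sum(s * s for s in sigmas) / n  # <sigma_i^2>
    variance = (sigma_true ** 2 / n
                + c * mean_sigma ** 2
                + (1.0 - c) * mean_sigma_sq / n)
    return math.sqrt(variance)


# Hypothetical example: four values, each with 1 kg/m^2 uncertainty.
sigmas = [1.0, 1.0, 1.0, 1.0]
for c in (0.0, 0.5, 1.0):
    print("c =", c, "->", round(propagated_uncertainty(sigmas, 2.0, c), 3))
```

With c=0 the uncertainty contribution shrinks with 1/N, whereas with c=1 it does not average out at all, so the choice of c has a direct impact on the consistency analysis.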

The level of significance of compliance between achieved quality (bias, cRMSD, (total) uncertainty and stability) and requirement is analysed. The probability that the stability is smaller than a requirement is computed by integrating the Gaussian noise distribution using the 1-sigma noise level (either from regression or from standard deviation of the bias) within limits defined by the requirement. It gives the coverage probability of the stability or bias being within the requirement. Based on this, the p-value can be computed. The null hypothesis is that the stability or bias is outside the requirement and the alternative hypothesis is that the stability or bias is smaller than the requirement. The null hypothesis needs to be rejected if the coverage probability is larger than 95% (or 𝑝<0.05) (Loew et al., 2017).
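One possible reading of this compliance test is sketched below. It assumes a Gaussian distribution centred on the estimated stability (or bias) with the 1-sigma noise level as standard deviation, and a symmetric requirement interval of ±requirement; these are assumptions made for illustration and the exact implementation in [D1] may differ. The function names and input values are hypothetical.

```python
import math


def normal_cdf(z):
    """Cumulative distribution function of the standard normal distribution."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def coverage_probability(estimate, noise, requirement):
    """Probability mass of N(estimate, noise^2) within [-requirement, +requirement]."""
    upper = (requirement - estimate) / noise
    lower = (-requirement - estimate) / noise
    return normal_cdf(upper) - normal_cdf(lower)


def is_significantly_compliant(estimate, noise, requirement, level=0.95):
    """True if the coverage probability exceeds the chosen level (p < 1 - level)."""
    return coverage_probability(estimate, noise, requirement) > level


# Hypothetical stability estimates in kg/m^2/decade against a 0.2 requirement.
for est, noise in ((0.0, 0.1), (0.15, 0.1)):
    p_cov = coverage_probability(est, noise, 0.2)
    print(est, noise, round(p_cov, 3), is_significantly_compliant(est, noise, 0.2))
```

An estimate can therefore lie inside the requirement and still not be significantly compliant when its noise level is large relative to the remaining margin.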

The total uncertainty \( \sigma_{total} \) is the sum of the uncertainty of the CDR, of the reference and of representativeness.

The robust standard deviation (𝑅𝑆𝐷) is defined as:

\[ RSD = 1.48 \cdot median(|(x_i - y_i) - median(x_i - y_i)|) \; (Eq. 3.6) \]

and the bias-corrected 𝑅𝑆𝐷 is:

\[ RSD_{bias} = \sqrt{(RSD^2 + (median(x_i - y_i))^2)} \; (Eq. 3.7) \]
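The following sketch illustrates Equations 3.6 and 3.7. It assumes the usual median-absolute-deviation form, i.e. the absolute value is taken of the deviation of each difference from the median difference before the outer median is applied. Function names and sample values are hypothetical.

```python
import math
import statistics


def rsd(x, y):
    """Robust standard deviation of the differences x - y (Eq. 3.6)."""
    d = [xi - yi for xi, yi in zip(x, y)]
    med = statistics.median(d)
    return 1.48 * statistics.median([abs(di - med) for di in d])


def rsd_bias(x, y):
    """RSD combined with the median difference (Eq. 3.7)."""
    d = [xi - yi for xi, yi in zip(x, y)]
    return math.sqrt(rsd(x, y) ** 2 + statistics.median(d) ** 2)


# Hypothetical differences with one extreme outlier (kg/m^2).
x = [0.0, 1.0, 2.0, 3.0, 100.0]
y = [0.0, 0.0, 0.0, 0.0, 0.0]

print("RSD      =", rsd(x, y))
print("RSD_bias =", round(rsd_bias(x, y), 3))
print("std dev  =", round(statistics.pstdev(x), 3))
```

The single outlier inflates the ordinary standard deviation considerably, whereas the RSD remains close to the spread of the bulk of the data.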

Finally, 2-D scatter plots between CDR-2 and Merged Microwave (ice-free ocean), as well as AIRS (global), are analysed with a focus on outliers and peaks in the distribution. Refined spatio-temporal analysis can be carried out where and when spurious behaviour is observed. It is noted that at least the uncertainty estimates of the Merged Microwave product are incomplete and not validated. Also, these data records are defined here as reference, which shall not be interpreted as an indication of superior quality. Thus, the validation on the basis of these uncertainty estimates is carried out for completeness, and the associated results might not be considered as a validation (similar for CDR-1).

The results of the analysis (see Table 4-2) are compared to the values of the product requirements (Table 3-1).

Table 3-1: Product requirements for the Water Vapour TCWV TCDR as defined by CM SAF (see Table 7-2 in [D1]).

| Category | Bias [kg/m²] | cRMSD [kg/m²] | Stability (bias trend) [kg/m²/decade] |
|---|---|---|---|
| Threshold | 3.0 | 5.0 | 0.70 |
| Target | 1.0 | 3.0 | 0.20 |
| Optimal | 0.3 | 0.3 | 0.08 |
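To illustrate how a validation result can be compared with the requirements of Table 3-1, the sketch below returns the strictest requirement level that a value meets. The numerical limits are those of Table 3-1; comparing the absolute value with a less-than-or-equal test is an assumption made here, and the significance of compliance is not considered. The function name is hypothetical.

```python
# Requirement limits from Table 3-1 (bias and cRMSD in kg/m^2,
# stability in kg/m^2/decade).
REQUIREMENTS = {
    "bias": {"threshold": 3.0, "target": 1.0, "optimal": 0.3},
    "crmsd": {"threshold": 5.0, "target": 3.0, "optimal": 0.3},
    "stability": {"threshold": 0.70, "target": 0.20, "optimal": 0.08},
}


def requirement_level(metric, value):
    """Return the strictest requirement level met by |value|, or 'none'."""
    limits = REQUIREMENTS[metric]
    magnitude = abs(value)
    for level in ("optimal", "target", "threshold"):
        if magnitude <= limits[level]:
            return level
    return "none"


# Values quoted elsewhere in this document.
print(requirement_level("bias", -1.7))       # CDR-1 against ARSA
print(requirement_level("bias", 0.9))        # CDR-2 ice-free ocean against AIRS
print(requirement_level("crmsd", 3.32))      # CDR-2 land against GOME Evolution
print(requirement_level("stability", 0.18))  # ice-free ocean, Merged Microwave
```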

4. Summary of validation results

This section provides a summary of validation results for TCWV from the combined product. The results, including the validation of CDR-1 (equals CDR-2 land only), are described in more detail in the Validation Report of CM SAF [D1]. Here, the focus is on the combined product CDR-2. Please note that this section strongly refers to the Validation Report of CM SAF [D1], in which detailed results can be found. A brief summary of the results is given in the following subsections.

4.1 Spatial comparison between CDR and comparison data records

One part of the Validation Report [D1] focuses on the validation of the final, gridded Level 3 products from WV_cci: TCWV over land surfaces (CDR-1) and over land and ocean surfaces, i.e. globally (CDR-2). CDR-2 covers global land and ocean areas as well as coastal and sea-ice regions. The analysis is divided into different surface types and conditions: land, ocean, sea ice, coast, different amounts of cloud cover over land, and regions with heavy precipitation over the ocean. Different statistics were used depending on the surface type. Please note that CDR-1 corresponds to CDR-2 over land.

Figure 4-1 shows the spatial distribution of bias and cRMSD between CDR-2 and the reference and comparison data records. The Merged Microwave dataset (REMSS) is only available over the global ice-free ocean (all other areas are white in the plot). Distinct spatial patterns are visible over the Inter-Tropical Convergence Zone (ITCZ), the storm track regions and rain forests. The observed slightly dry and slightly wet biases are due to global cloud patterns. A wet (positive) bias (blue) means that the CDR contains more water vapour than the reference data; a dry (negative) bias (red) means the opposite. The overall bias is negative over land and positive over the ocean, except in the comparison with the Merged Microwave product. The cRMSD has its maximum in the ITCZ, and a land/sea contrast is observed. The largest bias is found over the ocean relative to IMS, and the largest cRMSD is found over the ITCZ and the storm track regions relative to GOME Evolution. Overall, the bias is small relative to the Merged Microwave product, C3S and AIRS, and generally larger relative to GOME and IMS. Over sea-ice regions, a distinct bias relative to ERA5 and GOME and a distinct cRMSD relative to ERA5, AIRS and GOME can be seen. The bias over inland water bodies often has opposite signs depending on the predominant sensor. Therefore, a global map of water bodies from the ESA CCI Land Cover project was used to identify them.

Figure 4-1: Spatial maps of bias (left) and cRMSD (right) of CDR-2 against reference and intercomparison datasets: a) ERA5, b) Merged Microwave (MMW), c) C3S precursor to WV_cci CDR-2, d) AIRS with AMSU version 6, e) GOME Evolution, f) IMS. Missing data is indicated in white in the cRMSD maps.

4.2 Comparison of timeseries between CDR and reference

In this section, the validation of the final CDR-2 fv3.2 Level 3 data is carried out separately for ice-free ocean and global surfaces. The comparisons are carried out against Merged Microwave, AIRS, C3S and ERA5 over global ice-free oceans and against AIRS, C3S and ERA5 over global surfaces (see PVP v3.2 [D2]).

4.2.1 Global and global ice-free ocean surfaces

Figure 4-2: Timeseries of bias (top) and cRMSD (bottom) over global ice-free ocean surfaces.

Figure 4-2 shows the time series of the bias and cRMSD relative to the reference and comparison datasets over global ice-free oceans. Here, only the microwave-based HOAPS data has been considered. The bias is about 0.0 kg/m² relative to Merged Microwave and C3S, and about 0.75 kg/m² relative to AIRS and ERA5. The cRMSD is around 0.8 kg/m², except for AIRS (1.1 kg/m²), and is relatively constant throughout the time series.

Figure 4-3: Timeseries of bias (top) and cRMSD (bottom) over global surfaces for different reference and comparison datasets.

Figure 4-3 shows the time series of the bias and cRMSD relative to the reference and comparison datasets over global surfaces. The bias is close to 0.0 kg/m² relative to ERA5 and C3S and about 0.6 kg/m² relative to AIRS and GOME Evolution.

4.2.2 Global ice-free ocean and land surfaces

As discussed in Section 4.1, the inland water bodies might lead to larger uncertainties. Also, coastal regions are challenging for the retrieval due to less reflected radiation in that area (ATBD part 1, v2.1 [D3]). Therefore, timeseries of bias and cRMSD have been analysed over global land and ocean surfaces without considering sea-ice, coasts and inland water bodies (see Figure 4-4). Compared to the global results in Figure 4-3 the bias is generally slightly larger. The cRMSD is generally smaller, except for GOME Evolution.

Figure 4-4: Timeseries of bias (top) and cRMSD (bottom) over global land and ocean surfaces without coast, sea ice and inland water bodies.

4.2.3 Correlation

Figure 4-5 shows scatter plots of daily CDR-2 TCWV against Merged Microwave over the global ice-free ocean (top) and against AIRS over global surfaces (bottom). Both plots show a high correlation with many data points around the one-to-one line. In the global comparison the spread is slightly wider. A few outliers are observed at small CDR-2 TCWV values, which might be due to the reduced quality of CDR-2 over sea-ice and coastal areas.

Figure 4-5: Scatter plots of daily TCWV (CDR-2 fv3.2) against TCWV of Merged Microwave over ice-free ocean (top) and against AIRS over global surfaces (bottom). The linear regression yields R² = 1.0 for Merged Microwave and R² = 0.99 for AIRS. Due to the large number of collocations, 100,000 data points were chosen randomly to provide the density colour scale. Other data points are light blue. Grey data points are associated with surface types “coast” and “sea ice”.

4.3 Analysis of stability and homogeneity

The analysis of the temporal stability is based on the change of the bias with time. Figure 4-6 and Figure 4-7 present the time series of the bias relative to AIRS and ERA5 for land-covered regions and globally, as well as relative to Merged Microwave for ocean areas. The vertical lines represent the break points determined in the homogeneity test. The stability values are provided in Table 4-1. The bias decreases over time for land areas (negative stability estimates) and increases over time for ocean areas (positive stability estimates). There is a positive bias on a global scale, caused by the dominance of the ocean.

Overall the stability of the clear-sky data exceeds the target requirement of 0.2 kg/m²/decade for the time series from July 2002 to March 2016, although not significantly.

Figure 4-6: Timeseries of bias for CDR-2 against Merged Microwave over global ice-free ocean surfaces. The linear fit shows the full overlapping period (red) and the period excluding the OLCI data (green). Dashed vertical lines mark break points detected by the PMF test.

Figure 4-7: The same as in Figure 4-6 but for CDR-2 against AIRS (top) and ERA5 (bottom) over global surfaces.

Table 4-1: Stability estimates for CDR-1 and CDR-2. The full period covers July 2002 – December 2017. Note that AIRS v6 covers the period September 2002 – September 2016. The recommended period covers July 2002 – March 2016. Marked in green is where the stability is significantly better than the target product requirement of 0.2 kg/m²/decade.

| Surface type | Reference dataset | Stability [kg/m²/decade] (full overlap period) | Stability [kg/m²/decade] (recommended period) |
|---|---|---|---|
| Land | AIRS | \( -0.39 \pm 0.09 \) | \( -0.35 \pm 0.09 \) |
| Land | ERA5 | \( -0.39 \pm 0.09 \) | \( -0.22 \pm 0.09 \) |
| Ice-free ocean | Merged Microwave | \( 0.18 \pm 0.04 \) | \( 0.18 \pm 0.05 \) |
| Global | AIRS | \( 0.08 \pm 0.06 \) | \( 0.09 \pm 0.08 \) |
| Global | ERA5 | \( 0.09 \pm 0.05 \) | \( 0.13 \pm 0.08 \) |
| Land | ERA5 clear-sky, 10 LT | \( -0.11 \pm 0.02 \) | \( 0.02 \pm 0.02 \) |

The analysis of the homogeneity is described in detail in Chapter 5 [D1] and Chapter 3 [D2].

4.4 Assessment of consistency/uncertainty

For the Merged Microwave reference dataset, an uncertainty estimate was extracted from the spread of the Merged Microwave ensemble containing 50 ensemble members. The standard deviation is used as the statistical measure for the spread. Figure 4-8 shows the results of the consistency analysis from CDR-2 against Merged Microwave over global ice-free ocean. Most of the values are below the k=1 line and all values are below the k=3 line, which indicates an overestimation of the uncertainties, except at small total uncertainties.

Figure 4-8: Immler inequality (Immler et al., 2010) plotted for the consistency analysis between CDR-2 and Merged Microwave over ice-free ocean. The middle of each bar indicates the mean value of the total uncertainty.
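The consistency check of Immler et al. (2010) treats two collocated measurements m1 and m2 with uncertainties u1 and u2 as consistent at level k if |m1 - m2| < k·sqrt(σ² + u1² + u2²), where σ accounts for the collocation mismatch. The sketch below computes the percentage of collocations that satisfy this inequality for k = 1, 2, 3. It assumes per-collocation uncertainties are available for both datasets and sets σ to zero by default; the function name and the synthetic data are hypothetical.

```python
import numpy as np

def consistency_percent(m1, m2, u1, u2, sigma=0.0, ks=(1, 2, 3)):
    """Percentage of collocations with |m1 - m2| < k * sqrt(sigma² + u1² + u2²)."""
    m1, m2, u1, u2 = (np.asarray(a, dtype=float) for a in (m1, m2, u1, u2))
    diff = np.abs(m1 - m2)
    u_total = np.sqrt(sigma ** 2 + u1 ** 2 + u2 ** 2)   # total uncertainty
    return {k: 100.0 * float(np.mean(diff < k * u_total)) for k in ks}

# Synthetic collocations with correctly specified Gaussian uncertainties, illustration only.
rng = np.random.default_rng(1)
n = 10_000
truth = rng.uniform(5.0, 60.0, size=n)     # "true" TCWV [kg/m²]
u_cdr = np.full(n, 0.8)                    # assumed CDR uncertainty
u_ref = np.full(n, 0.4)                    # assumed reference uncertainty
cdr = truth + rng.normal(0.0, u_cdr)
ref = truth + rng.normal(0.0, u_ref)

print(consistency_percent(cdr, ref, u_cdr, u_ref))
```

For Gaussian errors and correctly estimated uncertainties, roughly 68 %, 95 % and 99.7 % of the collocations are expected to pass for k = 1, 2 and 3. Markedly higher percentages point to overestimated uncertainties, markedly lower ones to underestimated uncertainties.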

4.5 Compliance with product requirements

The validation results of TCWV for all reference data records are shown in Table 4-2 (see also Table 7-1 in the CM SAF Validation Report [D1]) for each surface type. The entry named “Global land and ocean” means that regions over sea ice and coasts are also considered. Please note that, for completeness, the results for CDR-2 Land (CDR-1) are also listed in Table 4-2.

GOME Evolution provides only a standard deviation as uncertainty estimate, while the ARSA data do not contain any uncertainty estimate. Therefore, neither is considered in the consistency analysis (shown as n/a).

The largest absolute biases are found between CDR-1 and ARSA (-1.7 kg/m²) and between CDR-2 over ice-free ocean and AIRS (0.9 kg/m²). The largest cRMSD is observed between CDR-2 over global land and GOME Evolution (3.32 kg/m²). The smallest bias and cRMSD are found between CDR-2 over ice-free ocean and Merged Microwave and ERA5. The low quality of the results over coastal and sea-ice regions was to be expected (larger cRMSD on a global scale than in the global land and ocean analysis).
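For orientation, the bias and cRMSD statistics can be sketched as below. The sketch assumes that the bias is the mean difference (CDR minus reference) and that cRMSD is the centred, i.e. bias-corrected, root-mean-square difference. The sign convention, the function name and the numbers are assumptions for illustration; the authoritative definitions are those given in [D1].

```python
import numpy as np

def bias_and_crmsd(cdr, ref):
    """Mean difference (cdr - ref) and bias-corrected root-mean-square difference."""
    d = np.asarray(cdr, dtype=float) - np.asarray(ref, dtype=float)
    d = d[np.isfinite(d)]                       # ignore missing collocations
    bias = d.mean()
    crmsd = np.sqrt(np.mean((d - bias) ** 2))   # RMSD after removing the bias
    return bias, crmsd

# Hypothetical collocated TCWV values [kg/m²], illustration only.
cdr = np.array([21.0, 34.5, 12.2, 48.9, 27.3])
ref = np.array([20.4, 35.1, 11.0, 47.5, 27.9])

bias, crmsd = bias_and_crmsd(cdr, ref)
rmsd = np.sqrt(np.mean((cdr - ref) ** 2))
print(f"bias = {bias:.2f} kg/m2, cRMSD = {crmsd:.2f} kg/m2, RMSD = {rmsd:.2f} kg/m2")
```

With these definitions RMSD² = bias² + cRMSD², so the cRMSD isolates the random part of the difference from the systematic offset.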

The results of Table 4-2 are compared to the values of the product requirements (see Table 3-1 in Section 3). An overall good agreement is observed for the bias, except for CDR-1/ERA5, CDR-2 over ice-free ocean/AIRS and CDR-2 over land and ocean/GOME Evolution. The cRMSD meets all target requirements, except for the comparison with GOME Evolution. The target requirement for the consistency is fulfilled for CDR-2 over ice-free ocean, whereas for all other reference datasets the target requirements are not reached, even though the results are very close. Further discussion of the uncertainties can be found in Chapter 7 of [D1].

The threshold requirement for stability is fulfilled for all datasets. The target requirement is met mainly for CDR-2 over global surfaces and over ice-free ocean.
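The comparison with the stability target can be illustrated with the recommended-period values from Table 4-1. The sketch below only checks whether the absolute stability estimate is within the target of 0.2 kg/m²/decade. It makes no statement on statistical significance and does not cover the threshold requirement, whose value is not repeated in this section. The helper name is hypothetical.

```python
TARGET = 0.2  # target stability requirement [kg/m²/decade]

# (surface type, reference dataset, estimate, uncertainty), recommended period, Table 4-1
TABLE_4_1_RECOMMENDED = [
    ("Land", "AIRS", -0.35, 0.09),
    ("Land", "ERA5", -0.22, 0.09),
    ("Ice-free ocean", "Merged Microwave", 0.18, 0.05),
    ("Global", "AIRS", 0.09, 0.08),
    ("Global", "ERA5", 0.13, 0.08),
    ("Land", "ERA5 clear-sky, 10 LT", 0.02, 0.02),
]

def meets_target(estimate, target=TARGET):
    """True if the magnitude of the stability estimate is within the target."""
    return abs(estimate) <= target

for surface, reference, estimate, uncertainty in TABLE_4_1_RECOMMENDED:
    status = "within target" if meets_target(estimate) else "outside target"
    print(f"{surface:15s} {reference:22s} {estimate:+.2f} +/- {uncertainty:.2f}  {status}")
```

Taking the absolute value treats negative (land) and positive (ocean) drifts alike, which is an assumption about how the requirement is applied.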

Table 4-2: Summary of statistics and uncertainty assessment for CDR-1 and CDR-2. Green highlighted values mean the bias estimate is significantly smaller than the target product requirement. For consistency the green highlighted values mean it is close to the expected value (see also Table 7-1 in [D1]).

| CDR-1 & CDR-2 surface type | Reference dataset | Bias [kg/m²] | cRMSD [kg/m²] | \( \sigma_{total} \) [kg/m²] | \( RSD_{bias} \) [kg/m²] | Consistency (%) k=1 | Consistency (%) k=2 | Consistency (%) k=3 |
|---|---|---|---|---|---|---|---|---|
| Land | ERA5 | \( -0.70 \pm 0.37 \) | 2.11 | 0.50 | 1.57 | 31 | 55 | 70 |
| Land | AIRS | \( -0.10 \pm 0.27 \) | 1.78 | 0.48 | 1.10 | 81 | 97 | 99 |
| Land | SuomiNet | \( 0.10 \pm 0.93 \) | 2.85 | 0.73 | 2.0 | 25 | 50 | 100 |
| Land | GRUAN | \( -0.41 \pm 1.13 \) | 2.52 | 0.71 | 1.72 | 32 | 53 | 71 |
| Land | ARSA | \( -1.7 \pm 1.1 \) | 2.44 | n/a | n/a | n/a | n/a | n/a |
| Land | C3S | \( -0.10 \pm 0.13 \) | 1.67 | 0.55 | 0.54 | 55 | 83 | 94 |
| Ice-free ocean | Merged Microwave | \( -0.00 \pm 0.10 \) | 0.69 | 0.85 | 0.43 | 75 | 92 | 97 |
| Ice-free ocean | AIRS | \( 0.90 \pm 0.15 \) | 1.16 | 0.84 | 0.75 | 87 | 99 | 100 |
| Ice-free ocean | ERA5 | \( 0.72 \pm 0.08 \) | 0.63 | 0.86 | 0.63 | 63 | 92 | 97 |
| Ice-free ocean | C3S | \( -0.05 \pm 0.07 \) | 1.50 | 0.77 | 0.54 | 98 | 100 | 100 |
| Global | AIRS | \( 0.47 \pm 0.12 \) | 1.66 | 1.14 | 0.64 | 62 | 69 | 71 |
| Global | ERA5 | \( 0.10 \pm 0.10 \) | 1.86 | 1.18 | 0.60 | 50 | 65 | 69 |
| Global | C3S | \( 0.05 \pm 0.07 \) | 1.50 | 0.77 | 0.54 | 82 | 94 | 98 |
| Global | GOME Evl | \( 0.60 \pm 0.18 \) | 3.32 | n/a | n/a | n/a | n/a | n/a |
| Global land+ocean | AIRS | \( 0.60 \pm 0.11 \) | 1.49 | 0.75 | 0.63 | 85 | 98 | 100 |
| Global land+ocean | ERA5 | \( 0.23 \pm 0.10 \) | 1.50 | 0.76 | 0.56 | 52 | 80 | 88 |
| Global land+ocean | C3S | \( 0.06 \pm 0.08 \) | 1.12 | 0.77 | 0.50 | 84 | 95 | 98 |
| Global land+ocean | GOME Evl | \( 0.77 \pm 0.18 \) | 3.24 | n/a | n/a | n/a | n/a | n/a |

References

Immler, F. J., Dykema, J., Gardiner, T., Whiteman, D. N., Thorne, P. W., and Vömel, H.: Reference Quality Upper-Air Measurements: guidance for developing GRUAN data products, Atmos. Meas. Tech., 3, 1217-1231, doi:10.5194/amt-3-1217-2010, 2010.

Loew, A., Bell, W., Brocca, L., Bulgin, C., Burdanowitz, J., Calbet, X., Donner, R., Ghent, D., Gruber, A., Kaminski, T., Kinzel, J., Klepp, C., Lambert, J.C., Schaepman-Strub, G., Schröder, M., Verhoelst, T. (2017): Validation practices for satellite based earth observation data across communities. Rev. Geophys. 55 (3), 779-817, https://doi.org/10.1002/2017RG000562.

Merchant, C. J., Paul, F., Popp, T., Ablain, M., Bontemps, S., Defourny, P., Hollmann, R., Lavergne, T., Laeng, A., de Leeuw, G., Mittaz, J., Poulsen, C., Povey, A. C., Reuter, M., Sathyendranath, S., Sandven, S., Sofieva, V. F., and Wagner, W. (2017): Uncertainty information in climate data records from Earth observation, Earth Syst. Sci. Data 9, 511-527, https://doi.org/10.5194/essd-9-511-2017.

Reeves, J., Chen, J., Wang, X.L., Lund, R. and Lu, Q. (2007): A review and comparison of changepoint detection techniques for climate data. J. Appl. Meteorol. Climatol. 46, 900–915.

Stengel, M., Stapelberg, S., Sus, O., Schlundt, C., Poulsen, C., Thomas, G., Christensen, M., Henken, CC., Preusker, R., Fischer, J., Devasthale, A., Wille, U., Karlsson, KG., McGarragh, GR., Proud, S., Povey, AC., Meirink, JF., Feofilov, A., Bennartz, R., Bojanowski, JS. & Hollmann, R. (2017). Cloud property datasets retrieved from AVHRR, MODIS, AATSR and MERIS in the framework of the Cloud_cci project. Earth System Science Data, 9(2), 881-904.

Wang, X.L. (2008a): Penalized maximal F test for detecting undocumented mean shift without trend change. J. Atmos. Ocean. Technol. 25, 368–384.

Wang, X.L. (2008b): Accounting for autocorrelation in detecting mean shifts in climate data series using the penalized maximal t or F test. J. Appl. Meteorol. Climatol. 47, 2423–2444.

This document has been produced with funding by the European Union in the context of the Copernicus Climate Change Service (C3S), operated by the European Centre for Medium-Range Weather Forecasts on behalf of the European Union (Contribution Agreement signed on 22/07/2021). All information in this document is provided “as is” and no guarantee or warranty is given that the information is fit for any particular purpose. The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which is merely representing the author’s view.
