Contributors: Axel Andersson1, Tina Leiding1, Ludwig Lierhammer1, Richard Cornes2, Elizabeth Kent2, Joseph Siddons2, John Kennedy, Peter Thorne3, Corinne Voces3, Paul Poli6
Former contributors: Stavroula Biri2, Thomas Cropper2, David Berry2, Irene Pérez González2, Beatriz Recinos Rivas2, Kate Willet4, Chris Atkinson4, Eric Freeman5, Huai-min Zhang5
Issued by: NUIM / Peter Thorne
Date: 31/10/2025
Ref: C3S2_D311_Lot1.2.3.2.2025_Tenth_version_Marine_User_Guide
Marine_User_Guide_v10
Official reference number service contract: 2021/C3S2_311_Lot1_NUIM/ C2
1 Deutscher Wetterdienst (DWD) 2 National Oceanography Centre (NOC) 3 National University of Ireland Maynooth (NUIM) 4 Met Office UK 5 National Oceanic and Atmospheric Administration National Centers for Environmental Information (NOAA NCEI) 6 ECMWF
The service would like to acknowledge in kind contributions by NOC and NOAA NCEI and their valuable and continuing input and support to the scientific and technical development, improvement and production of the data sets released through this service.
The C3S2 311 Lot 1 GLAMOD (Global Land and Marine Observations Database) service is concerned with the provision of globally available land and marine surface meteorological records. The service includes inventorying of, and brokering access to, data sources, their harmonization (via conversion to a Common Data Model, merging, and quality assurance) and their provision via the Copernicus Climate Change Service Data Store (CDS).
This marine user guide describes all relevant aspects of the marine data service necessary for a user to access and work with the marine data appropriately and with confidence. This document does not constitute a technical service document which instead is available separately. There is also an accompanying land user guide that describes the land data and its processing.
This is a living document which shall be subject to regular revision to reflect the status of the service at any given point in time. Releases of this document will always accompany new data releases. On an exceptional basis, as warranted, additional releases may occur to clarify issues raised by users or document important changes between releases such as any modification to the modalities of data access. Feedback on the adequacy and completeness of the document is welcomed at any time. Feedback should be provided via the C3S helpdesk facility to allow C3S tracking of user input. All such feedback shall be passed on in full to the service team for due consideration.
Former versions are archived and available upon request. The version history is given below:
Version | Release Date | Release notes |
0.0 | 31/08/17 | Initial version consisting of section outlines and description of initial archiving material |
1.0 | 14/12/17 | Updates to reflect the status of the test release |
2.0 | 17/12/18 | Update to reflect status of beta release |
3.0 | 14/10/19 | Update for first data release |
4.0 | 21/07/20 | Update after second data release |
5.0 | 29/03/21 | Update after third data release |
6.0 | 03/08/21 | Update after fourth data release |
7.0 | 12/06/23 | Update after fifth data release |
8.0 | 30/08/23 | Update after sixth data release |
9.0 | 27/11/24 | Update after seventh data release |
10.0 | 31/10/25 | Update after eighth data release |
The C3S2 311 Lot 1 - Global Land and Marine Observations Database (GLAMOD) service provides access to in situ surface observations from various climate archives in a common data model. This document describes the marine component of the service, providing access to surface marine meteorological observations. This component is currently based on Release 3.0 and Release 3.0.2 of the International Comprehensive Ocean-Atmosphere Data Set (ICOADS, Freeman et al. [2017]; Liu et al. [2022]). Starting with release 7, we include reprocessed drifting buoy data provided by the C-RAID project (Zunino Rodriguez et al. [2025]) as an additional input data source.
The input data has been enhanced by:
The observations in ICOADS come from a range of observing platforms. Before 1980 these are primarily ships, with an increasing number of measurements made by sensors installed on moored and drifting buoys after this date. The ship observations are clustered over the shipping routes prevalent at the time of observation. Coverage by drifting buoys tends to be more dispersed but with lower sampling over regions of ocean upwelling. Moored buoy data, with the exception of the tropical arrays, tend to be more coastal and concentrated around North America and Europe.
Each meteorological report in ICOADS typically contains observations of multiple Essential Climate Variables (ECVs, e.g. Bojinski et al. [2014]) made at the same time and location, for example coincident measurements of the air temperature, humidity, wind speed and direction, sea surface temperature and sea level pressure. From the mid 2000s the record is dominated by reports from drifting buoys but these only report a small subset of the ECVs included in this service, typically only sea level pressure or sea surface temperature. Starting with this release, the full C-RAID drifting buoy dataset is included in this service. The C-RAID dataset contains fully reprocessed and scientifically checked data and metadata of about 20,000 drifting buoys since 1979. Currently the C-RAID data set includes only ARGOS type drifting buoys, missing Iridium type buoys from recent years. As the C-RAID data set replaces all drifting buoy data included in the ICOADS data set, this may lead to a partly lower data coverage compared to previous releases, which is expected to improve with future C-RAID data set versions.
The reporting frequency ranges from hourly or more frequent in the recent period to daily observations prior to 1860. In the intervening period observing frequency ranged from three times per day based on the watches on board ships to 3-hourly or 6-hourly observations.
Several sources of metadata have been used. For the period after 1982, metadata from WMO-No. 47, the "List of Selected, Supplementary and Auxiliary Ships" (e.g. Kent et al. [2007]), has been merged with the ship observations. Some ship observations prior to 1982 have also been merged with the WMO-No. 47 metadata, but this is hampered by a lack of ship callsigns within the data before this date. Additional metadata has been extracted from the ICOADS Supplemental Data (see Freeman et al. [2017]) and documentation available on the ICOADS website.
The C-RAID drifting buoy data set contains metadata that has been merged from sources provided by the NOAA AOML Global Drifter Program and the satellite service provider CLS (Collecte Localisation Satellites) (see Rannou et al. [2024] for details). Metadata included in the C-RAID data set has been merged into this data product, where applicable.
Apart from the summary information presented in this document, a detailed overview of the data contents for each of the input sources is available in the Marine Data Inventory (C3S2_D311_Lot1.2.3.1.2025_Updated Inventory for Marine datasets_v5).
Table 1 lists the Essential Climate Variables included in the current release and summarises the principal observing methods and changes over time for those ECVs. Early observations (before the 1950s) were typically recorded in whole units, for example whole degrees Celsius or Fahrenheit depending on national practice. In some cases the data were recorded to higher precision but truncated or rounded when digitised (e.g. Chan et al. [2019]) or shared operationally. This rounding for operational data streams persisted until the 1980s (e.g. Willett et al. [2008]). Later observations were typically recorded and shared with a precision of tenths or higher. As part of the processing to create ICOADS (and earlier datasets) the observations were converted to standardised units (°C for temperatures, hPa for pressure and m/s for speeds) and uniform resolution, typically to one or two decimal places. These have been further processed as part of the GLAMOD service and fully converted to S.I. units. Where recorded, the originally reported value and units of the observations can be traced back to the original input data. There are known issues with the marine meteorological observations; these are discussed further in Section 2. No corrections or adjustments have been applied as part of this service. Instead, the user is directed to the literature identified within Section 2.
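As an illustration of what the conversion to S.I. units means in practice, the following minimal sketch converts values back to the more familiar °C and hPa. It assumes that temperatures are delivered in kelvin and pressures in pascal; this should be confirmed against the units column of the downloaded data. The helper names are hypothetical and not part of the service.

```python
def kelvin_to_celsius(value_k: float) -> float:
    """Convert a temperature from kelvin to degrees Celsius."""
    return value_k - 273.15


def pascal_to_hectopascal(value_pa: float) -> float:
    """Convert a pressure from pascal to hectopascal."""
    return value_pa / 100.0


# Example values only; not taken from the data set.
air_temperature_c = kelvin_to_celsius(288.15)
sea_level_pressure_hpa = pascal_to_hectopascal(101325.0)
print(round(air_temperature_c, 2), sea_level_pressure_hpa)
```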
The temporal coverage of the data available in the current release spans the period 1850 - 2024; Figure 1 shows the coverage per ECV. Before the Brussels International Maritime Conference of 1853 the most common ECVs to be reported were: air temperature; wind speed / force and direction; and sea level pressure, with typically 100 - 200 observations per month. A small number of sea surface temperature and humidity observations are available (< 100). Following the Brussels conference there is a large increase in the number of observations to O(10,000) per month for all parameters excluding dew point temperature. Dew point observations only reach 10,000 per month on a routine basis around the time of the First World War. For all parameters, the number of observations increases over time to around 100,000 per month in the 1960s and remains at this level until the 1990s. From the mid 1990s there is a large increase in the number of sea level pressure and sea surface temperature observations from drifting buoys. The C-RAID drifting buoy data record currently ends in early 2019, resulting in a drop in the number of sea surface temperature and sea level pressure observations.

Figure 1: Availability of ECVs listed in Table 1 by time. Both the number of observations (red) and number of grid cells (black) with data are shown. Note log scale for the number of observations.
Figure 2 shows the total number of reports and total number of months per 1x1 degree grid cell. However, over the period 1850 - 2024 there have been major changes to the spatial sampling. Early in the record there were typically a few hundred 1x1 degree grid cells with data per month (see Figure 1). These tended to be concentrated in the Atlantic Ocean and around the southern capes to the Indian and Pacific Oceans. With the opening of the Suez and then Panama canals the major shipping routes changed, with less shipping transiting the capes and a greater proportion transiting the Mediterranean and Caribbean seas to reach the Indian and Pacific oceans.
After a strong decline during the Second World War, spatial coverage reached a peak between the 1960s and 1990s but declined sharply after 1990. The number of grid cells dropped from around 20,000 to around 10,000 - 15,000 by 2020 for the majority of parameters. The contrast between the stable number of observations and decreasing spatial coverage (contrasting red and black lines in Figure 1) is due to the change in reporting frequency over the past 30 years from 6-hourly to 3-hourly, hourly or more frequent, increasing the temporal sampling at the cost of spatial sampling. Sea level pressure and sea surface temperature experience a much less marked decline in spatial coverage due to the contribution of drifting buoys. Coverage of sea surface temperature and sea level pressure is expected to increase with the inclusion of additional drifting buoys in future releases.
Figure 2: Spatio-temporal distribution of marine data holdings 1850 - 2024. Left: number of reports per grid cell; right: number of months with at least 1 report. Grid size is 1x1 degree. Note logarithmic colour scale. Only reports passing the report level QC checks have been included.
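The two quantities mapped in Figure 2 can be sketched as follows. This is a minimal illustration using only the Python standard library, assuming each report is reduced to a (year, month, latitude, longitude) tuple; the function name, the cell indexing convention and the sample reports are hypothetical and do not describe the processing used to produce the figure.

```python
import math
from collections import Counter


def grid_cell(lat: float, lon: float) -> tuple:
    """Return the (row, column) index of the 1x1 degree cell containing a position.

    Rows count northwards from 90S, columns count eastwards from 180W.
    """
    row = min(int(math.floor(lat)) + 90, 179)   # latitude +90 falls into the top row
    col = (int(math.floor(lon)) + 180) % 360    # longitude +180 wraps to -180
    return row, col


# Synthetic reports for illustration: (year, month, latitude, longitude).
reports = [
    (1855, 1, 50.2, -10.7),
    (1855, 1, 50.9, -10.1),   # same cell, same month
    (1855, 2, 50.5, -10.5),   # same cell, new month
    (1855, 2, -34.4, 18.3),   # different cell
]

# Left panel of Figure 2: number of reports per grid cell.
reports_per_cell = Counter(grid_cell(lat, lon) for _, _, lat, lon in reports)

# Right panel of Figure 2: number of months with at least one report per cell.
months_seen = {}
for year, month, lat, lon in reports:
    months_seen.setdefault(grid_cell(lat, lon), set()).add((year, month))
months_per_cell = {cell: len(months) for cell, months in months_seen.items()}

print(reports_per_cell)
print(months_per_cell)
```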
Table 1: Essential Climate Variables included in the GLAMOD service
Variable | Description |
Air temperature | Air temperature at observation height, ranging from 4 - 5 m in the 19th century to over 30 m for the modern period. Typically measured using liquid in glass thermometers (mercury or alcohol) but with a transition to electronic sensors over the past 30 years. |
Dew point temperature | Dew point temperature at observation height, usually the same as for air temperature. Typically calculated from wet and dry bulb thermometers housed in marine screens or whirling psychrometers. Over the past 30 years there has been a transition to electronic sensors measuring relative humidity. |
Sea surface temperature | Water temperature near the sea surface measured using a variety of methods. This temperature does not represent the skin sea surface temperature. Depending on the method, the temperature is measured at depths from a few centimeters to several meters. Typically bucket measurements pre 1950 but with a transition to engine intake measurements beginning around 1900. Dominated by drifting buoy observations after ∼2003. |
Sea level pressure | Atmospheric pressure reduced to sea level and corrected for temperature where appropriate. Methods range from mercury barometers in the 19th Century, through aneroid barometers to electronic sensors. It should be noted that the sea level reduction and temperature corrections are applied at the time of observation. |
Wind speed | Wind speed at observation height. As with air temperature and dew point temperature, the observation height has varied from < 10 m in the early record to more than 30 m for recent years. Methods include visual wind speed estimation for the early record, transitioning to mechanical sensors (cup/propeller anemometers) and electronic sensors (sonic anemometers). Instrumental observations are corrected for ship speed and course (not applicable for visual estimation). |
Wind Direction | See wind speed. |
The data can be accessed and downloaded via the Copernicus Climate Change Service (C3S) Climate Data Store (direct access link: DOI:10.24381/cds.27f643d7). Users will need to have registered/logged-in, and accepted the terms of use in order to be able to submit the final data download request form.
The page has 3 tabs as illustrated in Figure 3.
Figure 3: Screenshot of the dataset entry tabs.
The Overview tab presents an overview of the marine data and provides details on the data description, main and related variables available, contact email and links to license / data policy statements.
The Documentation tab provides links to the present Marine User Guide, the Common Data Model, and a link to the Data Deposit Server webpage.
The Download tab provides several boxes where specific selections need to be ticked, as follows:
Figure 4 shows an example of selection for the top part of the download form. After selecting Air pressure at sea level, Water temperature, and Wind speed, the selection of a particular year (1850) causes one of the variables (Dew point temperature) to be greyed out. This indicates to the user that such data are not available for that year; the other variables, however, remain available.
Figure 4: Screenshot of the upper part of the download tab.
Once a selection has been made, and provided the user is logged in, the query is ready to submit after the terms of use have been reviewed. If these are accepted, 'Submit form' can be selected. An example is shown in Figure 5.
Figure 5: Lower part of the download tab, once (1) a complete selection has been made, (2) the user has logged in, and (3) the terms of use have been accepted.
Users who would like to use the CDS Application Programming Interface (API) can view and copy the current API request by clicking on "Show API request code" at the bottom of the page (Figure 5). Please refer to the documentation page for information on how to use the CDS API.
Figure 6: As above, after clicking on "Show API request code". In this example, air pressure at sea level, water temperature, and wind speed are requested for the whole year 1850, for version 2.0.0 of the dataset, in NetCDF format.
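A request of the kind described in the Figure 6 caption might be assembled as in the sketch below. The dataset identifier, every request key and every value shown are assumptions made for illustration; the authoritative request is the one displayed by "Show API request code" for the user's own selection and should be copied from there. The helper names are hypothetical. Only `cdsapi.Client().retrieve(name, request, target)` is an actual call of the cdsapi package, and it requires the package to be installed and a CDS API key to be configured.

```python
def build_request(variables, year, months, days, data_format="netcdf", version=None):
    """Assemble a request dictionary (all key names are assumptions)."""
    request = {
        "variable": list(variables),
        "year": [str(year)],
        "month": [f"{m:02d}" for m in months],
        "day": [f"{d:02d}" for d in days],
        "data_format": data_format,
    }
    if version is not None:
        request["version"] = version  # hypothetical key and value format
    return request


def download(request, target, dataset="insitu-observations-surface-marine"):
    """Submit the request; the dataset identifier is an assumption."""
    import cdsapi  # imported here so the sketch can be read without credentials

    client = cdsapi.Client()
    client.retrieve(dataset, request, target)


# Whole year 1850 for three variables; variable names are placeholders.
request = build_request(
    ["air_pressure_at_sea_level", "water_temperature", "wind_speed"],
    year=1850,
    months=range(1, 13),
    days=range(1, 32),
)
print(request)
# download(request, "marine_1850.nc")  # uncomment once credentials are configured
```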
Once the data are returned, they can be inspected with spreadsheet software (if in CSV format), or loaded into more advanced data structures using other processing tools.
One example is shown in Figure 7, which loads the data into a pandas DataFrame in Python and then inspects the first four entries.
Figure 7: Example of instructions in Python to read the data. This example shows the first four entries found for the selection. The traceability to the sources (ICOADS) is clearly labelled. Please refer to the Overview tab for a succinct description of each element, and to Section 1.5 for more details.
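The workflow shown in Figure 7 can be approximated with the following sketch. It assumes the pandas package is installed and uses a small in-memory CSV with synthetic rows and only a subset of columns, so that it runs without a download; the column names follow the text of this guide and should be checked against the header of the file actually returned by the CDS.

```python
import io

import pandas as pd

# Synthetic rows for illustration only; these are not real observations.
sample_csv = io.StringIO(
    "report_timestamp,latitude,longitude,units,observation_value,quality_flag\n"
    "1850-01-01 12:00:00,50.5,-10.5,Pa,101250.0,0\n"
    "1850-01-01 12:00:00,50.5,-10.5,K,283.15,0\n"
    "1850-01-02 12:00:00,49.5,-12.5,Pa,100980.0,0\n"
    "1850-01-02 12:00:00,49.5,-12.5,K,284.25,1\n"
    "1850-01-03 12:00:00,48.5,-14.5,Pa,101010.0,0\n"
)

# Parse the timestamp column into datetimes while reading.
df = pd.read_csv(sample_csv, parse_dates=["report_timestamp"])

# Inspect the first four entries, as in Figure 7.
first_four = df.head(4)
print(first_four)
```

In practice the in-memory buffer would be replaced by the path of the downloaded CSV file.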
Section 1.5 provides details of each element field in the downloaded data. Users are instructed to give particular attention to the following two elements, contained in the data returned. First, the data policy license informs about conditions of use (see this table for the correspondence). Second, the source identifier informs about the data product and/or provider (sometimes including one or several citations) that need to be acknowledged when using the data (see Table A in the Appendix). It is important that users follow these requirements to ensure usage is commensurate with licence conditions and that proper acknowledgement is given to the ultimate data rights holders.
Data from the Climate Data Store are returned in NetCDF (.nc) or delimited text files (.csv), with 20 columns per row and one observed value per row. The columns are listed in Table 2 together with a brief description. The date and time of the observation is given by the report_timestamp column, with the value returned as a string of the format YYYY-MM-DD hh:mm:ss. All data are returned in the UTC time zone. The location of the observation is given by the latitude and longitude columns in the WGS84 / EPSG:4326 coordinate reference system. The variable being reported / observed is given by the observed_variable column, the units by the units column and the observed value by the observation_value column. The remaining columns provide further contextual information for the observations, such as platform type and station identifiers.
Table 2: Field name, data type and description of the different elements as per CDM-OBS-CORE
Element grouping | CDM-OBS-CORE Element | Type | CDM-OBS primary Table | Description |
Identifier information | station_name | varchar | Station configuration | The station name (where station can mean a physical station, a ship, a buoy or any other observing platform) |
Identifier information | primary_id | varchar | Station configuration | The primary station identifier for the station / platform from which the observation arises. For marine stations this can be, e.g., the SOT-ID, ship's call sign, buoy WMO number or another identifier for historical observations. See section 3.2 for details. |
Identifier information | report_id | varchar | Header | Report identifier unique per report (collection of observations). For marine observations this has been set as <dataset><version><uid>, e.g. ICOADS-30-00J771 for a UID 00J771 from ICOADS release 3. |
Identifier information | observation_id | varchar | Observations | Unique observation identifier per observation. For marine observations this has been set as <dataset><version><uid><field>, e.g. ICOADS-30-0NBVN1-AT for an air temperature observation with UID 0NBVN1 from ICOADS release 3. |
Location information | longitude | numeric | Header | Location of instrument at time of observation (identical to entry in station configuration table for fixed assets). Longitude in degrees East, range -180 to 180. |
Location information | latitude | numeric | Header | Latitude in degrees North, range -90 to 90. |
Location information | height_of_station_above_sea_level | numeric | Header | Altitude in meters. 0 for stations located at mean sea level. |
Temporal information | report_timestamp | Timestamp with timezone | Header | Date timestamp including timezone. The default for presentation of data via C3S should be that all data have been converted to UTC. |
Temporal information | report_meaning_of_timestamp | int | Header | Whether the timestamp refers to (1) beginning, (2) end, or (3) middle of reporting period. Defined in corresponding CDM-Obs code table. |
Temporal information | report_duration | int | Header | The duration of the report. Defined in corresponding CDM-Obs code table. |
Observation value information | observed_variable | int | Observations | The variable being observed defined by a numeric identifier |
Observation value information | units | varchar | Observations | The units associated with the observed variable |
Observation value information | observed_value | numeric | Observations | The observed value |
Observation value information | observation_height_above_station_surface | numeric | Observations | Height of the sensor above local ground or sea level in metres. Positive values for observations above the surface. For visual observations, height of the observing platform. |
Quality information | quality_flag | int | Observations | The quality flag for the observation. 0 indicates the observation passed checks, 1 failed check, 2 not checked, etc. Defined in corresponding CDM-Obs code table. |
Source information | source_id | Varchar(pk) | Source configuration | Data source identifier – for provenance. If mixed source collection must be a data column. Please refer to Appendix Table A for the full list pertaining to the particular release. |
Source information | data_policy_licence | int | Source configuration | Data policy per observation. 0 indicates "Open", with description "Data in public domain and freely available (no cost and unrestricted)", 1 indicates "WMO essential" (with a further description), etc. Defined in corresponding CDM-Obs code table. |
Type of observation | report_type | int | Header | Type of observing platform, report or instrument for applications with mixed holdings where in CADS user subsetting may be advantageous. 0 indicates Subdaily / hourly data. Defined in corresponding CDM-Obs code table. |
Type of observation | platform_type | int | Header | Indicates the type of structure upon which the sensors are mounted, e.g. 2 for ship, 4 for moored buoy, 5 for drifting buoy, etc. Defined in corresponding CDM-Obs code table. |
Type of observation | value_significance | int | Observations | An indicator of what the value signifies (0 maximum, 1 minimum, 2 mean, etc.). Defined in corresponding CDM-Obs code table. |
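The coded columns and the timestamp described in Table 2 can be interpreted as in the sketch below. It is a minimal example using only the Python standard library; the lookup dictionaries contain only the code values quoted in Table 2 (the complete lists are in the CDM-Obs code tables), the rows are synthetic, and the function names are hypothetical. The timestamp format follows the description above; files that carry an explicit time zone suffix would need a different format string.

```python
from datetime import datetime, timezone

# Partial lookups, limited to the values quoted in Table 2.
QUALITY_FLAGS = {0: "passed", 1: "failed", 2: "not checked"}
PLATFORM_TYPES = {2: "ship", 4: "moored buoy", 5: "drifting buoy"}


def parse_timestamp(text: str) -> datetime:
    """Parse 'YYYY-MM-DD hh:mm:ss' and attach the UTC time zone."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)


def describe(row: dict) -> dict:
    """Translate the coded fields of one row into readable labels."""
    return {
        "time": parse_timestamp(row["report_timestamp"]),
        "quality": QUALITY_FLAGS.get(int(row["quality_flag"]), "see code table"),
        "platform": PLATFORM_TYPES.get(int(row["platform_type"]), "see code table"),
    }


# Synthetic rows for illustration only.
rows = [
    {"report_timestamp": "1985-06-01 06:00:00", "quality_flag": "0", "platform_type": "2"},
    {"report_timestamp": "1985-06-01 12:00:00", "quality_flag": "1", "platform_type": "5"},
]

# Keep only observations that passed the quality checks.
passed = [describe(row) for row in rows if int(row["quality_flag"]) == 0]
print(passed)
```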
Weather observations have routinely been made at sea since at least the 1650s, with descriptive observations of the prevailing weather conditions and early instrumental measurements included in the general ships' logbooks as part of the daily entries. Following the Brussels Conference of 1853, and recognition of the importance of the weather observations to international trade and safe navigation, standardised meteorological logbooks and observing instructions were developed and made available to ships' captains. In return for following the instructions and returning the completed logbooks the captains were provided with the latest sailing directions and weather charts, thereby collectively gaining benefit from the collecting and sharing of meteorological data. This international coordination, standardisation and data sharing has continued to the present day, currently overseen by the World Meteorological Organization (WMO) and Global Ocean Observing System (GOOS) (e.g. see Smith et al. [2019]). Observations from other platforms, such as moored and drifting buoys, are similarly coordinated internationally.
Through the international coordination and data sharing, national weather services independently developed global archives of marine meteorological and oceanographic observations. Many of these archives contained overlapping data. For example, the US Navy Marine Atlases developed after the Second World War contained data from both the US archives and data from, inter alia, the UK, German and Netherlands archives that had been shared internationally. This resulted in increased numbers of observations but at the expense of having to perform duplicate detection and elimination. In recognition of the importance of historic data to understanding climate variability, building on the earlier Marine Atlases, the US National Oceanic and Atmospheric Administration (NOAA) developed the Comprehensive Ocean - Atmosphere Data Set (COADS) in the early 1980s. The first version was published in 1985 (e.g. see Woodruff et al. [1987]), consisting of both monthly summary statistics for selected ECVs and the raw weather reports from ships and other vessels used as input to the monthly summaries. Each weather report contained coincident measurements of multiple ECVs (typically air temperature and humidity, wind speed and direction, sea level pressure and sea surface temperature) together with visual estimates of the sea state, wind force (when not measured), cloud cover and weather. This first version formed the largest and most comprehensive archive of marine meteorological observations publicly available at the time.
Building on the first version, COADS continued to be developed with coverage extended to present day through major updates and reprocessing and through incremental near real time updates. As part of this development, and the evolving observing system, COADS has expanded from primarily ship based observations to surface meteorological measurements from most marine platforms shared internationally. Examples include measurements from moored and drifting buoys, coastal stations and offshore platforms, all shared internationally in real time over the WMO Global Telecommunication System (GTS) and in delayed mode through Global Data Assembly Centres. The development has also included blending COADS with other national archives and delayed mode data sources, such as the UK Met Office marine data bank. In recognition of the importance of the international contributions COADS was renamed as the International Comprehensive Ocean - Atmosphere Data Set (ICOADS) in 2002 (e.g. Worley et al. [2005]). The current major release of ICOADS, release 3.0, was published in 2017 (Freeman et al. [2017]), releasing observations up to the end of 2014. Near realtime updates for the period after 2014 are available from ICOADS Release 3.0.2 (Liu et al. [2022]). The ICOADS near realtime updates contain regular releases of marine observations. The near realtime observations are based on blended marine observations in the TAC and BUFR formats from NOAA’s National Centers for Environmental Information (NCEI) GTS collections.
As noted above, due to the combination of different sources over many decades, including from different national archives, and the long tradition of data sharing in marine meteorology there are many duplicates in the raw data. The publicly released version of ICOADS has attempted to remove or combine duplicates when detected, but this has not always been optimal due to choices made over the history of ICOADS. For example, some duplicates are missed due to one archive recording the latitude / longitude in whole degrees and another archive storing the same location information but recording the grid box centres (i.e. with a 0.5° trailing digit). These decisions have been revisited as part of this service, with the available ICOADS source files (known as "total" files within the NCEI/ICOADS archives) reprocessed and the data flagged appropriately. Note that duplicates identified prior to ICOADS release 2.5 are not currently available in the total files, but could potentially be reprocessed from the original data sources, should resources become available.
This processing is described in Section 3. The original source of the data is recorded in ICOADS with the data accessions indexed by "deck" (DCK) and "source" (SID) identifiers (Woodruff et al. [1987], Freeman et al. [2017]). A detailed summary of the data accessions is given in the marine inventory (C3S2_D311_Lot1.2.3.1.2025_Updated Inventory for Marine datasets_v5) available from the CDS website.
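The whole-degree versus grid-box-centre mismatch mentioned above can be illustrated with a short sketch. It shows why an exact comparison of positions misses such duplicates and how a positional tolerance would catch them. This is an illustration of the problem only, not the duplicate detection scheme used by the service; the function name and the 0.5 degree tolerance are assumptions.

```python
def positions_may_match(lat_a, lon_a, lat_b, lon_b, tolerance=0.5):
    """Return True if two positions agree to within the tolerance (degrees).

    Longitude differences are taken the short way round the globe so that
    positions on either side of the 180 degree meridian are compared correctly.
    """
    dlat = abs(lat_a - lat_b)
    dlon = abs(lon_a - lon_b) % 360.0
    dlon = min(dlon, 360.0 - dlon)
    return dlat <= tolerance and dlon <= tolerance


# The same report as stored by two hypothetical archives.
whole_degree = (45.0, -30.0)   # position recorded in whole degrees
box_centre = (45.5, -29.5)     # same position stored as the grid box centre

exact_match = whole_degree == box_centre
tolerant_match = positions_may_match(*whole_degree, *box_centre)
print(exact_match, tolerant_match)
```

A positional tolerance alone would also pair genuinely different reports, so in any real scheme it would need to be combined with further evidence such as matching times, identifiers and observed values.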
Drifting buoy data is available from different sources and collections based on data received from the GTS or directly from the service providers receiving the transmissions. As a part of the Global Ocean Observing System (GOOS), NOAA’s Global Drifter Program (GDP) manages the collection, analysis, and distribution of the data acquired by drifters. Data and metadata collected by drifting buoys are publicly available in near real-time via the Global Data Assembly Centers (GDACs) operated by Coriolis-Ifremer (France) and MEDS (Canada), who apply an automated quality control (QC) to the data. In the longer term, scientifically quality controlled delayed mode data will be distributed on the GDACs.
Such a delayed mode data set is prepared by the Copernicus Reprocessing of Argos and Iridium Drifters (C-RAID) project. The objectives of the C-RAID project are to gather, decode and process all the historical data measured by drifting buoys starting in 1979 and to make the data publicly available (Zunino Rodriguez et al. [2025]). For the current release, the "C-RAID 202412 winter delivery" data set has been included in GLAMOD. So far, only drifter data with Argos positioning have been included in the C-RAID database. Since 2016, most drifters are using Iridium for data transmission and Global Navigation Satellite System (GNSS) for positioning, whereas most earlier platforms used Argos for both functions. The "C-RAID 202412 winter delivery" contains quality controlled data of 16 965 drifters deployed between 1979 and 2019. The following drifters are removed since they are duplicates of other drifters: 1210497, 1210607, 1190196 and 1250594. In total, 17 491 drifters are used in the current release. Current input data sources for C-RAID are mainly original Argos messages received by Collecte Localisation Satellites (CLS), but also data from the GDP. Metadata records were also provided by CLS and the GDP. Both groups of drifter data will be included in future releases of C-RAID (Rannou et al. [2024]). The addition of Iridium drifters will complement the data set and extend it beyond 2019.
Overall, the CLS database appears to contain more Argos IDs than the GDP drifter Argos messages, resulting in an additional 2,000 buoy IDs in C-RAID compared to the GDP. The reprocessing applied in the C-RAID data set includes decoding of the original Argos messages, ingestion of metadata, recalculation of the positions using Kalman filtering and an advanced quality control with a land/beached test.
As part of the development of this service the input sources and decks contributing observations to ICOADS have been prioritised and reprocessed based on the volume of observations, spatial / temporal coverage and availability of ECVs. Starting with GLAMOD release 7, drifting buoy data from the C-RAID project was added to the data set.
Release 1 focussed on the post World War II period (1951 - 2010) and included observations from drifting buoys (sea surface temperature and sea level pressure) and Voluntary Observing Ships (VOS) (air temperature, dew point temperature, wind speed and direction, sea level pressure and sea surface temperature). As part of the first release VOS observations were merged with instrumental metadata from WMO Publication Number 47 (e.g. Kent et al. [2007]) to provide information on observing heights, instrument types / observing methods and information on the vessels contributing the observations.
Release 2 sought to extend the record backwards to 1851, including all available ship observations prior to 1951.
Release 3 updated the record to the end of 2014.
Release 4 updated the record to the end of 2020 using the observations from ICOADS Release 3.0.1, including a full reprocessing of the data from the previous releases. As part of this reprocessing additional metadata has been recovered for some of the earlier records and the observations for the period.
Release 5 updated the record to the end of 2021 using the observations from ICOADS Release 3.0.2, adding near realtime data updates including decoded BUFR messages from 2015 onwards.
Release 6 updated the record to the end of 2022 using the observations from ICOADS Release 3.0.2, migrated the data processing to the Irish Centre for High-End Computing and fixed a bug that had prevented the additional metadata from being added in earlier releases. The metadata has been added to the previous releases as a post-processing step. Drifting buoy data from the ICOADS Release 3.0.2 are not included in this release.
Release 7 updated the record to the end of 2023 using observations from ICOADS Release 3.0.2 and adding C-RAID drifting buoy data. Additionally, major improvements in the mapping routines, a new duplicate detection scheme and a wind QC routine were implemented. Data was reprocessed for the NRT period from 2015 onwards, benefiting from these improvements. A full reprocessing of the historical data holdings including ingestion of the entire C-RAID record is planned for the next release.
Release 8 now includes the entire C-RAID drifting buoy data set, starting in 1979. It replaces all drifting buoy data included in the ICOADS data set. Furthermore, the QC has been completely revised, resulting in a more efficient and consistent workflow. The data set was completely reprocessed with the new QC routines as well as updated PUB47 metadata and NOC corrections.
The marine inventory (C3S2_D311_Lot1.2.3.1.2025_Updated Inventory for Marine datasets_v5, available from the CDS website) gives a detailed overview of the contents from each data source used in this release.
Figure 8 shows the monthly number of reports available in the current release. Only those passing the full quality control check are shown. In total, 403,003,449 reports from 249 sources have been processed. Of these, 313,724,423 are unique and pass the report quality control checks (see Section 3); the remainder fail either the duplicate check or the aggregate report level QC check (see Section 3). Figure 9 shows the spatial sampling by total number of reports and by number of months. The total number of reports is dominated by the recent decades and the increase in buoy sampling (see Figure 10). The contribution of the ship based reports can be seen in the number of months available, with the ship tracks clearly visible.
Several notable sources and platform types have been excluded, namely fixed stations (mobile installations and rigs, coastal stations and moored buoys), meteorological observations from research vessels, and meteorological observations from oceanographic programmes and datasets such as the World Ocean Database (WOD) and the Global Ocean Surface Underway Data (GOSUD). Users requiring these data are advised to obtain them directly from the respective datasets (e.g. WOD, GOSUD, SAMOS).

Figure 8: Monthly number of marine in situ reports. The stacked areas indicate the monthly number of ship reports (grey) and drifting buoy reports (yellow). Only reports that have passed all the quality control checks are included. Time series have been smoothed using a 12-month running mean filter. Note the logarithmic scale in the y-axis.

Figure 9: Spatio-temporal distribution of marine data holdings 1850 - 2024. Left: number of reports per grid cell; right: number of months with at least 1 report. Grid size is 1° × 1°. Note logarithmic colour scale. Only reports passing the report level QC checks have been included.

Figure 10: Spatio-temporal distribution of marine reports: latitude-time (Hovmøller) plots of number of reports. Data is binned and aggregated in 1° latitude × 1 month boxes. Only reports passing the report level QC checks have been included.
Recent metadata for voluntary observing ships is stored and maintained in the OceanOps metadata database, from where it is mirrored to the WMO OSCAR/Surface metadata database. An extract of the OceanOps database that is compatible with the legacy WMO Publication No. 47 is published through https://www.ocean-ops.org/share/SOT/PUB47/. WMO Publication No. 47 was the official International List of Selected, Supplementary and Auxiliary Ships. It contains metadata on the ships, and on the observing methods used by those ships, contributing to the WMO Voluntary Observing Ships Programme. A summary of the information available in WMO Publication No. 47 can be found in Kent et al. [2007]. This includes information on the ship dimensions and types, type(s) of instruments used, location of the sensors on board the ships and height of the sensors above the sea surface or depth below. A subset of the metadata from WMO Publication No. 47 has been included in ICOADS since Release 2.5 based on a basic match between the reports in ICOADS and the WMO Publication 47 edition coincident with the observation (Kent et al. [2007], Woodruff et al. [2011]).
As part of C3S2 311 Lot 1 the WMO Publication 47 metadata has been reprocessed, updated and contributed by NOC. Selected fields from the metadata, including ship names, types and instrument heights, have been included in this release. During the reprocessing duplicate records have been combined, reducing the number of records from around 875 000 to 255 000 over the period 1956 - 2020. Validity dates have been added, with the first record for a given ship extended by 1 year prior to its first appearance and the final record extended by 5 years. This allows for a period of time between a ship being recruited and it first appearing in WMO Publication No. 47 and a period where a ship continues reporting but is no longer registered by the original recruiting country. Limited quality control has been applied to the data, with correction of typographical errors and standardisation of entries. Similarly, heights have been corrected for known errors, such as incorrect packing of numeric data into alpha-numeric representation (e.g. i5 keyed instead of 15). In addition to WMO Publication No. 47, metadata is present in the ICOADS supplemental records for a small number of observations, including information such as ship type, barometer height, routes etc. Some information has been extracted from the ICOADS supplemental records and included in this service.
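The validity-date rule described above can be illustrated with a minimal sketch. It assumes that each metadata record for a ship is valid from its own date until the date of the next record, which is not stated in the text; only the 1 year and 5 year extensions are taken from the description. The function names are hypothetical and this is not the NOC processing code.

```python
from datetime import date


def shift_years(d, years):
    """Shift a date by whole years, mapping 29 Feb to 28 Feb when needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def validity_windows(record_dates):
    """Return one (valid_from, valid_to) pair per metadata record of a single ship.

    Assumption: a record is valid until the next record starts.
    From the text: first record extended back 1 year, final record extended 5 years.
    """
    dates = sorted(record_dates)
    windows = []
    for i, start in enumerate(dates):
        is_last = i + 1 == len(dates)
        end = shift_years(start, 5) if is_last else dates[i + 1]
        if i == 0:
            start = shift_years(start, -1)
        windows.append((start, end))
    return windows


windows = validity_windows([date(1975, 1, 1), date(1980, 1, 1), date(1990, 1, 1)])
```

An observation would then be matched to the record whose window contains the observation date.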
Figure 11 shows the fraction of ship observations passing quality control for which the sea level pressure observing height is known (red line). The blue shaded area indicates the 25th - 75th percentiles of the observing heights and the blue dashed line the median height. The earliest heights available are from the late 1870s for a period of ~15 years, with a significant proportion of the observations made during this period at a known height of around 5 m. There is then a large gap in the metadata until the 1960s, when WMO Publication No. 47 becomes available and call signs begin to be used in the observational record, allowing the metadata to be associated with the observations. However, the values before the 1970s should be used with caution: only a small fraction of the observations have been associated with the metadata, there is a large inter-quartile range and the median value is higher than expected. This is likely due to a number of uncorrected heights in feet above sea level rather than metres above sea level, but further work is required to fully understand the metadata. Between 1970 and 2020 the median observation height increases from ~15 m to ~25 m.
The decrease in the proportion of observations with a known height from ~2014 is due to a decrease in the availability of call signs in the observational data due to call sign masking and use of generic call signs (such as SHIP) in response to security and commercial concerns of the participating Voluntary Observing Ships. Metadata for the air temperature, humidity and wind speed observations does not become available until the 1960s although the observing heights are expected to be similar to that for pressure. From the 1970s onwards the change in heights (not shown) shows a similar trend to sea level pressure, with the temperature and humidity heights increasing from 15 m to ~27 m. The increase in wind speed height is greater (not shown), increasing from ~15 m in the early 1970s to almost 35 m by 2020.

Figure 11: Median sea level pressure observing height (blue) and % of observations where the height is known (red). Also shown is the inter-quartile range of the observing height (blue shading).
Several sources exist for drifting buoy metadata. Recent metadata is accessible through OceanOps, OSCAR/Surface and the Global Drifter Program (GDP). For historic and reprocessed data sets additional data sources may be needed. The C-RAID drifting buoy data set contains metadata that has been merged from data files provided by NOAA AOML GDP and CLS.
Figure 12 shows climatological values of the mean air temperature and dew point temperature over the period 1850 - 2024. Figure 13 shows the number of months with at least 1 observation on a 1x1 degree lat/lon grid. The clustering over the major shipping routes is clearly visible. The air temperature and humidity measurements made on board ships have traditionally been made using liquid (mercury or alcohol) in glass thermometers (wet and dry bulb) sheltered from direct solar radiation, rain and sea spray and housed either in a fixed shelter or in a hand-held instrument. However, because the thermometers need to be accessible for reading, their location is often not ideal, with the shelters sometimes located in poorly exposed locations and with inadequate ventilation. This can lead to biases in humidity measurements due to inadequate ventilation of the wet bulb thermometers (e.g. Berry and Kent [2011]; Willett et al. [2008]). Similarly, daytime air temperature measurements can contain biases due to the warming of the ship superstructure by solar radiation, in turn warming the air and giving biased air temperature measurements (Rayner et al. [2003]). Bias adjustments have been developed to account for these effects (e.g. Berry et al. [2004]) but have not been applied as part of this service. With the exception of the Second World War (e.g. see Cornes et al. [2020]; Kent et al. [2013]), pervasive biases have not been identified in the night time air temperature measurements or artificially ventilated wet bulb thermometer measurements during the period 1900 - 2020, although there are likely to be further artifacts as the records are analysed (Kent and Kennedy [2021]). However, biases have been identified in the 19th Century and earlier data due to the use of thermometers located in the captain's cabin (e.g. Chenoweth [2000]).

Figure 12: Mean air temperature (left) and dew point temperature (right) over the period 1850 - 2024. All observations passing quality control have been averaged to give monthly mean values. These have then been averaged to give the long term mean.
Another factor that could lead to biases is systematic change in the height at which the observations are made. The average observation height on merchant ships has changed substantially over the period, ranging from 4 - 5 m in the late 1800s, increasing to 30 m in the late 2000s (Figure 6, Kent et al. [2007], Kent et al. [2013]). Adjustments to a fixed reference height are required to avoid inhomogeneities in the record that lead to underestimation of trends in the data. Adjustments are typically made based on Monin-Obukhov similarity theory and the approximation of a constant flux layer near the surface (e.g. Businger et al. [1971]). Observing heights are included with the observations where known but adjustments have not been made as part of this service.

Figure 13: Number of months with at least one observation of air temperature (left) or humidity (right) over the period 1850 - 2024 on a 1x1 degree lat/lon grid. Only observations passing quality control have been included.
Figure 14 shows climatological values of the mean sea surface temperature over the period 1850 - 2024 (left) and the number of months with at least 1 observation on a 1x1 degree lat/lon grid (right). The clustering over the major shipping routes is clearly visible but with better sampling than for air temperature and humidity away from the shipping lanes due to the sampling by drifting buoys. Sea surface temperature observations have been historically made using a wide variety of methods, ranging from wooden, canvas and rubber buckets to infrared radiometers. The most common methods are the bucket based and engine intake methods, both of which suffer from biases (Kent et al. [2017]). Due to evaporative and conductive cooling the water samples in the buckets can be biased low compared to the true sea surface temperature. Conversely, due to heat from the engine room, engine intake measurements can be biased warm. Bias adjustments have been developed and implemented when constructing gridded datasets from the observations (Kennedy et al. [2011]). The quality of SST observations has been widely discussed (e.g. Kennedy [2014], Kent et al. [2017], Kent and Kennedy [2021], Sippel et al. [2024]). As with the other parameters, no adjustments have been made to the observations to account for known biases as part of this service.

Figure 14: Mean sea surface temperature (left) and number of months with at least one observation (right) over the period 1850 - 2024. All observations passing quality control have been averaged to give monthly mean values. These have then been averaged to give the long term mean.
Figure 15 shows climatological values of the mean wind speed and direction over the period 1850 - 2024. Figure 16 shows the number of months with at least 1 observation on a 1x1 degree lat/lon grid. As with the other parameters, the clustering over the major shipping routes is clearly visible. Wind speed observations made on board ships were historically made by estimating the wind force and recording the estimate using the Beaufort scale. This provides estimates of the upper, lower and midpoint wind speeds for each value on the scale at a reference height of 10 m. When Beaufort scale wind estimates have been converted to a speed the mid-point has typically been used. More recently, visually estimated winds have been reported directly as a speed (e.g. knots or m/s) using the Beaufort scale as a guide.
Through comparison with instrumental measurements and co-located observations the original Beaufort scale has been shown to be biased and corrections proposed (Kent and Taylor [1997]). Over the past several decades there has been an increasing move to using anemometers to observe and report the wind speed over the oceans. These measurements can contain biases due to the impact of flow distortion on the wind speed (Moat et al. [2005]). As with the air temperature and humidity, the typical height of wind speed measurement has changed with time (e.g. Thomas et al. [2008]) and the measured values require adjustment to a common reference height. Again, no adjustments have been made as part of this service.
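As an illustration of what an adjustment to a common reference height involves, the sketch below uses the logarithmic wind profile, i.e. the neutral-stability limit of Monin-Obukhov similarity theory. This is not part of the service, which applies no adjustments. The constant roughness length `z0_m` is an assumed illustrative value (over the ocean it varies with wind speed), stability effects are ignored, and the function name is hypothetical.

```python
import math


def adjust_wind_to_reference(speed, height_m, ref_height_m=10.0, z0_m=1e-4):
    """Scale a wind speed from height_m to ref_height_m with a neutral log profile.

    Assumptions: neutral stratification and a constant roughness length z0_m.
    """
    if height_m <= z0_m or ref_height_m <= z0_m:
        raise ValueError("heights must exceed the roughness length")
    return speed * math.log(ref_height_m / z0_m) / math.log(height_m / z0_m)


# A 10 m/s wind measured at 35 m corresponds to roughly 9 m/s at 10 m
# under these assumptions.
u10 = adjust_wind_to_reference(10.0, 35.0)
```

Because the anemometer heights in the record increase with time, leaving the values unadjusted introduces a spurious upward tendency of this order into the wind speeds.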

Figure 15: Mean wind speed (left) and direction (right) over the period 1850 - 2024. All observations passing quality control have been averaged to give monthly mean values. These have then been averaged to give the long term mean.

Figure 16: Number of months with at least one observation of wind speed (left) or wind direction (right) over the period 1850 - 2024 on a 1x1 degree lat/lon grid. Only observations passing quality control have been included.
Figure 17 shows climatological values of the mean sea level pressure over the period 1850 - 2024 (left) and the number of months with at least 1 observation on a 1x1 degree lat/lon grid (right). The clustering over the major shipping routes is clearly visible and, as with the sea surface temperature, better sampling away from the shipping lanes due to the sampling by drifting buoys can be seen.
Early pressure observations were typically made using mercury barometers, with a transition to marine aneroid barometers in the mid 20th century and to electronic instruments more recently. The different sensors each have corrections that need to be applied but these are typically applied either at the time of reporting or automatically prior to the values being read in the case of the electronic sensors. The mercury and aneroid barometers require correction for temperature and height above sea level, with an additional adjustment for gravity required for the mercury barometers. These corrections are believed to have been applied in the ICOADS data. When looking at long term means / climatological values the observations also need to be corrected for diurnal and semi-diurnal oscillations (e.g. Ansell et al. [2006]). These additional adjustments have not been applied as part of this service.

Figure 17: Mean sea level pressure (left) and number of months with at least one observation (right) over the period 1850 - 2024. All observations passing quality control have been averaged to give monthly mean values. These have then been averaged to give the long term mean.
All data processed and published in this service are handled in the common data model (CDM) used by the GLAMOD service for both land and marine data. Marine input data is converted to the CDM using the Common Data Model reader and mapper (CDM reader/mapper) software package (Lierhammer et al. [2025c]). The package is capable of reading in text and NetCDF data records, parsing the contents according to description tables and converting them to the CDM as defined in specific rule sets. During the conversion process, a series of consistency and format checks are applied to the data, allowing corrupt records and systematic problems with the raw data to be identified and excluded from further processing.
Each element has a defined data type and valid range. Similarly, valid enumerated (or coded) values are specified via a code table. Any record with any element falling outside the specified valid range, with the exception of missing data, or with any invalid coded value is considered invalid. Similarly, reports with an invalid date, time or location are excluded from further processing.
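The validity rule above can be sketched as follows. The element names, ranges and code table are hypothetical placeholders, not the definitions used by the CDM reader/mapper, and treating a missing coded value as acceptable is an assumption.

```python
import math

# Hypothetical valid ranges and code tables, for illustration only.
VALID_RANGES = {"air_temperature": (-90.0, 60.0), "wind_direction": (0.0, 360.0)}
CODE_TABLES = {"platform_type": {1, 2, 3}}


def is_missing(value):
    """Treat None and NaN as missing data."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def is_valid_record(record):
    """Return False if any element is out of range or has an invalid code."""
    for name, (low, high) in VALID_RANGES.items():
        value = record.get(name)
        if is_missing(value):
            continue  # missing data does not invalidate the record
        if not low <= value <= high:
            return False
    for name, allowed in CODE_TABLES.items():
        value = record.get(name)
        if not is_missing(value) and value not in allowed:
            return False
    return True
```

A single out-of-range element is enough to reject the whole record, which matches the wording "any record with any element".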
The source data for ICOADS is stored in the fixed width International Maritime Meteorological Archive (IMMA) format. This format consists of one weather report per record with a core data section and optional attachments containing additional information. Table 3 summarises the different attachments; those in italic are typically reported with each report regardless of source. The presence of the other attachments depends on both the type of station/platform making the weather report and the source of the data. For example, the IMMT attachment will only be present for ship observations reporting in delayed mode.
During the initial conversion of the input data sources, the station/platform identifier is checked against a list of those stations known to produce low quality data (see Table 4). Observations with a station/platform identifier matching one on the list or from a source or location known to be of low quality have their quality flags set to 6. Those observations are excluded from the advanced quality control (see Chapter 3.4).
After the initial conversion of the input data sources, the subsequent processing is performed in the CDM framework. The individual processing steps are handled with the GLAMOD marine processing package (Lierhammer et al. [2025a]). The quality of the data is checked, using the marine quality control (marine_qc) software package (Lierhammer et al. [2025b]).
Table 3: Sections of an IMMA record (see https://gdex.ucar.edu/datasets/d548000/documentation for a full description)
Attachment name | Abbreviation | Description |
Core | Core | ICOADS core record containing commonly observed parameters, date and time |
ICOADS attachment | ICOADS | Attachment containing ship/ platform identification, additional observed parameters and ICOADS QC flags |
IMMT-5 / FM 13 attachment | IMMT | Attachment containing additional information reported by Voluntary Observing Ships in either the International Marine Meteorological Tape (IMMT) or WMO FM-13 formats. |
Model QC attachment | Mod-QC | Attachment containing NWP model data collocated to the location of the observations. |
Ship metadata attachment | Meta-VOS | Attachment containing selected metadata from WMO-No. 47 merged with ship observations in ICOADS. |
Near-surface oceanographic data attachment | NOcn | Attachment containing surface ocean biogeochemical measurements |
Edited cloud report | ECR | Attachment containing corrected / edited cloud observations following Hahn and Warren (1999). This is only available from 1950 onwards. |
Reanalysis QC / feedback attachment | Rean-QC | Attachment providing the facility to include observation feedback information from reanalysis models |
IVAD attachment | IVAD | Attachment to store data from the ICOADS Value Added Database Project |
Error attachment | Error | Attachment designed to support the correction of erroneous IMMA elements. |
Unique report ID attachment | UIDA | Attachment with unique ID assigned to each report |
Supplemental data attachment | Suppl. | Attachment to store additional data not representable in other attachments and to store the weather reports in the original format / units where available. The format of the Suppl. attachment is source and deck dependent. |
Table 4: Conditions for exclusion
Condition | Reason / details |
Report at 0°N, 0°E | Often when the location is missing the latitude and longitude are set to 0°. |
ICOADS platform type 13 | Data from C-MAN coastal stations. Data not representative of the open ocean. |
Data from the ICOADS SEAS deck (deck 874) | There was an unrecoverable error in the encoding of data from this source into IMMA. |
Mis-calibrated buoys | Data from buoys 52521, 53522, 53566-68, 53571, 53578, 53580, 53582, 53591-96, 53599, 53600-09, 53901, 53902 are known to have calibration errors during the period November 2005 to January 2006. |
Mis-positioned MORMET data (deck 732) | Observations from the Russian Marine Meteorological Dataset (MORMET) are known to be mis-positioned in certain regions and time periods. These are listed in Table 5. |
Table 5: Exclusion regions and periods for the ICOADS MORMET deck (deck 732)
Region | W | S | E | N | Years (inclusive) |
1 | -175 | 40 | -170 | 55 | 1958 - 1971 |
2 | -165 | 40 | -160 | 60 | 1958 - 1971 |
3 | -145 | 40 | -140 | 50 | 1958 - 1964, 1968 - 1971 |
4 | -140 | 30 | -135 | 40 | 1958 - 1959, 1969 - 1974 |
5 | -140 | 50 | -130 | 55 | 1958 - 1964, 1967 - 1971 |
6 | -70 | 35 | -60 | 40 | 1958 - 1961, 1963 - 1971 |
7 | -50 | 45 | -40 | 50 | 1969, 1971 - 1974 |
8 | 5 | 70 | 10 | 80 | 1969 - 1974 |
9 | 0 | -10 | 10 | 0 | 1960, 1966 - 1972 |
10 | -30 | -25 | -25 | -20 | 1965, 196 |
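The exclusion conditions in Tables 4 and 5 could be checked along the following lines. This is a sketch only: the report field names are hypothetical, only regions 1 and 7 of Table 5 are reproduced, the mis-calibrated buoy list is omitted for brevity, and treating the region bounds as inclusive is an assumption.

```python
SEAS_DECK = 874
MORMET_DECK = 732
CMAN_PLATFORM_TYPE = 13
EXCLUDED_FLAG = 6  # quality flag value assigned to excluded reports

# (west, south, east, north, years): regions 1 and 7 of Table 5 only.
MORMET_REGIONS = [
    (-175, 40, -170, 55, set(range(1958, 1972))),
    (-50, 45, -40, 50, {1969} | set(range(1971, 1975))),
]


def is_excluded(report):
    """Return True if a report meets one of the sketched exclusion conditions."""
    lat, lon = report["latitude"], report["longitude"]
    if lat == 0 and lon == 0:
        return True  # position probably missing
    if report["platform_type"] == CMAN_PLATFORM_TYPE:
        return True
    if report["deck"] == SEAS_DECK:
        return True
    if report["deck"] == MORMET_DECK:
        for west, south, east, north, years in MORMET_REGIONS:
            inside = west <= lon <= east and south <= lat <= north
            if inside and report["year"] in years:
                return True
    return False


def flag_report(report):
    """Set the quality flag to 6 for excluded reports, leave others unchanged."""
    if is_excluded(report):
        report["report_quality"] = EXCLUDED_FLAG
    return report
```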
C-RAID data is stored in NetCDF files that are organized as separate files for each buoy track (Rannou et al. [2023]). Each of these track files is decoded and the contents are mapped into the CDM format. A unique identifier is assigned to each observation consisting of the C-RAID track number and the sequential number of the observation in the data file. This ensures the traceability of the observations back to the original source file. In an intermediate preprocessing step after the initial conversion, the individual track files are reorganized and aggregated to monthly data files. These files contain all available buoy observations of the specific month. Furthermore, observations that are marked as invalid (time stamp, position) or do not contain any observations in the C-RAID source data set are dropped from further processing in order to reduce the processing overhead.
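The identifier assignment and monthly aggregation can be sketched as below. The identifier layout, the dictionary keys and the function names are hypothetical; the text only states that the identifier combines the track number with the sequential number of the observation in the file.

```python
from collections import defaultdict
from datetime import datetime


def observation_id(track_number, sequence_number):
    """Build an identifier from track number and position in the file (assumed layout)."""
    return f"{track_number}-{sequence_number:06d}"


def group_by_month(tracks):
    """Regroup per-track observations into per-month lists.

    tracks maps a track number to a list of dicts with the keys
    'time' (datetime or None), 'position' ((lat, lon) or None) and
    'values' (dict of observed values, possibly empty).
    """
    monthly = defaultdict(list)
    for track_number, observations in tracks.items():
        # Number observations before filtering so the ID still points at the
        # original position in the source file.
        for seq, obs in enumerate(observations, start=1):
            if obs["time"] is None or obs["position"] is None:
                continue  # invalid time stamp or position
            if not obs["values"]:
                continue  # nothing observed
            key = (obs["time"].year, obs["time"].month)
            monthly[key].append({**obs, "observation_id": observation_id(track_number, seq)})
    return dict(monthly)


tracks = {
    1234: [
        {"time": datetime(2020, 1, 31, 23), "position": (10.0, -30.0), "values": {"sst": 290.1}},
        {"time": None, "position": (10.0, -30.0), "values": {"sst": 290.2}},
        {"time": datetime(2020, 2, 1, 0), "position": (10.1, -30.0), "values": {"sst": 290.3}},
    ]
}
monthly = group_by_month(tracks)
```

Numbering before filtering keeps the identifiers stable even when invalid observations are dropped.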
The station/platform identification within ICOADS is stored in the ID field and over the lifetime of the marine meteorological observations and archives a wide variety of formats have been used to identify individual ships. For example, in the early punchcard decks from the 1960s and earlier, ships were assigned a numeric identifier allocated in alphabetical order each year with the ship names written on the back of the punchcard. Lists matching the ship numbers to ship names were created and archived. However, over time and with the conversion from punchcards to magnetic tape and then to digital formats the link between ship names and numbers has been lost, with only the ship number remaining. For some decks the ship number has subsequently been dropped where it was thought to have no value without the link to the ship name. In other cases the logbook number was used to identify the digitised data, sometimes with a page number appended. More recently, ship names or abbreviated ship names have been recorded directly in the digitised data when logbooks have been keyed. For real-time data sources we typically have the International Telecommunication Union (ITU) callsign or a masked/generic identifier. With the introduction of WIGOS (WMO Integrated Global Observing System) Station Identifiers (WSI), new identifiers for marine platforms are issued through OceanOps. For ships, the last segment of the WSI consists of the SOT-ID which is compatible with the traditional callsign fields.
Overall, there are many sources of error and ambiguity in the station/platform identification field. These issues are discussed further in Carella et al. [2017] and we follow the method of Carella et al. [2017] to link observations from the same ship together. As part of this process the ID field has been corrected for known errors and new IDs have been assigned where the ID field is missing, contains a generic ID or is set to a logbook and page number.
Due to the different data management practices and data streams for drifting buoy data compared to ship data, many of the problems encountered, particularly with historical ship data sets, are avoided. The prevalence of mis-coding errors in the date, time, location and buoy ID fields is thought to be negligible. In the C-RAID data set the WMO station number is missing in some cases and marked in GLAMOD as "UNKNOWN". It appears that some of the missing IDs may be recovered by further comparison with GTS data sources.
Observations in GLAMOD originate from different sources and data streams. Due to multiple copies of historic data or parallel ingestions in real-time data streams, duplicates of observations occur. Duplicate detection in the near real-time and historic data streams of ships and buoys is handled with different routines that are described below. Table 6 lists the duplicate flags (element "duplicate_status") used in the CDM. Additionally, an array of report IDs of the duplicates is stored in the "duplicates" element.
Table 6: Duplicate flags and meanings used within the common data model
Flag value | Meaning | Comments |
0 | Unique observation, no known duplicates | |
1 | Best duplicate | |
2 | Duplicate | This has been used to identify reports thought to be duplicates but where no selection was made, e.g. because there were unique variables present in each report. |
3 | Worst duplicate | |
4 | Unchecked | Reports have not been through the duplicate identification process. |
Duplicates exist in ICOADS due to the way that the dataset has been created over many decades with data from many different sources, often with overlapping observations, merged. For example, many national climate archives such as those in the US, UK, Germany and the Netherlands were based on the same source data with the ship logbooks digitised once and then shared between nations and in some cases may have even been redigitised. This process has been repeated several times resulting in many duplicates in the source data. For recent decades multiple real-time GTS sources have been ingested as this has been shown to increase the number of unique reports but at the expense of increased duplicates. Similarly, when the higher quality delayed mode versions of the real-time data are ingested the real-time versions need to be identified, flagged and replaced. As part of the processing to create ICOADS the duplicate detection and elimination software (DUPELIM) is run, with only those reports believed to be unique released into the final versions (e.g. see Freeman et al. [2017], Woodruff et al. [2011]). The processing for C3S2 uses the “total” ICOADS files, which include observations identified as duplicates that have been excluded from the “final” data.
The ICOADS DUPELIM software relies heavily on the date, time, location and station/platform identification. All of these fields may contain errors due to a number of reasons. For example, an observation may be mis-assigned to the wrong day or an observation put in the wrong location due to the miscoding of the quadrant of the globe within the data records when the data were originally keyed. In other cases, the data have been modified or stored differently in the different archives. For example, the location of an observation was often recorded as the one degree latitude/longitude grid box the observation was located in. On conversion to a latitude/longitude pair the location could be given either as the edge of the grid box or the grid box centre, resulting in different locations for the same observation. Conversion between units and the enumeration/translation between code tables can also introduce minor differences. As such, due to those differences not all duplicates have been identified by the DUPELIM software and unidentified duplicates still exist in the ICOADS final files.
Within the C3S2 service an additional level of duplicate detection has been performed making allowances for these known issues. In many cases the results are similar to that of DUPELIM but with an increase of up to 5 - 10 % in the number of duplicates identified in some periods. The method and results are fully described in Kent et al. [2019]. Table 6 lists the categorisations used.
For near-real-time data from 2015 on (ship data from ICOADS Release 3.0.2 and C-RAID drifting buoys), a new duplicate detection scheme has been implemented for GLAMOD Release 7. This procedure is more lightweight than the procedure for historic data, as we assume less divergence due to coding or copy errors in position, time and platform ID information in recent real-time data streams.
The duplicate detection utilizes the Python Record Linkage Toolkit (de Bruin et al. [2023]), which provides efficient indexing methods as well as functions to compare records and classifiers. With the current implementation we compare observation date, time, location and primary_station_id, allowing a window of 1 minute in time and 0.1 degree for lat/lon with a Gaussian decay. Additionally, station course and speed are compared, if available. A penalty is assigned with increasing deviation between two observations and the comparison results are normalized and scored. Pairs scoring more than 99.1 percent similarity are treated as duplicates. A two-step approach was implemented: the first step tries to eliminate duplicates with generic or masked primary station IDs in favour of observations with a valid station ID, and the second step searches for all other duplicates. C-RAID buoy data is handled slightly differently, by ignoring the primary station ID in order to identify duplicates with presumably incorrect station IDs. Additionally, the comparison windows for drifting buoys are configured with narrower limits for time and latitude/longitude to account for the much higher observation frequency and slower movement of the buoys.
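The scoring idea can be illustrated with a self-contained sketch that does not use the Python Record Linkage Toolkit and is not the GLAMOD implementation. The decay function (score 0.5 at a deviation equal to the window), the equal weighting of the comparisons and the dictionary keys are assumptions; only the 1 minute and 0.1 degree windows and the 99.1 percent threshold come from the text. Course and speed comparisons and longitude wrap-around are left out.

```python
import math
from datetime import datetime

THRESHOLD = 0.991  # similarity above which a pair is treated as a duplicate


def gauss(deviation, scale):
    """Gaussian decay: 1.0 for no deviation, 0.5 at deviation == scale (assumed form)."""
    return math.exp(-math.log(2) * (deviation / scale) ** 2)


def similarity(a, b, use_station_id=True):
    """Mean of the per-field comparison scores for two reports."""
    minutes = abs((a["time"] - b["time"]).total_seconds()) / 60.0
    scores = [
        gauss(minutes, 1.0),                    # 1 minute window
        gauss(abs(a["lat"] - b["lat"]), 0.1),   # 0.1 degree window
        gauss(abs(a["lon"] - b["lon"]), 0.1),
    ]
    if use_station_id:  # switched off for the buoy-style comparison
        same_id = a["primary_station_id"] == b["primary_station_id"]
        scores.append(1.0 if same_id else 0.0)
    return sum(scores) / len(scores)


def is_duplicate(a, b, use_station_id=True):
    return similarity(a, b, use_station_id) > THRESHOLD


first = {"time": datetime(2020, 5, 1, 12, 0), "lat": 50.00, "lon": -10.00,
         "primary_station_id": "ABCD1"}
near_copy = {**first, "lat": 50.01}
other_ship = {**first, "primary_station_id": "WXYZ9"}
```

With these settings `near_copy` is classed as a duplicate of `first`, while `other_ship` is not unless the station ID comparison is switched off.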
The report quality of an observation that is neither the best duplicate nor a unique report is set to 1.
Once the observations have been through the initial format checks, station/platform identification and duplicate flagging a more advanced quality control is performed. The advanced quality control is mainly based on a scheme developed by the UK Met Office, described in Kennedy et al. [2019] and repeated here for completeness. The advanced quality control is not performed for reports that are initially set on an exclusion list (see 3.1) or that are already flagged as failed. For a ship based report to proceed to the advanced quality control stage it needs to have been flagged as either the best duplicate or a unique report. All other observations have their report quality flags set to failed and their observation quality flags set to unchecked. No additional quality control will be performed for these reports. For C-RAID data, no additional quality control is performed as the quality flags already set will continue to be used.
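The routing rules in this paragraph can be summarised in a short sketch. The duplicate flag values follow Table 6 and the values 1 (failed) and 6 (excluded) follow the text; the numeric code used for "unchecked", the field names and the `source` label are assumptions made for illustration.

```python
UNIQUE, BEST_DUPLICATE = 0, 1  # duplicate_status values from Table 6
REPORT_FAILED = 1              # report quality: failed
EXCLUDED = 6                   # set for reports on the exclusion list
OBS_UNCHECKED = 2              # assumed code for "unchecked"


def route_report(report):
    """Return 'advanced_qc' if the report proceeds, otherwise 'skip'."""
    if report.get("report_quality") in (EXCLUDED, REPORT_FAILED):
        return "skip"  # already excluded or failed
    if report.get("source") == "C-RAID":
        return "skip"  # existing C-RAID quality flags are kept
    if report["duplicate_status"] in (UNIQUE, BEST_DUPLICATE):
        return "advanced_qc"
    # Neither unique nor best duplicate: fail the report, leave observations unchecked.
    report["report_quality"] = REPORT_FAILED
    report["observation_quality"] = OBS_UNCHECKED
    return "skip"
```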
For more information about the marine quality control package see marine_qc documentation.
The first stage of the advanced quality control is performed on the reports and consists of two separate quality control checks:
- report level quality control: checks performed for each individual report. The checks are performed on 1 month of data at a time.
- report along track/voyage quality control: checks performed for each station/platform with a non-generic identifier. The checks are performed on 1 month of data at a time but include the data for the preceding and following months to ensure that reports at the start and end of the month are treated consistently.

The report level quality control checks are similar to those performed as part of the verification of input data (section 3.1). Three different flags are updated based on the validity of the date, time and location of a report. The flags are:
- location_quality: Good locations are flagged as 0, bad locations are flagged as 2 (see location_quality_flags).
- report_time_quality: Good timestamps are flagged as 2, timestamps with missing date information are flagged as 4 and invalid timestamps are flagged as 5 (see time_quality_flags).
- report_quality: Good data is flagged as 0, bad data is flagged as 1 (see quality_flags). Data is flagged as bad if the location is bad (location_quality is 2) and/or the timestamp is bad (report_time_quality is 4 or 5).

Data whose report_quality is flagged as bad is excluded from further advanced quality checks.
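The way report_quality follows from the other two flags can be written as a short sketch. The flag values are those given above; the function name is hypothetical and is not taken from marine_qc.

```python
GOOD = 0  # report_quality value for good data
BAD = 1   # report_quality value for bad data


def derive_report_quality(location_quality, report_time_quality):
    """Combine the location and time flags into report_quality."""
    bad_location = location_quality == 2       # 2 marks a bad location
    bad_time = report_time_quality in (4, 5)   # 4: missing date, 5: invalid
    return BAD if (bad_location or bad_time) else GOOD
```

For example, a report with a good location (0) and a good timestamp (2) yields 0, while a bad location or a timestamp flagged 4 or 5 yields 1.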
The report along track/voyage quality control checks test whether location_quality remains valid along the tracks of single stations/platforms. Only stations/platforms with a non-generic identifier are subject to the along track quality control; full details are provided in Kennedy et al. [2019]. Two separate track checks are performed.
Data with a bad location is flagged as bad and excluded from further advanced quality control. Any quality_flags of observations belonging to badly flagged reports will therefore remain flagged as unchecked.
After the advanced quality control checks on reports, the next stage of the quality control checks the individual observations and consists of three separate quality control checks:
- observation level quality control: checks performed for each individual observation. The checks are performed on 1 month of data at a time.
- observation along track/voyage quality control: additional checks performed for each station/platform with a non-generic identifier. The checks are performed on 1 month of data at a time but include the data for the preceding and following months.
- observation buddy quality control: checks performed on a group of observations, potentially comprising reports from many platforms and platform types. The observations can cover large areas and multiple months. The tests currently include so-called “buddy” checks in which the values for each report are compared to those of their neighbours. The checks are performed on 1 month of data at a time but include both the data for the preceding and following months and additional buoy data for the preceding, current and following months.

As mentioned above, quality_flags of observations belonging to reports already flagged as bad will remain flagged as unchecked.
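Several of these checks process one month at a time while also reading the preceding and following months. A minimal sketch of selecting that three-month window, including the roll-over at year boundaries, is given below. The function name is hypothetical and the sketch makes no assumption about how the service stores or loads its monthly files.

```python
def month_window(year, month):
    """Return (year, month) for the preceding, current and following month."""
    index = year * 12 + (month - 1)  # months counted from year 0
    window = []
    for i in (index - 1, index, index + 1):
        y, m0 = divmod(i, 12)        # m0 is zero-based
        window.append((y, m0 + 1))
    return window
```

For December 1999 the window spans November 1999 to January 2000, so a voyage crossing the year boundary is checked with its neighbouring reports on both sides.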
The following checks are performed for each individual observation.
Observations with missing data are flagged as 3 (Missing) and will be excluded from further advanced quality control checks.
Different checks are performed for each observation table. If one check fails, no further checks are performed for that observation table.
- observations-at:
- observations-dpt:
- observations-slp:
- observations-sst:
- observations-wd:
- observations-ws:

Afterwards, checks are performed that combine two observation tables.
Observations that fail this check are excluded from further advanced quality control checks.
Only stations/platforms with a non-generic identifier are subject to the along track quality control; full details are provided in Kennedy et al. [2019]. For sea surface temperature only, a spike check is applied as described by Xu and Ignatov [2014], but using only the 5 observations immediately preceding and following the observation being checked (do_spike_check).
Observations that fail this check are excluded from further advanced quality control checks.
The final check applied to the ship observations is the buddy check. In addition to the ship data, good C-RAID drifting buoy data is added to increase the number of buddies. The marine_qc package provides two kinds of buddy checks, the MDS buddy check (Tables 7 and 8) and the Bayesian buddy check (Table 9).
Table 7: Limits for MDS buddy checks (AT, DPT and SST). Note that 111 km corresponds to 1 degree of longitude at the equator.
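| Search area | Number of neighbouring grid boxes | Range |
|---|---|---|
| ± 2 pentads (10 days) and ± ∼111 km | n > 100 | µ ± 2.5 σ |
| ± 2 pentads (10 days) and ± ∼111 km | 15 < n < 100 | µ ± 3.0 σ |
| ± 2 pentads (10 days) and ± ∼111 km | 5 < n < 15 | µ ± 3.5 σ |
| ± 2 pentads (10 days) and ± ∼111 km | 0 <= n <= 5 | µ ± 4.0 σ |
| ± 2 pentads (10 days) and ± ∼222 km | n > 0 | µ ± 4.0 σ |
| ± 4 pentads (20 days) and ± ∼111 km | n > 100 | µ ± 2.5 σ |
| ± 4 pentads (20 days) and ± ∼111 km | 15 < n < 100 | µ ± 3.0 σ |
| ± 4 pentads (20 days) and ± ∼111 km | 5 < n < 15 | µ ± 3.5 σ |
| ± 4 pentads (20 days) and ± ∼111 km | 0 <= n <= 5 | µ ± 4.0 σ |
| ± 4 pentads (20 days) and ± ∼111 km | n = 0 | NA (± Inf) |
| ± 4 pentads (20 days) and ± ∼222 km | n > 0 | µ ± 4.0 σ |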
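How the accepted range in Table 7 narrows as the number of neighbours grows can be sketched as follows, for a single search area only. The sketch assumes that the tiers are bounded at 5, 15 and 100 neighbours and that a count exactly on a boundary falls into the wider (more tolerant) tier; the table does not state how boundary values are treated. The function names are hypothetical and are not the marine_qc API.

```python
def sigma_multiplier(n):
    """Return the multiplier k in µ ± k σ for n neighbouring grid boxes."""
    if n > 100:
        return 2.5
    if n > 15:   # assumed: n == 100 falls into this tier
        return 3.0
    if n > 5:    # assumed: n == 15 falls into this tier
        return 3.5
    return 4.0   # 0 <= n <= 5


def passes_buddy_check(value, buddy_mean, buddy_sigma, n):
    """Check whether value lies within µ ± k σ of its neighbours."""
    k = sigma_multiplier(n)
    return abs(value - buddy_mean) <= k * buddy_sigma
```

With few neighbours the mean and spread are poorly constrained, so a wider range is tolerated; with many neighbours the same departure from the mean can fail the check.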
Table 8: Limits for MDS buddy checks (SLP).
| Search area | Number of neighbouring grid boxes | Range |
|---|---|---|
| ± ∼111 km | n > 100 | µ ± 2.5 σ |
| ± ∼111 km | 15 < n < 100 | µ ± 3.0 σ |
| ± ∼111 km | 5 < n < 15 | µ ± 3.5 σ |
| ± ∼111 km | 0 <= n <= 5 | µ ± 4.0 σ |
| ± ∼222 km | n > 0 | µ ± 4.0 σ |
| ± ∼333 km | n > 100 | µ ± 2.5 σ |
| ± ∼333 km | 15 < n < 100 | µ ± 3.0 σ |
| ± ∼333 km | 5 < n < 15 | µ ± 3.5 σ |
| ± ∼333 km | 0 <= n <= 5 | µ ± 4.0 σ |
| ± ∼333 km | n = 0 | NA (± Inf) |
| ± ∼444 km | n > 0 | µ ± 4.0 σ |
Table 9: Parameters for Bayesian buddy checks.
| Parameter | Value |
|---|---|
| prior_probability_of_gross_error | 0.05 |
| quantization_interval | 0.1 |
| one_sigma_measurement_uncertainty | 1.0 |
| limits | [2, 2, 4] |
| noise_scaling | 3.0 |
| maximum_anomaly | 8.0 |
| fail_probability | 0.3 |
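The role of some of these parameters can be illustrated with a simplified Bayesian update. This is a sketch under stated assumptions, not the marine_qc algorithm: it assumes that gross errors are spread uniformly over ± maximum_anomaly, that good observations are normally distributed around the buddy mean with a total standard deviation supplied by the caller, and that values are reported in bins of width quantization_interval. The limits and noise_scaling parameters are not used, and the function names are hypothetical.

```python
import math


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def gross_error_probability(anomaly, buddy_mean, sigma,
                            prior=0.05, quantization=0.1, max_anomaly=8.0):
    """Posterior probability that an observed anomaly is a gross error."""
    half = quantization / 2.0
    # Probability that a good observation lands in this reporting bin.
    p_good = (normal_cdf(anomaly + half, buddy_mean, sigma)
              - normal_cdf(anomaly - half, buddy_mean, sigma))
    # Assumed: a gross error is equally likely in any bin within the limits.
    p_gross = quantization / (2.0 * max_anomaly)
    numerator = prior * p_gross
    return numerator / (numerator + (1.0 - prior) * p_good)


def fails_bayesian_check(anomaly, buddy_mean, sigma, fail_probability=0.3):
    """Fail the observation if the posterior exceeds fail_probability."""
    return gross_error_probability(anomaly, buddy_mean, sigma) > fail_probability
```

In this sketch an anomaly that agrees with its neighbours ends up with a posterior below the prior of 0.05, while an anomaly five standard deviations from the buddy mean is almost certainly classed as a gross error.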
Figure 18 shows the outcome of the duplicate flagging. Prior to 1950 the majority of observations are unique, with a small number flagged as either a best duplicate or worst duplicate. It was not possible to differentiate between best and worst duplicates for a small number of reports. In these cases, both observations are marked as duplicates with no qualification as best or worst. The impact of widespread data sharing can be seen in the 1950s and early 1960s, with a significant number of best and worst duplicates. Between 1965 and the late 1970s the number of duplicates decreases again. Around 2000 the number of duplicates increases significantly. The rise in duplicates during this period is due to the collation of multiple real time data sources and delayed mode data into ICOADS.
Figure 19 shows the impact of the overall report level QC on the data available through the current data release. In addition to the duplicate check, this flag includes date, location and track checks (see "Report quality control" in section 3.4 for details).
The majority of failed reports are due to duplicates, as shown in Figure 18. An exception is the period around the Second World War, where 15-25% of the records fail the report level QC. The reason for this will be investigated before the next release. The impact of the observation level quality control for the individual parameters is not shown here. Typically 5-10% of the observations are flagged as failed.

Figure 18: Percentage of reports flagged as unique, best duplicate, duplicate, worst duplicate or unchecked in the current data release.

Figure 19: Percentage of reports flagged as unchecked, passing or failing the overall report quality check in the current data release.
The marine data builds upon several decades of effort under ICOADS. Since its outset ICOADS has been served under a fully open licence by NCAR and NOAA/NCEI. Any business built upon ICOADS, e.g. a derivation of the data to provide guidance to marine shipping companies, is perfectly acceptable and encouraged. Further, there are no limits to re-use and third party data sharing. Source decks which did not meet these conditions were not ingested into ICOADS. There are no 'hidden' holdings of more restricted IPR available within ICOADS. There will, however, be marine data (e.g. from research vessels, military, coastguard, bio-monitoring) that have not been prioritised for accession for a wide variety of reasons (including resources, quality, volume, range of parameters, complexity and perhaps uncertain IPR).
The C-RAID project is funded by Copernicus through a contract with the European Environment Agency (contract EEA/IDM/15/026/LOT1, for services supporting the European Environment Agency's (EEA) implementation of cross-cutting activities for coordination of the in-situ component of the Copernicus Programme Services). The C-RAID data set (Zunino Rodriguez et al. [2025]) is distributed under the Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Berry, D., Kent, E., and Taylor, P. (2004). An analytical model of heating errors in marine air temperatures from ships. Journal of Atmospheric And Oceanic Technology, 21(8), 1198–1215. doi:10.1175/1520-0426(2004)021%3C1198:AAMOHE%3E2.0.CO;2
Berry, D. I., and Kent, E. C. (2011). Air-Sea fluxes from ICOADS: the construction of a new gridded dataset with uncertainty estimates. International Journal of Climatology, 31(7, SI), 987–1001. doi:10.1002/joc.2059
Bojinski, S., Verstraete, M., Peterson, T. C., Richter, C., Simmons, A., and Zemp, M. (2014). The Concept of Essential Climate Variables in Support of Climate Research, Applications, And Policy. Bulletin of The American Meteorological Society, 95(9), 1431–1443. doi:10.1175/BAMS-D-13-00047.1
Businger, J., Wyngaard, J., Izumi, Y., and Bradley, E. (1971). Flux-Profile Relationships in the Atmospheric Surface Layer. Journal of The Atmospheric Sciences, 28(2), 181–189. doi:10.1175/1520-0469(1971)028%3C0181:FPRITA%3E2.0.CO;2
Carella, G., Kent, E. C., and Berry, D. I. (2017). A probabilistic approach to ship voyage reconstruction in ICOADS. International Journal of Climatology, 37(5), 2233–2247. doi:10.1002/joc.4492
Chan, D., Kent, E. C., Berry, D. I., and Huybers, P. (2019). Correcting datasets leads to more homogeneous early-twentieth-century sea surface warming. Nature, 571(7765), 393–397. doi:10.1038/s41586-019-1349-2
Chenoweth, M. (2000). A new methodology for homogenization of 19th century marine air temperature data. Journal of Geophysical Research-Atmospheres, 105(D23), 29,145–29,154. doi:10.1029/2000JD900050
Cornes, R. C., Kent, E. C., Berry, D. I., and Kennedy, J. J. (2020). CLASSnmat: A global night marine air temperature data set, 1880-2019. Geoscience Data Journal, 7(2), 170–184. doi:10.1002/gdj3.100
de Bruin, J., et al. (2023). J535d165/recordlinkage: v0.16. doi:10.5281/zenodo.8169000
Freeman, E., et al. (2017). ICOADS Release 3.0: a major update to the historical marine climate record. International Journal of Climatology, 37(5), 2211–2232. doi:10.1002/joc.4775
Kennedy, J. J. (2014). A review of uncertainty in in situ measurements and data sets of sea surface temperature. Reviews of Geophysics, 52(1), 1–32. doi:10.1002/2013RG000434
Kennedy, J. J., Rayner, N. A., Atkinson, C. P., and Killick, R. E. (2019). An ensemble data set of sea surface temperature change from 1850: The Met Office Hadley Centre HadSST.4.0.0.0 Data Set. Journal of Geophysical Research: Atmospheres, 124(14), 7719–7763. doi:10.1029/2018JD029867
Kennedy, J. J., Rayner, N. A., Smith, R. O., Parker, D. E., and Saunby, M. (2011). Reassessing biases and other uncertainties in sea surface temperature observations measured in situ since 1850: 2. Biases and homogenization. Journal of Geophysical Research-Atmospheres, 116. doi:10.1029/2010JD015220
Kent, E., and Taylor, P. (1997). Choice of a Beaufort equivalent scale. Journal of Atmospheric And Oceanic Technology, 14(2), 228–242. doi:10.1175/1520-0426(1997)014%3C0228:COABES%3E2.0.CO;2
Kent, E. C., and Kennedy, J. J. (2021). Historical estimates of surface marine temperatures. Annual Review of Marine Science, 13(1), 283–311. doi:10.1146/annurev-marine-042120-111807
Kent, E. C., Berry, D. I., González, I. P., Cornes, R., and Kennedy, J. (2019). Documentation for marine duplicate identification and linking of platform identifiers. Tech. rep., National Oceanography Centre
Kent, E. C., et al. (2017). A Call For New Approaches to Quantifying Biases in Observations of Sea Surface Temperature. Bulletin of The American Meteorological Society, 98(8), 1601–1616. doi:10.1175/BAMS-D-15-00251.1
Kent, E. C., Woodruff, S. D., and Berry, D. I. (2007). Metadata from WMO Publication No. 47 and an Assessment of Voluntary Observing Ship Observation Heights in ICOADS. Journal of Atmospheric And Oceanic Technology, 24(2), 214–234. doi:10.1175/JTECH1949.1
Kent, E. C., Rayner, N. A., Berry, D. I., Saunby, M., Moat, B. I., Kennedy, J. J., and Parker, D. E. (2013). Global analysis of night marine air temperature and its uncertainty since 1880: The HadNMAT2 data set. Journal of Geophysical Research-Atmospheres, 118(3), 1281–1298. doi:10.1002/jgrd.50152
Lierhammer, L., Andersson, A., Leiding, T., Cornes, R., Kent, E., Siddons, J., and Kennedy, J. (2025a). glamod-marine-processing: Toolbox for GLAMOD marine processing (v8.0.0). doi:10.5281/zenodo.17404810
Lierhammer, L., Kennedy, J., Leiding, T., Willett, K., Atkinson, C., Cornes, R., Kent, E., Smith, T. J., and Andersson, A. (2025b). marine_qc (v0.2.0). doi:10.5281/zenodo.17404319
Lierhammer, L., Siddons, J., Willruth, J. M., Andersson, A., Leiding, T., Cornes, R., and Kent, E. (2025c). cdm_reader_mapper: Common Data Model reader and mapper toolbox (v2.1.1). doi:10.5281/zenodo.17403676
Liu, C., et al. (2022). Blending TAC and BUFR marine in situ data for ICOADS near-real-time release 3.0.2. Journal of Atmospheric and Oceanic Technology. doi:10.1175/JTECH-D-21-0182.1
Moat, B., Yelland, M., Pascal, R., and Molland, A. (2005). An overview of the airflow distortion at anemometer sites on ships. International Journal of Climatology, 25(7), 997–1006. doi:10.1002/joc.1177
Rannou, J. P., Zunino Rodriguez, P., and Hamon, M. (2023). C-RAID drifters NetCDF format reference manual - NetCDF conventions and Reference Tables. doi:10.13155/81638
Rannou, J. P., Hamon, M., and Zunino Rodriguez, P. (2024). C-RAID spring 2024 delivery – activity report. doi:10.13155/92124
Rayner, N., Parker, D., Horton, E., Folland, C., Alexander, L., Rowell, D., Kent, E., and Kaplan, A. (2003). Global analyses of sea surface temperature, sea ice, and night marine air temperature since the late nineteenth century. Journal of Geophysical Research-Atmospheres, 108(D14). doi:10.1029/2002JD002670
Sippel, S., et al. (2024). Early-twentieth-century cold bias in ocean surface temperature observations. Nature, 635(8039), 618–624. doi:10.1038/s41586-024-08230-1
Smith, S. R., et al. (2019). Ship-Based Contributions to Global Ocean, Weather, and Climate Observing Systems. Frontiers in Marine Science, 6. doi:10.3389/fmars.2019.00434
Thomas, B. R., Kent, E. C., Swail, V. R., and Berry, D. I. (2008). Trends in ship wind speeds adjusted for observation method and height. International Journal of Climatology, 28(6), 747–763. doi:10.1002/joc.1570
Willett, K. M., Jones, P. D., Gillett, N. P., and Thorne, P. W. (2008). Recent Changes in Surface Humidity: Development of the HadCRUH Dataset. Journal of Climate, 21(20), 5364–5383. doi:10.1175/2008JCLI2274.1
Woodruff, S., Slutz, R., Jenne, R., and Steurer, P. (1987). A Comprehensive Ocean-Atmosphere Data Set. Bulletin of The American Meteorological Society, 68(10), 1239–1250. doi:10.1175/1520-0477(1987)068%3C1239:ACOADS%3E2.0.CO;2
Woodruff, S. D., et al. (2011). ICOADS Release 2.5: extensions and enhancements to the surface marine meteorological archive. International Journal of Climatology, 31(7, SI), 951–967. doi:10.1002/joc.2103
Worley, S., Woodruff, S., Reynolds, R., Lubker, S., and Lott, N. (2005). ICOADS release 2.1 data and products. International Journal of Climatology, 25(7), 823–842. doi:10.1002/joc.1166
Xu, F., and Ignatov, A. (2014). In situ SST Quality Monitor (iQuam). Journal of Atmospheric And Oceanic Technology, 31(1), 164–180. doi:10.1175/JTECH-D-13-00121.1
Zunino Rodriguez, P., Rannou, J. P., Poli, P., Blanc, F., Carval, T., Billon, C., and Hamon, M. (2025). C-RAID improve the access to historical drifter data: Copernicus Reprocessing of Argos and Iridium Drifters (C-RAID). doi:10.17882/77184
Table A. Table of data sources, with the acknowledgement recommended to be used.
| Dataset release | CDS dataset version | source_id | Source and acknowledgement to use |
|---|---|---|---|
| R8 | 2.0.0 | 1 | ICOADS_R3.0.0T: Freeman, E., Woodruff, S. D., Worley, S. J., Lubker, S. J., Kent, E. C., Angel, W. E., Berry, D. I., Brohan, P., Eastman, R., Gates, L., Gloeden, W., Ji, Z., Lawrimore, J., Rayner, N. A., Rosenhagen, G. and Smith, S. R. (2017). ‘ICOADS Release 3.0: a major update to the historical marine climate record’. International Journal of Climatology, 37 (5), pp. 2211–2232. https://doi.org/10.1002/joc.4775 |
| R8 | 2.0.0 | 2 | ICOADS_R3.0.2T: Liu, C., Freeman, E., Kent, E. C., Berry, D. I., Worley, S. J., Smith, S. R., Huang, B., Zhang, H., Cram, T., Ji, Z., Ouellet, M., Gaboury, I., Oliva, F., Andersson, A., Angel, W. E., Sallis, A. R. and Adeyeye, A. (2022). ‘Blending TAC and BUFR Marine In Situ Data for ICOADS Near-Real-Time Release 3.0.2’. Journal of Atmospheric and Oceanic Technology, 39 (12), pp. 1943–1959. https://doi.org/10.1175/JTECH-D-21-0182.1 |
| R8 | 2.0.0 | 3 | C-RAID_1.2 (202412): Zunino Rodriguez, P., Rannou, J. P., Poli, P., Blanc, F., Carval, T., Billon, C. and Hamon, M. (2025). ‘C-RAID improve the access to historical drifter data: Copernicus Reprocessing of Argos and Iridium Drifters (C-RAID)’. SEANOE. https://doi.org/10.17882/77184 |
This document has been produced in the context of the Copernicus Climate Change Service (C3S). The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose. The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which is merely representing the author's view.