Table of Contents

1. Introduction

The Copernicus Climate Change Service (C3S) has been set up to provide the European community with climate monitoring products and climate services. Important elements of this service portfolio are reanalysis products, and the various generations of the global "ERA" atmospheric reanalysis products of ECMWF, where the most recent version is ERA5, has seen widespread use for a broad range of applications. In C3S it was also decided to develop and produce an Arctic regional reanalysis, which can add value to and complement the global reanalysis products by adding more detail with a Numerical Weather Prediction (NWP) model system well adapted for use in the Arctic. With climate warming going on more rapidly in the Arctic than lower latitudes, this is of broad interest for describing and understanding Arctic weather and climatology as well as the details in the ongoing change processes in the region.

C3S contract 322 Lot 2 was aimed to set up and run a regional reanalysis for the European Arctic region at a 2.5 km grid resolution, covering a 24-years period between July 1997 and June 2021 (later the dataset had been extended to cover the period from 1991 to present). For convenience we use in the rest of this document the acronym CARRA (Copernicus Arctic Regional ReAnalysis) to refer to this NWP code system and also to the dataset produced.

The CARRA reanalysis system is based on the mesoscale Numerical Weather Prediction (NWP) system HARMONIE-AROME (Bengtsson et al, 2017), with a series of adaptation and extensions. HARMONIE-AROME is the common basis behind the operational Numerical Weather Prediction (NWP) system at the Nordic weather services, all of them contributors to this project.

The reanalysis is performed on two domains covering Greenland and Iceland on the west side (CARRA-West), and Svalbard and Northern Scandinavia on the east side (CARRA-East). The CARRA-West coincides with the one currently used by the operational weather forecasting system at the Danish Meteorological Institute (DMI) and the Icelandic Meteorological Office (IMO). The CARRA-East is similar to the NWP domain used by the Norwegian Meteorological Institute (MET) for operational Arctic forecasting, but slightly extended in area. The West and East domains overlap over regions around Svalbard and Jan Mayen.


Figure 1: The two reanalysis domains CARRA-West and CARRA-East.

In this report, we document the main features of the CARRA reanalysis system. We first describe briefly the baseline system with HARMONIE-AROME 40h1.1.1, followed by listing and describing deviations and extensions developed for the reanalysis application. In an Appendix, we provide a summary of the sources for different types of input. In a separate test and evaluation report, we provide evaluation and conclusions from tests of various features in our system configuration, especially those with meteorological significance. We also provide a user guidance document describing the output data and their access in the Copernicus Climate Data Store (CDS).

2. Main characteristics of the CARRA system

2.1. CARRA system baseline

A basic requirement for a reanalysis dataset is that it is performed with a fixed data assimilation and forecast system and an input data set which is as consistent and as comprehensive as possible. For CARRA, the HARMONIE-AROME system, which is used in research and operational application in the international NWP consortium HIRLAM, is used as system basis. HARMONIE-AROME is a kilometer-scale, non-hydrostatic model for limited area application. The system has been developed with primary goal of operational weather forecasting with intermittent data assimilation. HARMONIE-AROME is based on the shared ALADIN-HIRLAM forecast code framework, and is also part of the IFS (Integrated Forecasting System) code framework shared with the ECMWF NWP system (which includes the ERA5 reanalysis system). A detailed description of the HARMONIE-AROME model can be found in Bengtsson et al (2017). Operational implementation of HARMONIE-AROME include configurations for the Arctic regions at Norwegian national meteorological services (Müller et al, 2017b) and at the Danish Meteorological Institute and Icelandic Meteorological Office (Yang et al 2018). For this reanalysis, CARRA, the latest system code 40h1.1.1 is used as baseline, which includes, as default configuration, a 3DVAR data assimilation for upper air, a surface analysis with Optimum Interpolation (OI) scheme for surface and soil parameters, a forecast model including parameterization of surface processes and a set of system scripts for intermittent data assimilation cycling.

The CARRA reanalysis runs with a 3-hourly intermittent data assimilation, making 30-h forecast integration at analysis time of 00 and 12 UTC, and 3-h integration for other base times. For time integrated quantities such as precipitation and surface fluxes, accumulation between +6 and +18 h forecast are possible and recommended for use in order to take into account concerns on moisture spin-up. For instantaneous properties, analyses and diagnoses around base times are used as main reanalysis products.

For application in the CARRA reanalysis, a variety of adaptations and improvements have been necessary to treat the special circumstances which deviate from the usual scenario typical in real time operational forecasting applications. As baseline in observation data usage, the same data sets covering the entire CARRA reanalysis period as used by the Copernicus global reanalysis project ERA5 are used. Furthermore, changes on the code interface were necessary to use ERA5 hourly global reanalysis as lateral boundaries. In addition, to optimize the quality of CARRA, numerous improvements on model and assimilation algorithm have been implemented after extensive tests. Changes have also been made to take on board data from local meteorological service conventional observation archive collections, the re-processed remote sensing data including ocean and sea ice, improved datasets on orography, physiographic database (PGD) and gridded glacier albedo data. These changes to the system are described in the following sections.

2.2. Extensions and developments

2.2.1. System harmonization and efficiency tuning

The CARRA reanalysis production is performed on the High Performance Computation Facility (HPCF) at the European Center for Medium-range Weather Forecasts (ECMWF), which includes use of Meteorological Archival and Retrieval System (MARS) and the ECMWF File Storage (ECFS) system. In order to ensure continuous and efficient production for the 24-year reanalysis on both domains, various technical adaptations have been made with the HARMONIE-AROME scripting system to optimize preparation of input data, job scheduling suites under the ECFLOW tool, post-processing of analysis and model fields and archiving.

Harmonization

The CARRA reanalysis builds on the experience from the two operational Arctic NWP suites in use for operational NWP at DMI/IMO (West domain, Yang et al 2018) and MET Norway (East domain, Müller et al, 2017). Although both of these operational setups are based on HARMONIE 40h1 (Bengtsson et al, 2017), there exist differences in configuration options, most of which are justified by the operational experiences and needs. These include the use of different spectral grids, physiographic database and handling of glacier ice, the options involving mixing phase microphysics, approaches to derive structure functions, sources of observation data, assimilation of snow observation reports, microphysics options etc. For this reanalysis, all of the differences in configurations have been identified, with harmonization efforts applied accordingly with the final outcome to obtain a unified and optimal setup, to ensure consistency between the datasets for both domains.

HARMONIE-AROME allows use of spectral grids such as cubic or quadratic, both of which apply additional spectral truncation in the dynamic part of computation and thereby reduce significantly computational cost. For the operational IGB configuration in DMI/IMO, a cubic grid has been used, whereas in the operational setup for the Svalbard and northern Scandinavia domain (AROME-Arctic, a subset of the CARRA-West domain), a linear grid is used. Taking into account both verification statistics and computation costs, quadratic grid is selected for both of the CARRA domains.

Adaptation and harmonizing efforts in terms of source code changes and use of data sets have also been necessary concerning orographic and physiographic databases used for both domains, as well as the code and script changes to enable reading of glacier snow albedo data, the external tree height data for Scandinavia. More details are in the following sections.

Efficiency tuning
Since the CARRA production is carried out at the ECMWF HPC, various adaptations to that hardware architecture and software environment have been made to improve throughput and to enable sufficient production speed in order to ensure a timely delivery of the reanalysis.

For the cycling data flow, various changes have been implemented to separate the steps in input data preparation (such as those for lateral boundary preparation and pre-processing of observation data ("MakeCycleInput" job family in the HARMONIE-AROME script), and in the analysis and forecast steps ("Date" job family) , so that the former, which is less computationally intensive but include interaction with storage devices MARS and ECFS, can be made well in advance of the computational intensive steps so as not to be in the critical cycling path. As a long integration (up to 30h) is only run at 00 and 12 UTC each day, whereas analysis and forecast for other base times in 3h interval are only for cycling purpose with minimal forecast length, changes have been made to allow triggering of intermediate cycles immediately after the 3h forecast in done at 00 and 12 UTC, so that assimilation and short range integration can be made in parallel while the longer integration is still ongoing.

The "Obsmon_stat" jobs, which generate assimilation monitoring statistics, read data from the Observation Data Base (ODB) as generated during the assimilation step, compute statistics and write out relevant data in Sqlite3 format ("db" files). Originally one job for each observation and ODB type (ecma and ccma) is submitted to the fractional job queue (nf) at ECMWF HPCF, which turns out to be very inefficient, with many jobs wait in the queue for up to an hour. To solve this, the Obsmon_stat jobs are reorganized so that all jobs associated with a specific ODB type are grouped together and submitted as one job on the parallel nodes (np). This reduces queueing time to minimal, so that the production streams become significantly more stable.

At the ECMWF HPCF, the "ODB_IO_METHOD" setting value of 1 (as in the reference 40h1.1.1 setup) has been found to cause a major bottleneck in the screening task. The setting creates numerous small files in thousands, for each type of ODB (odb, odbvar, odb_ccma). This endangers the "inode" quota (number of files) on the disks of the HPC node cca/ccb and also slow down the IO intensive screening task to take up to half an hour, in contrast to the normal throughput of a few minutes on the operational HPC platforms typically seen in DMI and MET computers. A solution has been found to set ODB_IO_METHOD to 4, which creates much less files on disk, as the distribution of ODB data is handled by MPI communication instead. To enable this, changes on the script and FORTRAN source level, which are backported from the HAROMIE-AROME version 43, have been necessary. With this modification, the elapsed time for screening tasks reduced consistently to around 2 minutes and the throughput increased to the targeted levels.

2.2.2. Adaptation and improvement in data assimilation algorithm

Large scale error constraint

In order to run the CARRA regional re-analyses, lateral boundary conditions provided by the global ERA5 reanalysis are used. In contrast to a limited area model (LAM) which is preferable for representing meso- and local scales, global models are superior in representing large-scale, for example Rossby waves with length scales of 1000 km or more. An accurate representation of the large scale is important as it provides atmospheric forcing for convective scale phenomena. The advantage with global models on large scale is both due to a more efficient use of satellite observations in global systems and a more advanced data assimilation method at ECMWF. In addition, an LAM with relatively small domain size has a genuine inability to treat long waves properly. From these considerations, it is important for a LAM system to ensure an adequate coupling with the host model so as to take full advantage of large scale information from the latter.

The HARMONIE forecasting system has two different methods for constraining LAM initial conditions with large scale information from the host model. The first method, referred to as LSMIXBC, is the default option of the HARMONIE system used in the operational NWP. The LSMIXBC scheme provides a weighted average, in spectral space, between the host model fields and the short-range high-resolution HARMONIE forecast fields. This obtained model state is then used as background in variational data assimilation. The weighting procedure is empirically determined, and height and horizontal scales dependent. The optimal configuration of the scheme was deduced based on test results conducted in some member services of the HIRLAM consortium (see section 3b in Müller et al, 2017a).

The second method, referred to as a large scale error constraint, adds an additional term, usually abbreviated as Jk, to the cost-function of the variational data assimilation. The term measures the distance between the initial conditions and the host model state according to a certain matrix. The formulation of the term follows one from Guidard and Fischer (2008).
During the preparation phase of CARRA, both LSMIXBC scheme and large scale error constraint scheme (Jk) were evaluated with host model provided by the global ERA5 analyses. A number of one-month validation experiments for both of the CARRA domains for different seasons and different years were set up for inter-comparison, see Bojarova et al (2019) for more details. From these experiments, it is observed that, for the optimally tuned Jk configuration, it is able to deliver the initial fields for HARMONIE-AROME forecasts of a similar or better quality than the LSMIXBC scheme. Such an improvement is particularly obvious for the upper air humidity fields which are not treated in the current implementation of the LSMIXBC scheme. At the same time, a strong impact of the Jk term on the solution of the minimization problem is observed, which suggests needs for a longer monitoring to evaluate and eventually to tune the scheme. Due to time constraints the default option for treatment of the large scale error (LSMIXBC) is selected for CARRA production, while the Jk large scale error constraint will be considered for future versions of the reanalysis.

Derivation of background error statistics

The background error constraint plays an important role in initializing an NWP model with observations. This is especially important in the Arctic regions where the available number of observations is relatively sparse. Numerical experiments have been done to select the optimal way, from different alternatives, for generating background error statistics for CARRA.
HARMONIE-AROME uses a background error statistics model formulated by Berre (2000). The approach is based on spectral representation of spatial covariances. A time series of forecast differences is taken as a proxy of background forecast error and parameters of background error model are estimated. For CARRA application, the differences between ensemble members driven by ERA5 Ensemble of Data Assimilations (EDA) have been computed to sample the time series of forecast differences. The inverse of the square-root of the background error covariance is used to transform model state variables from the "physical space" to the "control vector space", the elements of the latter are required to be statistically independent.

A two-step iterative procedure has been applied in the generation of the background error covariance, which is used to represent error growth on short time scales. First, the "preliminary" background error covariance (one per CARRA domain) is generated from short range forecasts initiated from interpolated ERA5 EDA reanalyses. The structures of the background error covariance obtained from such downscaled ensembles tend to suffer from spin-up issues due to, for example, adjustment to high-resolution orography, which may last for up to 9-12 hours of forecast lead time. They are hence incapable to represent adequately the background errors at short forecast ranges desirable for the CARRA reanalysis, which runs 3h data assimilation cycling. Instead, such obtained "preliminary" background error covariances are used to generate time series of the high-resolution forecast ensemble with cycling of high-resolution HARMONIE-AROME 3DVAR, in which the large scale ERA5 EDA is only used as lateral boundaries. Such high-resolution ensembles of the 3h forecasts are collected over a 3 week period, with the first ten days of simulations discarded to minimize the effect of spin-up in a cold-start situation, in which some adjustment is expected to "warm up" surface states. This high-resolution ensemble is then used to compute the ensemble differences as a proxy of the CARRA reanalysis background errors, and statistics derived from these are then used as background error statistics for the CARRA reanalysis. It turns out that ca 800 ensemble members is adequate to produce a statistically stable estimate of the background error covariance.

Figure 2.2.2.1 shows a schematic representation of the background error covariance model and the two different approaches available in HARMONIE-AROME to form short range forecast ensembles. The first of such, "EDA" approach, inserts uncertainty into observation space, on scales of the observing system, and uses data assimilation as a device to create analysis increments; in this way the uncertainty from observation space is spread into physical space of the model state. The second approach, "BRAND", on the other hand, does not perturb observations but samples the uncertainty directly in the control vector space perturbing the whole range of scales. Afterwards the uncertainty from the control vector space is translated into the uncertainty in the physical model space using background error covariance model. In both cases, the obtained perturbation in the physical space is added to the short range high-resolution HARMONIE-AROME forecast fields to obtain the initial fields for the ensemble forecasts. This two-step procedure for the generation of high-resolution forecast differences ensures a more realistic representation of background errors for short range forecasts (Bojarova & Gustafsson, 2019).

Figure 2.2.2.1: Schematic overview of the background error covariance representation in the HARMONIE-AROME forecasting system showing the "EDA" and "BRAND" approaches for perturbations to generate ensembles as proxies for background errors.

In order to perform a comprehensive evaluation of "EDA" and "BRAND" approaches, 2 domains x 2 months x 2 derivations = 8 sets of the background error statistics were derived. These sets were evaluated both from the physical soundness of the computed structure functions and based on forecast quality verification of the validation runs (1 month period for winter and 1 month period for summer over both domains) with the use of the above two sets of alternative background error statistics. From validation against observations, the results appear to be largely comparable between the two options, with "BRAND" based structure functions generally more favorable for the humidity parameters. See more details about the diagnosis and results comparing the two approaches in the report on CARRA test and evaluation (Yang et al, 2019). After careful examinations, the "BRAND" based structure functions were selected for the CARRA reanalysis due to a more physical realism.

As expected, the background error statistics for the CARRA reanalysis derived either with "EDA" or "BRAND" approaches reveal little diurnal dependency, but a strong seasonal variability. The latter is associated with different motion scales as well as differences in dominating air masses, and hence weather regimes, during different seasons. To account for such a seasonal variability, a smooth seasonal variation of climatological structure functions has been introduced in CARRA, which is made via a weighting function that computes background error statistics which changes daily by interpolating between the structure functions in winter and summer.

$$B(day) = w(day) \ast B_{winter} + (1-w(day)) \ast B_{summer} \\ w(1 January) = 1; w(1 July) = 0;$$

2.2.3. Improved and extended use of physiographic data

The default physiography input in HARMONIE-AROME 40h1.1.1 includes surface cover types from the ECOCLIMAP/ECOCLIMAP-II datasets (Masson et al. 2003, Faroux et al. 2013) in 1 km resolution, the Global Multi-Resolution Terrain Elevation Data 2010 (GMTED2010) topographical elevation files (Danielson and Gesch et al. 2011) in 7.5 arcsecond resolution, and sand and clay soil fraction data from the Harmonized World Soil Database (HWSD) (Nachtergaele et al. 2010) in 1 km resolution. In CARRA, these data have been made consistent, e.g. in places where their land-sea masks did not match. In remote areas in the Arctic these databases have shown some issues and inaccuracies, as such, further corrections and improvements have been necessary and are described below.

For Iceland, new surface cover types have been added to ECOCLIMAP that reflect the special cover types of volcanic origin in Iceland. The ECOCLIMAP lake database has also been updated. Furthermore, local sand and clay data are used instead of the default. In Figure 2.2.3.1 some examples of these local updates are shown for a specific time of the year. ECOCLIMAP includes climatologies for the relevant variables. These are assumed to be the same from year to year.

Figure 2.2.3.1: Examples of the ECOCLIMAP soil depth (left), vegetation fraction (middle) and leaf area index (right) are shown, with the top row with the original ECOCLIMAP data as in the HARMONIE-AROME, the bottom with the local improved data. Darker colours show higher values and white is zero.

For Greenland, corrections have been made to the land-sea mask, in particular for fjords in the northernmost parts. These corrections are based on coastline and lake data from the Danish Map Supply. Major glacier extent corrections in Northeastern Greenland as used in the Iceland-Greenland operational HARMONIE-AROME are implemented here as well. The glacier extent data in the Danish Map Supply data is of good quality but based on decades old data. Therefore, these have been updated with data from the Greenland Ice Mapping Project (GIMP) (Howat et al. 2014) and OpenStreetmap (Haklay & Weber 2008). The glacier extents are fixed for the whole reanalysis period (1997-2021) at extents as they were approximately in the middle of the period. In Figure 2.2.3.2 the overall effect of the coastline corrections is shown relative to the OpenStreetmap data. Differences in the updated coastline reflect floating glacier tongues that have broken off in recent years. These recent changes, we do not include, since we used fixed glacier extends corresponding approximately to the middle of the reanalysis period.


Figure 2.2.3.2: Comparison of the northernmost Greenland ECOCLIMAP coastline (land areas are gray and sea areas are cyan) and lake (blue areas) physiography with the satellite-derived coastline as given in the OpenStreetmap data (shown with the black lines). The upper plot shows the original ECOCLIMAP data as in 40h1.1, the lower plot shows the corrected and updated data.

For Greenland and areas in Arctic Canada the Leaf Area Index (LAI) climatology has been updated. In Figure 2.2.3.3 examples of these updates LAI data are shown for three days during summer. These LAI data are based on MODIS MCD15A2H C6 multi-year mean values (Yang et al. 2006; Yuan et al. 2011) and have been used to update the ECOCLIMAP cover types for Greenland as shown in Figure 2.2.3.4.


Figure 2.2.3.3: MODIS MCD15A2H v.6 Leaf Area Index (LAI) climatology maps (multi-year means) for Greenland and parts of Arctic Canada centered around three days during summer. Data used to generate seasonal change of LAI for different land-cover classes in ECOCLIMAP-II.


Figure 2.2.3.4: Updated ECOCLIMAP-II cover types based on the Danish Map Supply and MODIS LAI multi-year mean on July 28th.

The topography in Greenland has also been updated using the ArcticDEM v7 dataset (Porter et al. 2018), which has a very high resolution for arctic regions, e.g. up to 2 m for Greenland). This dataset does, however, have several holes and local errors that have been filled and corrected with data from the following sources: ArcticDEM v6, bi-linearly interpolated ArcticDEM v7 and AsterDEM (Hirano et al. 2003). The filling of the gaps was based on a visual inspection of 171 tiles along the Greenlandic coast, and further comparison to the TanDEM-X dataset (Krieger et al. 2007). The resulting elevation map is shown in the right hand part of Figure 2.2.3.5.

Figure 2.2.3.5: Left: Greenland albedo data from a specific day during summer shown with grayscale colours, where black is an albedo of 0.0 and white is an albedo of 1.0. The albedo data over the glaciers is based on daily MODIS 10A1 C6 data, while the albedo data over land areas beyond the glaciers depend on the ECOCLIMAP 10-day climatologies derived from MODIS MCD43A3 C6. Right: The gap-filled and corrected ArcticDEM v7 elevation map for Greenland.

The ECOCLIMAP-II tree height data for Scandinavia have been improved, as shown in Fig 2.2.3.6, with national data from Sweden, Finland and Norway based on LIDAR measurements (Simard et al. 2011; Samuelsson et al. 2018). Note that these tree heights, as used in ECOCLIMAP, for consistency also are given for cover types that do not include trees such as rocks and bare land. The momentum roughness length (z0) for vegetation is calculated as 0.13 times the tree height.

Figure 2.2.3.6: Left: The original ECOCLIMAP-II tree height data for Scandinavia. Right: The updated tree height data based on space-borne and national LIDAR data. The unit of the data is meters.

Figure 2.2.3.7: The original ECOCLIMAP-II glacier mask for Svalbard (left), and the corrected glacier mask for Svalbard (right). The dark blue areas have glacier fractions of 90%-100%, the light gray areas have 0%-10% glacier fraction, and the other blue nuances show partial glacier cover.

The glacier extents in Svalbard and the Russian Arctic in HARMONIE-AROME 40h1.1.1 required several corrections. The default glacier extents for Svalbard shown in the left-hand plot in Figure 2.2.3.7 go far beyond both historical (in the recent centuries) and contemporary glacial cover. The updated version, shown in the right-hand plot in Figure 2.2.3.7, is based on data from the Norwegian Polar Institute (2014) and the Randolph Glacier Inventory (RGI consortium 2015). This glacier update also changes the coastline of sea-terminating glaciers, which made it necessary to update the land-sea mask. Finally, the ECOCLIMAP-II cover type "rocks" now replaces the land areas from which the glaciers have been removed.
Due to national inconsistencies in Scandinavia the Harmonized World Soil Database (HWSD) soil sand and clay data as used in HARMONIE-AROME 40h1.1.1 have been replaced with data from SoilGrids (Hengl et al. 2017). This is done for all areas except Iceland, where local soil data are used. These data have all been downscaled to the 1 km resolution HWSD data grid to fit into the current modelling system, and a map of these data in the West domain is shown in Figure 2.2.3.8.

Figure 2.2.3.8: The updated soil sand fraction for the CARRA-West domain. The darker grays show higher sand fractions. For Iceland local data are used, while SoilGrids sand (and clay) fraction data are used elsewhere. SoilGrids data are also used in the CARRA-East domain (not shown here).

Finally, snow roughness lengths in the HARMONIE-AROME 40h1.1.1 surface physics module SURFEX v7.3 (Masson et al. 2013; SURFEX 2016) are increased from 1 mm to 3 mm for momentum and from 0.1 mm to 0.3 mm for heat. The increase in the snow roughness lengths addresses a positive wind bias over snow-covered surfaces that is assumed to arise from larger scale obstacles formed and covered by the snow (Samuelsson et al. 2018).

2.2.4. Glacier snow initialization, glacier albedo scheme and sea ice

The current version of HARMONIE-AROME does not include a glacier ice model. Therefore, glaciated surface should be permanently covered with snow. Not doing this gives large model errors over the glaciers (Mottram et al. 2017). To avoid this, snow depth of glacier grids is initialized at each cold-start and re-initialized on each September 1st so that the snow water equivalent is 9990 kg/m² where the glacier fraction is larger than 0.99. Elsewhere the snow water equivalent is then reduced to 100 kg/m² if it is higher than 100 kg/m². In non-glaciated areas with less than 100 kg/m² snow water equivalent, no changes are done on September 1st. Where a model grid box has a glacier fraction less than or equal to 0.99, the snow scheme is modified so that the minimum snow water equivalent is a function of the glacier fraction and snow can melt there but never go below the minimum value.

To describe as reliable as possible the radiation coupling over the glaciers between the surface and the atmosphere, the model is adapted to use satellite-derived albedos over the glaciers. These external albedo data are not used directly, rather, they are used as the old-snow albedo minima in the Douville et al. (1995) snow scheme. Thus, freshly fallen snow in the model can affect the snow albedo and inconsistencies in this regard are avoided. The glacier albedo input data is described in section 2.2.5 below. The snow scheme has further been modified to account for excess aging of snow albedo in cold conditions and for too high maximum albedo during melting snow conditions (pers. comm. Willem Jan van de Berg, KNMI, 2015). The excess aging in cold conditions is reduced by setting the Douville parameter τa to zero for snow surface temperatures less than or equal to -5℃. The original Douville value is used at 0℃ and is linearly changing in between 0℃ and -5℃. During melting conditions the maximum snow albedo, αmax, is set to 85% of its original value.
As for the glacier albedo data, the model has been adapted to be able to use satellite-derived sea surface temperatures (SSTs) and sea ice concentrations (SICs) directly as input. These data are detailed in section 2.2.5.

For the modelling of sea ice, the SICE scheme (Batrak et al. 2018) is used in the HARMONIE-AROME modelling system.

2.2.5. Surface information on glacier and albedo

Initially it was planned that external albedo data should only be used for exposed glacier surfaces and not those with snow cover. As described in section 2.2.4, the glaciated surfaces are initiated with 9990 kg/m² snow water equivalent each year on September 1st. When this is reduced to be less than 9990 kg/m², mainly during summertime, the glacier surface will be exposed and external albedo data is used. The albedo complexities of glacial ice (e.g. Ryan et al. 2016) are not trivial to simulate, but in a reanalysis they can be prescribed from satellite data. Correct glacier albedos are essential to producing realistic ablation rates (Langen et al. 2017). After testing both the default (Douville et al. 1995) and the more advanced CROCUS (Brun et al. 1992; Vionnet et al. 2012) snow albedo schemes, it is found that both produce too dark albedos over the snow-covered glacier parts associated with assumptions of snow-aging and pollution when no fresh snowfall occurs. The reason for this is that the CROCUS albedo scheme is designed for the conditions in the Alps and not for the conditions in the Polar Regions (pers. comm. Descharme, B., 2018; pers. comm. Brun, E., 2018). In view of these factors, satellite-derived albedo data are applied for all glaciated areas in CARRA - for both snow-covered and exposed glacier surfaces. The satellite-derived albedos are not used beyond the glaciers.

Daily gap-filled 500 m resolution broadband albedo are used for all glaciers in both of the CARRA domains with the exception of the smaller glaciers in Northern Scandinavia. The MOD10A1 C6 product from the Moderate-resolution Imaging Spectroradiometer (MODIS) on the Terra satellite is used for the 20 year period from 2000 to 2019. The Geological Survey of Denmark and Greenland (GEUS) provides a gap-filled and denoised albedo product (Box et al. 2017). This has been tested against in situ measurements on the Greenland ice sheet and found to have a root mean square difference of 0.08 for stations in the ablation zone, and 0.05 for stations in the ice sheet accumulation area (Box et al. 2017). For comparison, in ERA5 the forecast albedo for all glaciers is assumed to be 0.85. In Figure 2.2.5.1 and in the close-up in Figure 2.2.5.2, it is illustrated how much the satellite-derived albedo deviates from that constant value. Also in the left-hand plot of Figure 2.2.5.1, it can be seen that the glacier extents in northern and eastern Greenland and in Svalbard are excessively large. In the left-hand plot in Figure 2.2.5.2 an example of the albedo data for Greenland are shown in grayscale.

Figure 2.2.5.1: Comparison of albedos from the 31st of July 2012 as used in ERA5 (left) and as available from the MOD10A1 C6 product (right). Both plots have the same colour scale as shown in the lower right corner. Only the MOD10A1 C6 albedos over glaciated areas are used in the CARRA reanalysis.


Figure 2.2.5.2: A close-up of the ERA5 and MODIS 10A1 C6 data also shown in Figure 2.2.5.1. The region shown from south to north covers the Greenlandic cities of Nuuk, Sisimiut and Aasiaat.

For the CARRA period, the glacier albedo for the period 1996-1999 is approximated by a multi-year average daily MOD10A1 C6 value which is derived from data during the period 2000-2006. Such data from MODIS will be available at least until the winter of 2019/20, but there is some uncertainty regarding later availability, with increasing risk of failure or decommissioning of the system. For the period from 2017 onwards, glacier albedos derived from Sentinel-3 Ocean Land and Colour Instrument (OLCI) data (Kokhanovsky et al. 2019) are available on a daily basis with 300 m resolution. These ESA Snow and Ice data are available in near real-time with under 24 h latency from the end of 2019. They are available from the CRYOCLIM, PROMICE and PolarTEP data portals. We hope to continue with MODIS data until 2021 for consistency; otherwise use of the Sentinel-3 OCLI data for the final period is believed to be a good backup solution.

2.2.6. High resolution ocean surface data for sea surface temperature and sea ice cover

The CARRA reanalysis uses specially produced high resolution data sets for Sea Surface Temperature (SST) and Sea Ice Concentration (SIC), which are derived from several reprocessed satellite products with both regional (the Baltic Sea) and global coverage. The satellite-based products are derived as are gap-filled Level 4 (L4) interpolated fields, and interpolated onto a 0.05° regular latitude-longitude grid from 56N to 86N and 110W to 90E. This region covers both model domains in CARRA. Table 2.2.6.1 provides an overview of the SST and SIC data products which have been used for the CARRA SST and SIC input.

Two different SIC products have been used: the European Space Agency Climate Change Initiative (ESA CCI) SIC product (SICCI; Toudal Pedersen et al., 2017) and the EUMETSAT OSISAF SIC product, OSI-450 (Tonboe et al., 2016). These SIC products have been developed using the same algorithms and processing chains. However, the OSISAF SIC product is produced by using information from the coarse resolution (30-60 km) passive microwave instruments available since October 1978: Satellite Multichannel Microwave Radiometer (SMMR), the Special Sensor Microwave / Imager (SSM/I), and the Special Sensor Microwave Imager and Sounder (SSMIS), while the SICCI product is based on the medium resolution (15-25 km) microwave sensors: the Advanced Microwave Scanning Radiometer - Earth Observing System (AMSR-E; 2002-2010 on board Aqua US satellite) and the Advanced Microwave Scanning Radiometer 2 (AMSR2; 2012-today on board GCOM-W1 Japanese satellite). For more information on the SIC products and the underlying algorithms see also Lavergne et al. (2019). In the CARRA reanalysis, the SICCI product is used whenever available (/- 5 days), else the OSISAF product is used to fill the gap (/- 5 days). Lists of the missing data are available in the product user guides.

Traditionally, SIC data sets derived from passive microwave observations have challenges in the coastal regions (see e.g. Lavergne et al., 2019). Therefore, within the Baltic Sea, Kattegat Bay and the Skagerrak Strait (eastward of 8E; referred to as the Baltic region from here) the SST and SIC fields are derived from the Baltic Sea SST and Sea Ice L4 analysis (Høyer and Karagali, 2016) and the Baltic Sea operational SST and SIC L4 analysis (from the Copernicus Marine Environment Monitoring Service, CMEMS, see Høyer, 2016), which include high resolution sea ice information from the Swedish and Finnish ice services.

The baseline SST product (used outside the Baltic region) is the ESA SST CCI Analysis Long Term Product (Merchant et al., 2016), which covers the period 1991 to 2010. From 2011 and onwards the Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA; Donlon et al., 2012) is used for SST input outside the Baltic region.
To aid the analysis, another SST product has also been included from the Canadian Meteorological Center (CMC). The product merges infrared satellite SST with microwave data and in situ observed SST from drifting buoys and ships (Brasnett, 2008; CMC, 2012). The CMC SST is not used directly as input for the CARRA reanalysis but is used for filtering the SIC data. The filtering of the SIC datasets is described below.

Table 2.2.6.1: Overview of SST and SIC input data in CARRA

The SIC fields have been extrapolated along the coasts to cover the fjords with reasonable values. In the processing chain SIC<15% is defined as no ice for the Baltic Sea ice, the OSISAF and the SICCI product. The interface occurring, when merging the Baltic Sea SST product with the ESA CCI SST product, has been smoothened by averaging along the interface. The SST and SIC consistency analysis performed in relation to the development of the CARRA reanalysis showed that L4 SST can be a beneficial filter to remove spurious ice in the sea ice products. The reason is that the SST fields in general have fewer coastal issues, due to the use of higher resolution infrared observations. Sea ice (outside the Baltic region) is removed if the SST exceeds a given threshold in either of the two global SST products (ESA CCI SST and CMC). The SST threshold differs slightly between the two products (depending on the SST average and trend within the domain for each product) and varies linearly over time from about 1.5 to 2.5°C for the period 1995-2017.

Figure 2.2.6.1 shows the number of cases with SIC>15%, which are removed due to the SST filtering described above, for the period 2003-2010, using the SICCI product and the OSISAF product, respectively. Sea ice is typically removed in coastal areas and in the marginal ice zone (MIZ). Especially, the OSISAF product is contaminated by land spillover effects in coastal areas (see e.g. the Barents Sea, Iceland, Disko Bay and Svalbard) due to the coarser resolution sensors, and these effects are removed by the SST filter in the final CARRA SIC product.


Figure 2.2.6.1: Number of observations (Nobs) with sea ice removed by the SST filter during 2003-2011 for the SICCI product (left) and the OSISAF product (right), respectively.

The resulting CARRA sea ice extent (SIE) has been compared to the SIE of the OSISAF and SICCI products, and the gap in the SICCI record October 2011 to July 2012 has been highlighted to evaluate the consistency of the SIE in the CARRA product during the transition between the SICCI and OSISAF product, see Figure 2.2.6.2. The CARRA SIE shows good agreement with the SICCI SIE throughout the time period. No abrupt changes in SIE are visible at the time of the SICCI/OSISAF transition, where the CARRA SIE stays at similar lower-level when compared to OSISAF, as in the previous and later years.


Figure 2.2.6.2: Sea ice extent in the period 2002-2017 for OSISAF, SICCI and the CARRA product, respectively. The dashed ellipsis in the upper panel indicates the period of missing SICCI data October 2011 to July 2012. The encircled period is found in larger scale as the middle peak within the lower panel.

A mask was created to remove spurious sea ice concentrations of 30-40% in Breiðafjörður, Iceland during March and April.
In view of the fact that the high resolution sea states time series are sometimes patched products with different sources of data, several parallel reanalysis experiments with CARRA have been made to evaluate the sensitivity and impact to analysis quality. The periods selected are for May-July 2002 and May-July 2012, both of the period went through change of main data sources from OSISAF to ESA CCI. From these investigations, it is concluded that the patched data sets do not reveal noteworthy differences in error trend during data transitions, for the key reanalysis parameters such as T2m.

2.2.7. Enhanced surface temperature and moisture observation data from non-GTS sources

For CARRA surface data assimilation, station data of 2-metre relative humidity, 2-metre temperature and snow depth are used. Input data consists of those retrieved from the database of the global ERA5 reanalysis and additional data collected from national databases of the Nordic meteorological services and other external sources, see below. The ERA5 database collects, amongst others, all observations that the National Weather Services (NWS) provide to WMO's Global Telecommunication System (GTS). However, often there exist higher temporal resolution time series and additional stations in the local archives of NWSs and external data providers. Furthermore, the NWS services and other data providers usually conducts posteriori considerable quality control, which typically do not reach GTS. For the regions covered by the CARRA reanalysis, the surface observation network in many of the areas is extremely sparse. Thus, efforts have been devoted to collect as much as possible non-GTS data for both assimilation and verification purposes. For the domain of CARRA-West, e.g., additional station data has been collected via DMI for the networks of PROMICE (Programme for Monitoring of the Greenland Ice Sheet, Denmark), ASIAQ Greenland Survey, and GC-NET (Greenland Climate Network). At DMI, wherever available, an efforts have been put to quality assure numerous station observation time series along Greenland coast, resulting a much improved quality and consistency for surface observations of pressure and temperature. A data merging procedure is set to incorporate these data with the ones obtained from the ERA5 archives. In case where both ERA5 and local data are present, the local observations are chosen by preference. This is, e.g., the case for numerous coastal Greenland stations managed by DMI.

For some of the non-GTS stations it was an issue that a station identification is needed in the system, but they had not been assigned any WMO-station ID. In such cases pseudo-WMO-station numbers were assigned for digestion in the CARRA surface data assimilation system and in verification. The station IDs are assigned typically with 2 to 5 digits, and should ensure that the final ad hoc station ID will not be classified into stations groups belong to other regions. The following naming convention is used:

Pseudo-WMO-station ID= Country-specific-number + Local-station-ID

The country-specific part of the ID numbers are as follows for observations in the various countries:

    • Norway: 100 000
    • Sweden: 2 000 000
    • Finland : 2 700 000
    • Iceland : 4 000 000
    • Greenland : 4 200 000

Figure 2.2.7.1: Temperature observations from stations of the national databases (red dots) and of the ERA5 database (blue circles) for January 2000 over the CARRA-West (left) and CARRA-East domains (right).

In Figure 2.2.7.1, one can see the additional contribution of the national databases (red dots) to the ERA5 database (blue circles) for January 2000 over CARRA-West (left) and CARRA-East domains (right). For more recent years, the NWS of Norway, Sweden and Finland deliver also their high-resolution station network to GTS. Thus, the difference between the national databases and ERA5 database becomes less pronounced over those countries during the later stages of CARRA period.

2.2.8. Additional snow observation data in surface analysis

In HARMONIE-AROME 40h1.1.1, the presence of snow on the ground and the amounts of snow is modelled by a one- layer snow scheme Douville et al. (1995), with snow water equivalent (SWE), snow density and snow albedo as prognostic variables. The SWE is adjusted to observed snow depths in the snow analysis to correct for deficiencies in the snow scheme or forcing (precipitation, temperature). The snow analysis is performed by CANARI (Taillefer, 2002) and based on the Optimal Interpolation method. Only conventional snow depth observations are used in the present operational HARMONIE-AROME. The snow analysis is performed once daily at 06 UTC when most snow depth observations are available. The performance is good in regions with representative observations. In view of the extremely limited coverage of conventional observations in the CARRA domains outside of the European continent, e.g. Greenland, Iceland and Svalbard, use of a satellite snow observation data set becomes relevant for CARRA.

For conventional snow depth observations, reporting practice varies from country to country. Some countries (e.g. Germany) deliver hourly observations on GTS. Other countries (e.g. Finland, Sweden and Norway) reports mostly only once a day, at 06 UTC. Iceland reports only a few snow depths at 06 UTC, and many more at 09 UTC and later on the day. Greenland stations have no snow depth observations. The ability or willingness to report "0" snow depth observations in summer (instead of missing values) also vary. The "0" observations are of great value from the NWP point of view, as it is more important to discriminate between no snow/snow than to tell whether there is 1 or 3 meters of snow. For CARRA, efforts have been devoted to improve the availability of snow observations, including

  • national data that were not delivered to GTS;
  • more "0" observations in the melting season and summer;
  • more quality control on snow depth observations extracted from local databases.

Some countries have in addition to the network of synoptic stations also a network of stations observing e.g. precipitation and often also snow depths. On an earlier initiative from ECMWF, snow depth observations from these networks have been made available on GTS; from Swedish "climate" stations since December 2010 and from Norwegian "precipitation" stations since March 2013. The increased number of snow depths observations from North Scandinavia on GTS is visualized in Figure 2.2.7.1 showing observations used in ERA5 as blue circles and observations available for use in CARRA as red circles for January, 2000 (left) and January, 2017 (right).

Figure 2.2.8.1: Snow depth observations used in ERA5 (blue) and in CARRA (red) for January, 2000 (left) and January, 2017 (right).

A few years ago, assimilation experiments with satellite snow extent in addition to snow depth observations were performed with HARMONIE-AROME. "CryoRisk Snow extent" based on AVHRR data was used in the HARMONIE snow analysis in analogy with the way ECMWF uses NOAA/NESDIS Snow Extent data. From the experiments with CryoRisk for the melting season (March-May), it showed improved snow extent and also T2m scores (Homleid and Killie 2013). A corresponding satellite snow extent product has been produced in the CryoClim project and is available daily at 5 km resolution for the period 1982-2015. CryoClim is a global, optical snow product with 5 km resolution based on historical AVHRR GAC data (A2 FCDR). A Bayesian approach is used to combine information from optical and infrared AVHRR channels to estimate the probabilities for the classes "snow", "snow-free land" and "cloud". The data set is accompanied with a set of statistical coefficients giving mean and standard deviation values for each class and each AVHRR spectral band or a combination of spectral bands. These coefficients are derived from training data mainly from a large number of manually classified satellite images. Each swath is processed individually. In a second step, the individual swaths are gridded and averaged to daily products. More details are found in Solberg et al. (2017) and Killie et al. (2018).

The EUMETSAT H-SAF snow extent product H-32 (Metop/AVHRR) was also considered for use in CARRA. H-SAF achieved operational status in the autumn 2017. Reprocessed data is available since January 2015. The algorithm uses all daily multichannel images from AVHRR on board Metop satellites to classify land pixels as snow covered, partially covered by snow or snow free. More information about the product is available on the LSA SAF web site (https://landsaf.ipma.pt/en/); Product User Manual (PUM), Algorithm Theoretical Basis Document (ATBD) and Validation Report (VR). Comparisons of CryoClim and H-SAF for the spring 2016 show that the products agree quite well on the snow/no snow decision, but might differ in the decision of classified/not classified. For CARRA production, the CryoClim data is ultimately selected in view of its extended coverage for the whole period until 2015. For the period after 2015, arrangement has been made with MET Norway so that data for the later period is also produced.

The Surface scheme SURFEX 7.3 as used in Harmonie 40h1.1.1 does not include a separate glacier model, thus the 1-layer snow scheme is used also over glaciers. As mentioned in 2.2.4, the extent of the glaciers coverage is corrected and updated, and the initial amount of snow (SWE) on the glaciers is set to 10 tons/m2 to avoid snow free glaciers and unrealistically hot surface during melting season. The snow amount on the glaciers is also reinitialized yearly at 1. September with 10 tons/m2. For the SWE amounts on the glaciers, snow analysis is not applied.

2.2.9. Extensions in upper-air assimilation on use of conventional observations

Radiosonde bias correction:

ECMWF has developed a bias correction on radiosonde temperatures to account for long term drifts in the observations. These have been derived from departure statistics and by comparing to neighboring stations, see Haimberger et al. 2012 and Haimberger et al. 2017. For CARRA, the same corrections are applied. In view of technical reasons associated with the use of different code versions, the correction is implemented in the 3DVAR screening procedure, in contrast to the case in ERA5, which applies the procedure in the minimization.

Blacklisting:

When using historical observations it is important to utilize prior knowledge of the quality of the observations going into the data assimilation. E.g. for each satellite instrument the data may need to be blacklisted in some circumstances. For conventional data, particular aircrafts or synoptic stations can be blacklisted. In order not to "reinvent the wheel", the blacklist information from ERA5 is used, which is an accumulation of knowledge about the global observation system based on the experiences with several phases of the global reanalyses at ECMWF in assimilating full range of observation data. As there are differences in the way ECMWF and HARMONIE-AROME uses observations, the blacklist data from ERA5 needs to be adapted for the latter, mainly to retain the historical knowledge of each observation platform from ECMWF. On the other hand, the blacklisting framework is flexible so that additional listing on local observations as identified within the monitoring process can be conveniently included. The blacklist data from ERA5 consists of 3 files covering 3 different time periods. For CARRA, these files are merged into one so that one set of binary executables could be used by all production streams. To achieve this, the HARMONIE-AROME build system (makeup) was adapted to it.

2.2.10. Improved use of satellite observations

HARMONIE-AROME is designed for operational real time NWP. To carry out reanalysis on the ECMWF HPC, the procedure used in ERA5 to fetch observations ("fetchobs") from both ECFS and the MARS archive was implemented in HARMONIE-AROME, with the adaptation that retrieval of different data types for both conventional and satellite data is done by separate scripts.

Observation data in MARS and ECFS are stored in 6 hour time slots, plus/minus 3 hours from 00, 06 12 and 18 UTC. The fetching procedure for CARRA retrieves these 6 hourly data from the archives and concatenate observation files for intermediate cycles at 03, 09, 15 and 21 UTC. For example, for the 09 UTC cycle, observations from 06 and 12 UTC are concatenated. Then, the system is able to select the observations, which are in the assimilation window. Figures 2.2.10.1 and 2.2.10.2 show the availability of the microwave radiance sensors and satellite wind retrieval datasets, respectively, during the CARRA reanalysis period.

CARRA assimilates bending angle from reprocessed GPS radio occultation data sets as provided by the Radio Occultation Meteorology Satellite Application Facility (ROM SAF), which is a decentralized operational RO processing center under EUMETSAT. The RO data as used in CARRA is available at: http://www.romsaf.org

The processing of the scatterometer data in HARMONIE-AROME differs from that in ERA5. The data sets are collected in advance from the EUMETSAT OSISAF. Reprocessed climate data records (CDR) for ERS Scatterometer (OSI SAF, 2017), OceanSAT/OSCAT (OSI SAF, 2018), QuikSCAT/SeaWinds (OSI SAF, 2015) and Metop/ASCAT-A (OSI SAF, 2016) are selected in order to achieve consistent data quality. For the time after April 2014, operational near-real time Global OSI SAF "coastal" wind product based on Metop/ASCAT (OSI SAF, 2019) is utilized. The data is first stored at ECFS and fetched by a separate script during pre-processing step in CARRA cycling. The processing and assimilation of scatterometer data in HARMONIE is described in Valkonen et al. 2016.

For microwave and IR sounding data, due to the relatively low model top (10 hPa), only channels 5 to 9 from AMSU-A and channels 3 to 5 form AMSU-B/MHS are selected for assimilation. Details about the use of the ATOVS radiances in HARMONIE over different scenes and atmospheric conditions, as well as the assimilation of IASI radiances, is described in Randriamampianina et al. 2019. The MSU radiance observations (channels 2 to 4) are selected following the ERA5 decision as described in the blacklisting file. The radiance data is bias corrected in variational way (VarBC, Auligné et al. 2007), where the coefficients are aggregated and updated daily for each assimilation time following Randriamampianina et al. 2011.

In CARRA system, the AMSU-A data is assimilated with the dynamic emissivity scheme in screening, which is seen to give more observations through screening and minimization, both over land-ice and sea-ice. This is built on methods which retrieve surface emissivity from window channel observations with help of atmospheric profile and surface temperature information from the first guess used in assimilation (see Karbou et al. 2010a, 2010b, Bormann et. al. 2017). When the dynamic emissivity retrieval fails, the alternative use is a set of monthly emissivity atlases. These atlas files are provided by Philippe Champon and Florian Suzat in Meteo France. One advantage with these atlases is that one can take into account seasonal variability of surface emissivity by changing of atlas every month. Validation experiments for the CARRA-West domain has been set up to evaluate the sensitivity with use of the emissivity atlases. Based on these investigations, dynamic emissivity is used for AMSU-A data over sea and sea-ice. The use of the surface emissivity atlases allows the assimilation of low peaking channel, in this case channel 5, from microwave radiances over sea ice.

Figure 2.2.10.1: Overview of availability of the microwave radiance observations in CARRA. The availability of IASI infrared sounder data is identical to the METOP-A and METOP-B time coverage for AMSU in the figure.

Figure 2.2.10.2: Availability of retrieval wind (scatterometer and AMV) observations in CARRA. (Time axis subdivision is somewhat coarse.)

2.2.11. Verification and monitoring system

The backbone of the quality assurance measures in CARRA is the HARMONIE utility packages GL/MONITOR, which provides tools to monitor data assimilation and verify model results to those of observations. For verification, the MONITOR tool is used for quality check of both the near-surface and the upper-air parameters against in-situ data on the synoptic and radio sondes observation network. MONITOR is a self-contained, stand-alone package dealing with pre-extracted model and observational data for a set of sites in a given station list. With these datasets, a wide range of verification metrics are calculated for a set of parameters, regions, lead times etc. For CARRA, a number of adjustments have been made, the main ones including;

  1. Setting up an extended station-list so as to enable quality check of model data against observations with both GTS-distributed and locally collected observations for the full CARRA period.
  2. Conversion of local observations into the MONITOR format and also merge with observations extracted from BUFR-files in the MARS archive.
  3. Extraction of ERA5 data for the extended station lists.
  4. Merge of extracted CARRA on observation sites between the two domains. For area with overlapping, data from CARRA East domain is used to consider boundary effects.
  5. Dividing verification sites into different categories in order to better highlight strengths and weaknesses of the model systems (e.g. sub station lists for coast, fjords, inlands, mountains which may reveal different verification behavior). One particular category of such is the group of stations that have been operating for the full CARRA period, which can be used to monitor the evolution of forecast quality through the whole period in a homogenous way.

To complement the MONITOR verification, a set of R-scripts have been prepared to produce time series (quality evolution through the full CARRA period), scorecards (summarizing verification and include information on the significance of verification results) and variograms (compare spatial correlations in observations, CARRA and ERA5). In addition, scripts have been prepared to download and make CARRA and ERA5 fields available for in-house visualization programs to allow for the possibility for efficient case studies and simplified field diagnostics. Most of the verification statistics figures to be presented in the next section have been produced with this system.

3. System evolution and quality assessment

During the preparation stage of CARRA, several phases have been gone through to adapt the baseline system HARMONIE-AROME 40.1.1.1 into a system well suited for the reanalysis purpose. In this connection, an extensive series of numerical experiments have been performed for different episodes selected from various phases of the 24-year period, to evaluate and confirm the implemented features and adaptations from both a technical and meteorological point of view. During the preparation phase, the CARRA reanalysis repository, the collection of all sources, scripts and utilities, went through gradual evolution in form of alpha-, beta-, release candidate and final releases with extensive testing and evaluation throughout.

Quality assurance has been carried out during the entire CARRA project throughout preparation and production phases (see section 2.2.11), with a goal to ensure a solid and healthy meteorological quality. As the most common means of quality monitoring, CARRA data sets are routinely validated against in-situ observation data, and inter-compared to the corresponding global ERA5 reanalysis, which is used as a quality benchmark to ensure that CARRA data maintain its added value over the corresponding data sets from coarser resolution global reanalyses.

Figure 3.1: Time series of the averaged STD and BIAS errors (Y-axis) with CARRA reanalysis (in green) validated by surface observations in CARRA domain, in comparison to those by ERA 5 (in red) for 2 m temperature (left panels) and 10 m wind speed (right panels) for the month of Sept 1997 (panel 1), Jan 2000 (panel 2), April 2007 (panel 3) and Jan 2017 (panel 4).

Figures 3.1 show the averaged standard deviation (STD) and bias error time series of 2 m temperature (T2m) and 10-m wind speed (W10m) validated against surface observations with CARRA and ERA5, for 4 month-long episodes initially selected as test episodes representing different seasons and periods of CARRA. A clear advantage is seen in general with CARRA data for both of the parameters, especially in observation fit for 2 m temperature for both STD and BIAS, demonstrating added values with high resolution representation.

Figure 3.2: Vertical profile of averaged STD and BIAS errors on key pressure levels (Y-axis) with CARRA reanalysis (in green) validated by surface observations in CARRA domain, in comparison to those by ERA 5 (in red) for Temperature (left panels) and wind speed (right panels) for the month of Sept 1997 (panel 1), Jan 2000 (panel 2), April 2007 (panel 3) and Jan 2017 (panel 4).

Figures 3.2 show the corresponding validation inter-comparison between CARRA and ERA5 reanalysis data for upper air temperature and wind for the selected 4 months, both in comparison to radiosonde measurements. It is obvious that even for upper air parameters, the regional reanalysis data sets with CARRA clearly outperform ERA5 in terms of fit to observations. Similar tendencies are found also for humidity parameters and geopotential heights to varying degrees (not shown here). This seems to be a significant contrast to common experience with LAM models in routine short range weather forecasts, in which global model often show advantages in forecasts of parameters representing primarily large scale properties. The fact that our regional CARRA reanalysis appear superior to those of the global ERA5 also for large scale properties seems to indicate that LAM models do have competitive quality compared to global models even for large scale quantities, only that in real time forecast, its skill in forecast of large scale properties have been affected negatively by the use of time-lagged global forecasts (instead of analyses) as lateral boundaries. In fact, due to constraints in requirement about forecast delivery time and availability of global forecast data, short range LAM forecasts usually need to use global data in a time lagged way in order to launch forecast with short data cutoff, causing deterioration in forecast quality. Whereas for regional reanalysis using ERA5 as lateral boundary conditions, such timeliness of delivery is no longer required, enabling LAM reanalysis to compete against global ones in a more fair and levelled ground, revealing true skills of the regional model system.

To further examine the performance of the CARRA re-analysis taking into account regional aspects, Tables 3.1 and 3.2 list the BIAS and STD for Mean Sea Level Pressure (MSLP), T2m and W10m in the reanalysis (+0 h forecast), for summer months (June, July and August) in 2015 and winter months (December, January and February) in 2014/2015. These summer and winter periods are selected from those already completed production periods in CARRA, and are considered representative of what has been seen in other periods.
For MSLP, both CARRA and ERA5 are in good agreement with each other and with observations. Furthermore, for inland and highland regions, and in particular in winter an added value is seen by CARRA thanks to smaller errors than ERA5. A possible reason for this can be the inaccuracy in reduction of pressure to MSLP (Pauley, 1998), which employs temperature that is better captured in CARRA (see below).

For T2m, there is a significantly better fit to observations with CARRA. This is valid for BIAS and STD for both seasons and almost all regions. The advantage in bias is most pronounced in regions with complex topography and/or land-sea contrasts. In contrast to a large bias in ERA5 (> 1°C in absolute value) in Greenland glacier (GCNET) stations, Norway inland stations and mountains in summer and for coast, fjords and highlands in winter, the bias error in CARRA is substantially lower. With a few exceptions (i.e. Iceland fjords and Scandinavian mountains in summer, and Greenland GCNET in winter) the T2m biases in CARRA are less than 0.5°C in absolute value. The STDs are in general larger in winter than in summer for both systems. This makes the added values from CARRA more pronounced in winter. Employing height corrections between forecast system and observation heights (not included in the results presented here) do not change the results largely (not shown). In particular during winter, any height correction is challenging due to the typical vertically stable stratification of the atmosphere. It is believed that the added value by CARRA origin from both improved resolution and increased amount of observations (in particular near-surface temperatures) entering the system.

Unlike T2m observations, which have been used in the surface assimilation, 10m wind observations over land are not used in the surface or upper air assimilation. CARRA produces higher wind speeds than ERA5, both in summer and winter and for almost all regions. With a few exceptions, this results in a significantly improved in W10m bias for all regions. However, for inland Norway and the Icelandic fjords CARRA shows a positive W10m bias larger than the underestimation of wind speed in ERA5 in absolute value. For STD, CARRA and ERA5 are quite similar, but it varies between region and season which scores better. However, taking into account the fact that STD is expected to scale with wind speed itself and hence higher in windier regions and forecasts, the results are considered overall in favor of CARRA. One shall also bear in mind that observational, interpolation and representativeness errors play an important role in explaining differences between gridded forecasts/analysis and point observations. In a case study in Køltzow et al. (2019) it was shown that ~ 25% (T2m) and ~ 40% (W10m) of forecast errors could be attributed to sub-grid scale variability. Besides bias and STD, W10m can also be evaluated by time series of monthly calculated Equitable Threat Score (ETS) for high wind events (Figure 3.2.5). For each month, the observed 90%-tile is calculated for the East and West domain separately, then this value is used as a threshold for calculating the ETS. With this, it is clearly shown that CARRA better captures high wind events better than ERA5 (for both domains and all production streams). The relative improved scores are higher for the west domain, is partly also due to lower ETS scores for ERA5 for that area.

To evaluate and compare the spatial patterns in CARRA and ERA5, the correlation between all observation sites for MSLP, T2m and W10m are computed. These are then averaged in bins by the distance between the stations and plotted as variograms (Marzban et al., 2009) in Figure 3.2.6. A rapid decorrelation with distance indicates stronger dominance of small-scale features. While for MSLP, CARRA and ERA5 are in good correspondence with observations and show a slow decorrelation as expected for a parameter representing the large scales, for both T2m and W10m, the decorrelation with distance is much higher, indicating more small scale patterns which is better captured by CARRA. However, CARRA is not able to entirely follow the spatial decorrelation in the observations. This may partly be explained by a considerable sub-grid variability even at 2,5 km horizontal resolution (e.g. Køltzow et al., 2019). The variograms presented here are for summer 2015 for the CARRA- West domain, but similar results are obtained for other periods and the CARRA-East domain (not shown).

Table 3.1: Summer verification CARRA (CA) and ERA5 (ER). Verification of MSLP, T2m and W10m for June, July and August 2015 for a selection of regions in CARRA domains. Green (red) numbers are used when CARRA is verifying better (worse) than ERA5. Colored cells are used to highlight larger differences (STD differences larger than 20% and bias difference are larger than 0.25). Asterisks denote significant differences (95th percentile confidence intervals calculated by bootstrapping is not over-laying each other).

Verification June, July and August 2015

MSLP

2m air temperature

10m wind speed

BIAS

STD

BIAS

STD

BIAS

STD

Reanalysis

CA

ER

CA

ER

CA

ER

CA

ER

CA

ER

CA

ER

Svalbard

0.0

0.0

0.5

0.6

0.0*

-0.9*

0.7*

1.8*

-0.2*

-1.1*

2.3

2.5

Greenl. coast

0.3

0.3

0.6

0.5

-0.4*

-0.8*

1.2*

2.2*

0.2*

-0.8*

2.4*

2.1*

Iceland coast

0.1*

0.0*

1.0

1.0

0.1

0.1

0.7*

1.2*

0.4*

-0.6*

2.1*

2.3*

Norway coast

0.0*

0.1*

0.9

0.9

-0.1*

-0.4*

0.8*

1.3*

-0.1*

-0.2*

2.0*

2.3*

Iceland fjords

0.0

0.0

0.5

0.5

-0.6*

-1.0*

1.1*

1.6*

0.6*

-0.1*

2.4*

2.5*

Norway fjords

0.0

0.0

0.3

0.3

-0.2*

-0.7*

0.9*

1.4*

-0.2*

-1.0*

1.9

1.9

Greenl. GCNET

NA

NA

NA

NA

-0.4*

-1.2*

1.4*

3.6*

0.7*

1.3*

2.2*

2.4*

Iceland inland

0.1

-0.1

0.5

0.5

-0.4*

-0.6*

1.0*

1.8*

-0.1*

-1.2*

2.0*

2.1*

Norway inland

0.1

0.1

0.4*

0.5*

-0.5*

-1.1*

1.2*

1.8*

0.5*

0.0*

1.6*

1.4*

Iceland higl.

0.2*

0.1*

0.8*

1.0*

-0.1*

0.8*

0.8*

1.6*

-0.5*

-2.0*

2.1*

2.5*

Scand. mount.

NA

NA

NA

NA

1.0*

2.6*

2.1*

2.2*

-1.4*

-2.9*

3.2*

2.8*

Table 3.2: Similar to Table 3.1, but verification is valid for winter 2014/2015 (December, January and February). 

Verification Decemer 2014 January, February 2015

MSLP

2m air temperature

10m wind speed

BIAS

STD

BIAS

SETD

BIAS

STD

Reanalysis:

CA

ER

CA

ER

CA

ER

CA

ER

CA

ER

CA

ER

Svalbard

0.2*

0.1*

0.7*

0.8*

0.0*

-0.1*

1.5*

2.8*

0.6*

-1.4*

3.2

3.3

Greenl. coast:

0.2*

0.6*

1.0

0.9*

0.3*

-1.2*

1.2*

2.5*

1.0*

-1.4*

3.7*

3.4*

Iceland coast:

0.2*

0.3*

1.3

1.3

0.4*

-0.2*

0.9*

2.0*

1.0*

-1.2*

3.3

3.3

Norway coast:

0.1*

0.2*

0.4*

0.5*

0.2*

-0.5*

0.9*

2.1*

0.2*

-1.0*

2.9*

3.4*

Iceland fjords

0.0*

0.2*

0.9*

0.7*

-0.5*

-1.9*

1.2*

2.3*

2.1*

-1.0*

3.7

3.8

Norway fjords

0.1

0.1

1.6

1.6

-0.1*

-2.4*

1.2*

2.9*

0.7*

-1.6*

2.8*

2.4*

Greenl. GCNET

NA

NA

NA

NA

-0.7*

1.2*

3.9*

7.0*

1.1*

1.8*

2.4

2.6

Iceland inland

-0.2

-0.2

0.8*

1.0*

-0.2*

-0.7*

1.2

2.4*

0.4*

-2.2*

3.3

3.2

Norway inland

0.1*

-0.4*

0.6*

1.1*

0.0

0.0

2.1*

3.8*

1.2*

0.3*

2.5*

2.2*

Icel. higland

0.2*

0.1*

1.1*

1.5*

-0.1*

1.1*

1.2*

2.4*

0.4*

-2.2*

3.3

3.2

Sca. mountain

NA

NA

NA

NA

0.2*

0.6*

2.7*

4.0*

-1.6*

-4.7*

4.6*

4.0*

Figure 3.2.5: Time series of Equitable Threat Score, the monthly observed 90%-tile is used as threshold, for a) CARRA-East domain (thresholds varies from ~ 8 m/s in summer to ~ 12 m/s in winter) and b) CARRA-West domain (thresholds varies from ~ 8 m/s in summer to ~ 15 m/s in winter). CARRA in red and ERA5 in blue (including 95th percentile confidence interval by bootstrapping).

Figure 3.2.6: Variograms MSLP, T2m and SW10m for June, July and August 2015 in CARRA- West. The variogram shows spatial correlation between sites in observations and forecasts. Correlation as a function of distance between observation sites is calculated, and the average over stations with similar distances are plotted.

In addition to verification statistics with synoptic network data, behavior of CARRA reanalysis during a number of high impact weather events for different seasons have been examined to investigate added values for such cases in comparison to ERA5, with in general positive conclusions. The test and verification report (Yang et al, 2020) documents these investigations.

4. Summary and conclusions

The CARRA system has been developed to perform a regional reanalysis for the European Arctic regions with a 2.5 km grid resolution. The CARRA data sets will cover the 24-years period between July 1997 and June 2021. The reanalysis is performed on two domains covering Greenland and Iceland on the west side, and Svalbard and Northern Scandinavia on the east side.

The CARRA reanalysis system uses the state of the art operational weather forecast model HARMONIE-AROME 40h1.1.1 as system baseline, with numerous adaptations, extensions and additions in order to facilitate production of a high quality and consistent reanalysis data set covering a multi-decadal period. The reanalysis uses 3-hourly cycled data assimilation with 3DVAR upper air assimilation and OI based surface assimilation, with 30h integration at 00 and 12 UTC. Extensive efforts have been devoted to test and implement a variety of features as low hanging fruits, harvesting from the recent development by the research communities behind the HARMONIE-AROME system. The reanalysis project also necessitated collection, processing and quality assurance of input data covering the whole reanalysis period, including observations, physiographic databases and lateral boundary data deemed suitable and beneficial for the purpose. In particular, collection of substantial amount of non-GTS surface observation data in the extremely data sparse arctic region, especially those over permafrost regions, ensures significant added value over other re-analyses products. In addition, some tuning or development was considered important to improve the system performance, especially to overcome known deficiencies. Among these, adaptations were made to utilize correction and improvement on databases for surface orography and physiographic features, the reprocessed remote sensing data, and non-GTS local surface observations from national weather service archives and other relevant data sources. CARRA uses the latest available hourly C3S global reanalysis (ERA5) as lateral boundaries. For the reanalysis production, an adaptation and enhancement of the HARMONIE-AROME monitoring and verification system has been made.

For configuration of the CARRA reanalysis, it was considered beneficial to inherit as much as possible of the basic model system infrastructure from the operational NWP setup at DMI and IMO for the CARRA-West domain and the AROME-Arctic setup at MET for the CARRA-East domain, in order to minimize the need for development and tuning. Harmonization has been made to unify the configurations for the two domains, including use of same spectral truncation with quadratic grid-type, a unified orographic and physiographic database (PGD), same assimilation options in data use and algorithm, same forecast model options etc. For the overlapped areas (Svalbard and Jan Mayen Islands), verification results indicated comparable results. For consistency, in view of the fact that the overlapped areas lie in the downstream of the CARRA-west domains and close to the eastern boundary, users are advised to use model data from CARRA East.

Arctic regions contains extended areas with permafrost and ice surfaces in Greenland, Iceland and Svalbard, which are not well described in the present version of HARMONIE-AROME. For this reanalysis, care has been taken to improve representation in the model of the surface properties involving glacier surfaces through the use of satellite derived glacier albedo information. This appears to be crucial to ensure the usefulness of the surface influenced variables of the data set for climate applications.

For data assimilation, the Arctic region is a particularly challenging area with rather sparse surface observation network. Extensive work has been done to select optimal method to derive structure functions suitable for CARRA, in which ERA5 data is used as lateral boundary. After careful inter-comparison, CARRA adopts the "BRAND" approach to derive, in an iterative manner, the structure functions for use in the 3DVAR data assimilation. In addition, a simple time interpolation has been made to use daily varying structure functions by interpolating between structure functions for summer and winter, so as to take into account the seasonal changes with the dominant flow patterns.
In regional NWP with HARMONIE-AROME, it has been shown to be beneficial to have a scheme to take benefit of large scale information from the host global model (in this case ERA5) providing lateral boundary conditions also inside the domain, a so-called large scale constraint. For 3DVAR data assimilation in upper air in CARRA, the large scale mixing "LSMIXBD" option which explicitly use large scale information through spectral nudging in the background term, (large scale mixing, "LSMIX"), is selected.

For the CARRA system an enhanced orographic data based on ArcDEM, a physiographic data base (PGD) including numerous corrections on coastline, glacier extent, surface cover type, soil database (with SoilGrids) and albedo over glacier and sea ice, has been used. For sea surface, reprocessed high resolution ocean and sea ice surface data from ESA-CCI and OSISAF are used as main products. In view of the sparse conventional observation network in the Arctic region, particular attention has been devoted to the collection and quality assurance of additional surface observations from multiple local sources, and in addition, a range of remote sensing data with radiances, radio occultation, scatterometer and polar AMV wind data, are included. From impact validation studies, these observation data enhancement, especially those surface observation data, appear to have a clear overall benefit to the quality of the reanalysis.

The CARRA reanalysis project is scheduled to last 4 years from September 2017 to August 2021. The first 18 months was a preparation phase, during which the reanalysis system was built up, with input data prepared? This is followed by the production phase, in which data assimilation and short range forecasts are carried out at the ECMWF High Performance Computation Facility (HPCF), with final results post-processed to feed into the Climate Data Store (CDS) for public access. At present the CARRA production is on-going, with production carried through in parallel along 5 (West domain) and 3 streams (East domain), each of them covering different time slices of the 24-year period.

The CARRA reanalysis products consists of gridded model data and assimilated observation database (ODB). These will be open to the public via Copernicus Climate Data Store (CDS). CARRA model data includes gridded analysis datasets for selected atmospheric states every 3 hours, and short range forecast every 3h up to 30h each day starting at 00 and 12 UTC. For details, see the CARRA user documentation in Nielsen et al 2020.

5. References


Batrak, Y., Kourzeneva, E., and Homleid, M., 2018: Implementation of a simple thermodynamic sea ice scheme, SICE version 1.0-38h1, within the ALADIN–HIRLAM numerical weather prediction system version 38h1. Geosci. Model Dev., doi:10.5194/gmd-11-3347-2018, 11, 3347–3368.

Bengtsson, L., U. Andrae, T. Aspelien, Y. Batrak, J. Calvo, W. de Rooy, E. Gleeson, B. Hansen-Sass, M. Homleid, M. Hortal, K. Ivarsson, G. Lenderink, S. Niemelä, K.P. Nielsen, J. Onvlee, L. Rontu, P. Samuelsson, D.S. Muñoz, A. Subias, S. Tijm, V. Toll, X. Yang, and M.Ø. Køltzow, 2017: The HARMONIE–AROME Model Configuration in the ALADIN–HIRLAM NWP System. Mon. Wea. Rev., 145, 1919–1935, doi:10.1175/MWR-D-16-0417.1

Bojarova, J. and Nils Gustafsson, N.,  2019 : Relevance of climatological background error statistics for mesoscale data assimilation, Tellus A: Dynamic Meteorology and Oceanography, 71:1, 1-22, DOI: 10.1080/16000870.2019.1615168.

Bojarova, J., X. Yang, M. Dahlbom, H. Schyberg, O. Vignes, 2019: Analysis of approaches for large scale information inclusion and decision for method. Copernicus Climate Change Service contract 2018/C3S_D322_Lot2_METNorway/SC1 deliverable report no C3S_D322_Lot2.1.1.1.

Bormann, N., C. Lupu, A. Gear, H. Lawrence, P. Weston, and S. English, 2017: Assessment of the forecast impact of surface-sensitive microwave radiances over land and sea-ice. Technical Memorandum 804, ECMWF, Reading, UK.

Box., J. E., D. van As, and K. Steffen, 2017: Greenland, Canadian and Icelandic land ice albedo grids (2000-2016), Geological Survey of Denmark and Greenland Bulletin, 38, 53-56.

Brasnett, B., 2008: The impact of satellite retrievals in a global sea-surface-temperature analysis, Q. J. R. Meteorol. Soc., 134(636), 1745–1760, doi:10.1002/qj.319.

Brun, E., David, P., Sudul, M., and Brunot, G., 1992: A numerical model to simulate snow-cover stratigraphy for operational avalanche forecasting. Journal of Glaciology, 38(128), 13-22.

CMC, 2012: GHRSST Level 4 CMC0.2deg Global Foundation Sea Surface Temperature Analysis (GDS version 2), doi:10.5067/GHCMC-4FM02.

Danielson, J.J. and Gesch, D.B., 2011. Global multi-resolution terrain elevation data 2010 (GMTED2010) (No. 2011-1073), doi:10.3133/ofr2011107. US Geological Survey, Reston, VA 20192, USA.

Donlon, C. J., Martin, M., Stark, J., Roberts-Jones, J., Fiedler, E. and Wimmer, W., 2012: The Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA) system, Remote Sens. Environ., 116, 140–158, doi:10.1016/j.rse.2010.10.017.

Douville, H., Royer, J.-F., and Mahfouf, J.-F., 1995: A new snow parameterization for the French community climate model. Part I: Validation in stand-alone experiments, Clim. Dynam., doi:10.1007/BF00208760, 12, 21–35.

Faroux, S., A. T. Kaptué Tchuenté, J.-L. Roujean,  V. Masson, E. Martin, and P. Le Moigne, 2013: ECOCLIMAP-II/Europe: A twofold database of ecosystems and surface parameters at 1 km resolution based on satellite information Ocean Science for use in land surface, meteorological and climate models. Geosci. Model Dev., doi:10.5194/gmd-6-563-2013, 6, 563–582.

Gleisner, H., Lauritsen, K. B., Nielsen, J. K., and Syndergaard, S.: Evaluation of the 15-year ROM SAF monthly mean GPS radio occultation climate data record, Atmos. Meas. Tech. Discuss., https://doi.org/10.5194/amt-2019-417, in review, 2019.

https://www.atmos-meas-tech-discuss.net/amt-2019-417/

Haklay, M. and Weber, P., 2008. OpenStreetmap: User-generated street maps. IEEE Pervasive Computing, doi:10.1109/MPRV.2008.80, 7(4), 12-18.

Haimberger, L., C. Tavolato. and S. Sperka (2012) Homogenization of the global radiosonde temperature dataset through combined comparison with reanalysis background series and neighboring stations. J. Climate 25, 8108-8131

Haimberger, L. et. al. (2017) Bias adjustments for global radiosonde data. ERA-CLIM2 deliverable 4.2http://www.era-clim2.eu/products

Hengl, T., de Jesus, J.M., Heuvelink, G.B., Gonzalez, M.R., Kilibarda, M., Blagotić, A., Shangguan, W., Wright, M.N., Geng, X., Bauer-Marschallinger, B. and Guevara, M.A., 2017. SoilGrids250m: Global gridded soil information based on machine learning. PLoS one, doi:10.1371/journal.pone.0169748, 12(2), p.e0169748.

Hirano, A., Welch, R. and Lang, H., 2003. Mapping from ASTER stereo image data: DEM validation and accuracy assessment. ISPRS Journal of Photogrammetry and remote sensing, doi:10.1016/S0924-2716(02)00164-8, 57(5-6), 356-370.

Homleid, Mariken and Mari Anne Killie, 2013: HARMONIE snow analysis experiments with additional observations, MET report 06/2013, available on https://www.met.no/publikasjoner/met-report/met-report-2013.

Howat, I.M., Negrete, A., and Smith, B.E., 2014: The Greenland Ice Mapping Project (GIMP) land classification and surface elevation data sets. The Cryosphere, doi:10.5194/tc-8-1509-2014, 8(4), 1509-1518.

Høyer, J. L., 2016: Product user Manual For Baltic Sea SST Level 4 Product SST_BAL_SST_L4_NRT_OBSERVATIONS_010_007_b, CMEMS user manual: [online] Available from: http://cmems-resources.cls.fr/documents/PUM/CMEMS-OSI-PUM-010-007-b.pdf (Accessed 10 September 2017).

Høyer, J. L. and Karagali, I., 2016: Sea Surface Temperature Climate Data Record for the North Sea and Baltic Sea, J. Clim., 29(7), 2529–2541, doi:10.1175/JCLI-D-15-0663.1.

Karbou, F., E. Gérard, and F. Rabier, 2010a: Global 4D-Var assimilation and forecast experiments using AMSU observations over land. Part-I: Impact of various land surface emissivity parameterizations, Weather and Forecasting, 25, 5-19.

Karbou, F., F. Rabier, J-P. Lafore, J-L. Redelsperger, O. Bock, 2010b: Global 4D-Var assimilation and forecast experiments using AMSU observations over land. Part II: Impact of assimilating surface sensitive channels on the African Monsoon during AMMA, Weather and Forecasting, 25, 20-36.

Killie, Mari Anne, Steinar Eastwood, Atle M. Sørensen, 2018: Improvement and Validation of the Optical Component Used for Snow Mapping in CryoClim, Sentinel4CryoClim Phase 2 Project Report. METreport no 12/2018. Available on https://www.met.no/publikasjoner/met-report/met-report-2018.

Kokhanovsky, A., Lamare, M., Danne, O., Dumont, M., Brockmann, C., Picard, G., Arnaud, L., Favier, V., Jourdain, B., Lemeur, E., Di Mauro, B., Aoki, T., Niwano, M., Rozanov, V., Korkin, S., Kipfstuhl, S., Freitag, J., Hoerhold, M., Zuh, A., Vladimirova, D., Faber, A.-K., Steen-Larsen, H.-C., Wahl, S., Andresen, J., Vandecrux, B., van As, D., Mankoff, K., Kern, M., Zege, E., and Box, J., 2019: Retrieval of snow properties from the Sentinel-3 Ocean and Land Colour Instrument, Remote Sens, doi:10.3390/rs11192280, 11(19), 2280.

Køltzow, M., B. Casati, E. Bazile, T. Haiden, and T. Valkonen, 2019: An NWP Model Intercomparison of Surface Weather Parameters in the European Arctic during the Year of Polar Prediction Special Observing Period Northern Hemisphere 1. Wea. Forecasting, 34, 959–983, https://doi.org/10.1175/WAF-D-19-0003.1

Krieger, G., Moreira, A., Fiedler, H., Hajnsek, I., Werner, M., Younis, M. and Zink, M., 2007. TanDEM-X: A satellite formation for high-resolution SAR interferometry. IEEE Transactions on Geoscience and Remote Sensing, doi:10.1109/TGRS.2007.900693, 45(11), 3317-3341.

Langen, P.L., Fausto, R.S., Vandecrux, B., Mottram, R.H., and Box, J.E., 2017: Liquid water flow and retention on the Greenland ice sheet in the regional climate model HIRHAM5: Local and large-scale impacts. Frontiers in Earth Science, doi:10.3389/feart.2016.00110, 4, 110.

Lavergne, T., Sørensen, A. M., Kern, S., Tonboe, R., Notz, D., Aaboe, S., Bell, L., Dybkjær, G., Eastwood, S., Gabarro, C., Heygster, G., Killie, M. A., Brandt Kreiner, M., Lavelle, J., Saldo, R., Sandven, S. and Pedersen, L. T., 2019: Version 2 of the EUMETSAT OSI SAF and ESA CCI sea-ice concentration climate data records, The Cryosphere, 13(1), 49–78, doi:10.5194/tc-13-49-2019.

Marzban, C., S. Sandgathe, H. Lyons, and N. Lederer, 2009: Three spatial verification techniques: Cluster analysis, variogram, and optical flow. Wea. Forecasting, 24, 1457–1471, https://doi.org/10.1175/2009WAF2222261.1.

Masson, V., Champeaux, J.-L., Chauvin, F., Meriguet, C., and Lacaze, R., 2003: A Global Database of Land Surface Parameters at 1-km Resolution in Meteorological and Climate Models, J. Climate, doi:10.1175/1520-0442-16.9.1261, 16, 1261–1282.

Masson, Valéry, Le Moigne, Patrick, Martin, Eric, Faroux, Stéphanie, Alias, A, Alkama, Ramdane, Belamari, Sophie, Barbu, Alina, Boone, Aaron, Bouyssel, F., Brousseau, Pierre, Brun, E., Calvet, Jean-Christophe, Carrer, Dominique, Decharme, B., Delire, Christine, Donier, S., Essaouini, K., Gibelin, A., Voldoire, A., 2013: The SURFEXv7.2 land and ocean surface platform for coupled or offline simulation of earth surface variables and fluxes. Geosci. Model Dev., doi:10.5194/gmd-6-929-2013, 6, 929–960.

Merchant, C. J., Embury, O., Roberts-Jones, J., Fiedler, E., Bulgin, C. E., Corlett, G. K., Good, S., McLaren., A., Rayner, N. and Donlon, C., 2016: ESA Sea Surface Temperature Climate Change Initiative (ESA SST CCI): Analysis long term product version 1.1, , doi:10.5285/2262690A-B588-4704-B459-39E05527B59A.

Mottram, R., Nielsen, K. P., Gleeson, E., and Yang, X., 2017: Modelling Glaciers in the HARMONIE-AROME NWP model. Adv. Sci. Res., doi:10.5194/asr-14-323-2017, 14, 323–334.

Müller, M., M. Homleid, K. Ivarsson, M.A. Køltzow, M. Lindskog, K.H. Midtbø, U. Andrae, T. Aspelien, L. Berggren, D. Bjørge, P. Dahlgren, J. Kristiansen, R. Randriamampianina, M. Ridal, and O. Vignes, 2017a: AROME-MetCoOp: A Nordic Convective-Scale Operational Weather Prediction Model. Wea. Forecasting, 32, 609–627, https://doi.org/10.1175/WAF-D-16-0099.1

Müller, M., Y. Batrak, J. Kristiansen, M.A. Køltzow, G. Noer, and A. Korosov, 2017b: Characteristics of a Convective-Scale Weather Forecasting System for the European Arctic. Mon. Wea. Rev., 145, 4771–4787, https://doi.org/10.1175/MWR-D-17-0194.1

Nachtergaele, F., van Velthuizen, H., Verelst, L., Batjes, N.H., Dijkshoorn, K., van Engelen, V.W.P., Fischer, G., Jones, A. and Montanarela, L., 2010. The harmonized world soil database. In Proceedings of the 19th World Congress of Soil Science, Soil Solutions for a Changing World, Brisbane, Australia, 1-6 August 2010 (pp. 34-37).

Nielsen, K. P.,  Yang, X., Agersten, S., Støylen, E. and Schyberg, H., 2019: C3S Arctic regional reanalysis – Data User Guide – Initial version. C3S deliverable report no C3S_D311_Lot2.4.1.1–201910-DataUserGuide-v2. Available at Copernicus Arctic Regional Reanalysis (CARRA): Data User Guide

Noer, G., Ø. Saetra, T. Lien, and Y. Gusdal, 2011: A climatological study of polar lows in the Nordic Seas. Quart. J. Roy. Meteor. Soc., 137, 1762–1772, doi:https://doi.org/10.1002/qj.846.

Norwegian Polar Institute, 2014: Kartdata Svalbard 1:100 000 (S100 Kartdata) / Map Data. Norwegian Polar Institute. doi:10.21334/npolar.2014.645336c7.

OSI SAF (2015): SeaWinds L2 25 km winds data record release 1 - QuikSCAT, EUMETSAT SAF on Ocean and Sea Ice, DOI: 10.15770/EUM_SAF_OSI_0002. http://dx.doi.org/10.15770/EUM_SAF_OSI_0002

OSI SAF (2016): ASCAT L2 12.5 km winds data record release 1 - Metop, EUMETSAT SAF on Ocean and Sea Ice, DOI: 10.15770/EUM_SAF_OSI_0007. http://dx.doi.org/10.15770/EUM_SAF_OSI_0007

OSI SAF (2017): ERS Scatterometer L2 25 km winds data record release 1 - ERS, EUMETSAT SAF on Ocean and Sea Ice, DOI: 10.15770/EUM_SAF_OSI_0009. http://dx.doi.org/10.15770/EUM_SAF_OSI_0009

OSI SAF (2018): OSCAT L2 25 km winds data record release 1 - OceanSat, EUMETSAT SAF on Ocean and Sea Ice, DOI: 10.15770/EUM_SAF_OSI_0010. http://dx.doi.org/10.15770/EUM_SAF_OSI_0010

OSI SAF (2019): ASCAT Coastal Winds at 12.5 km Swath Grid - Metop, EUMETSAT SAF on Ocean and Sea Ice, Collection ID: EO:EUM:DAT:METOP:OSI-104,  https://navigator.eumetsat.int/product/EO:EUM:DAT:METOP:OSI-104

Pauley, P. M., 1998: An example of uncertainty in sea level pressure reduction. Wea. Forecasting, 13, 833–850, https://doi.org/10.1175/1520-0434(1998)013<0833:AEOUIS>2.0.CO;2.

Porter, C., Morin, P., Howat, I., Noh, M.-J., Bates, B., Peterman, K., Keesey, S., Schlenk, M., Gardiner, J., Tomko, K., Willis, M., Kelleher, C., Cloutier, M., Husby, E., Foga, S., Nakamura, H., Platson, M., Wethington, M. Jr., Williamson, C., Bauer, G., Enos, J., Arnold, G., Kramer, W., Becker, P., Doshi, A., D’Souza, C., Cummens, P., Laurier, F., Bojesen, M., 2018: ArcticDEM, doi:10.7910/DVN/OHHUKH, Harvard Dataverse, V1, (accessed 2018-08-15).

Randriamampianina, R.; Iversen, T.; Storto, A., 2011: Exploring the assimilation of IASI radiances in forecasting polar lows. Q. J. R. Meteor. Soc., doi:10.1002/qj.838, 137, 1700–1715.

Randriamampianina, R.; Schyberg, H.; Mile, M. 2019: Observing System Experiments with an Arctic Mesoscale Numerical Weather Prediction Model. Remote Sens., doi:10.3390/rs11080981, 11(8), 981.

RGI Consortium, 2015: Randolph Glacier Inventory – A Dataset of Global Glacier Outlines: Version 5.0: Technical Report, Global Land Ice Measurements from Space, Colorado, USA. Digital Media. doi:10.7265/N5-RGI-50.

Ryan, J. C., Hubbard, A., Stibal, M., and Box, J. E., 2016: Attribution of Greenland’s ablating ice surfaces on ice sheet albedo using unmanned aerial systems, The Cryosphere Discuss., doi:10.5194/tc-2016-204.

Samuelsson, P., Homleid, M., Aspelien, T., Andrae, U., 2018: Two patches in cy40h HARMONIE-AROME and modified tree height and snow roughness length for the MetCoOp domain, Aladin-Hirlam Newsletter, 10, 107-115.

Simard, M., Pinto, N., Fisher, J. B., Baccini, A., 2011: Mapping forest canopy height globally with spaceborne lidar. J. Geophys. Res.: Biogeosci., doi:10.1029/2011JG001708, 116 (G4).

Solberg, Rune; Rudjord, Øystein; Salberg, Arnt Børre; Killie, Mari Anne; Eastwood, Steinar; Breivik, Lars-Anders, 2017: Advancement of global snow mapping in CryoClim, Sentinel4CryoClim Phase 1, Deliverables 1-6. Norsk Regnesentral, Oslo. NR-notat SAMBA/10/17. pp 80.

Spengler, T., C. Claud, and G. Heinemann, 2017: Polar Low Workshop summary. Bull. Amer. Meteor. Soc., 98, ES139–ES142, https://doi.org/10.1175/BAMS-D-16-0207.1.

SURFEX, 2016: Version documentation of SURFEX v7.3, https://www.umr-cnrm.fr/surfex (accessed 2019-09-26).

Taillefer, F.: CANARI - Technical Documentation - Based on ARPEGE cycle CY25T1 (AL25T1for ALADIN). Available at http://www.cnrm.meteo.fr/aladin/, 2002.

Tonboe, R. T., Eastwood, S., Lavergne, T., Sørensen, A. M., Rathmann, N., Dybkjær, G., Pedersen, L. T., Høyer, J. L. and Kern, S., 2016: The EUMETSAT sea ice concentration climate data record, The Cryosphere, 10(5), 2275–2290, doi:10.5194/tc-10-2275-2016.

Toudal Pedersen, L., Dybkjær, G., Eastwood, S., Heygster, G., Ivanova, N., Kern, S., Lavergne, T., Saldo, R., Sandven, S., Sørensen, A. and Tonboe, R., 2017: ESA Sea Ice Climate Change Initiative (Sea_Ice_cci): Sea Ice Concentration climate data record from the AMSR-E and AMSR-2 instruments at 25km grid spacing, version 2.0, , doi:10.5285/c61bfe88-873b-44d8-9b0e-6a0ee884ad95.

Valkonen, T., H. Schyberg, and J. Figa-Saldana, 2016: Assimilating Advanced Scatterometer winds in a high-resolution limited area model over northern Europe. IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens., PP (99), 1–12, doi:10.1109/JSTARS.2016.2602889.

Vionnet, V., Brun, E., Morin, S., Boone, A., Faroux, S., Le Moigne, P., Martin, E., and Willemet, J.M., 2012: The detailed snowpack scheme Crocus and its implementation in SURFEX v7. 2. Geosci. Model Dev., doi:10.5194/gmd-5-773-2012, 5, 773–791.

Yang, W., Tan, B., Huang, D., Rautiainen, M., Shabanov, N.V., Wang, Y., Privette, J.L., Huemmrich, K.F., Fensholt, R., Sandholt, I., and Weiss, M., 2006. MODIS leaf area index products: From validation to algorithm improvement. IEEE Transactions on Geoscience and Remote Sensing, 44(7), 1885-1898.

Yang, X. et al, 2020: CARRA report on test and evaluation. C3S deliverable report in preparation. Available at Copernicus Arctic Regional Reanalysis (CARRA): Complete test and verification report on fully configured reanalysis and monitoring system

Yang, X., Palmason, B., Sattler, K., Thorsteinsson, S., Amstrup, B., Dahlbom, M, Hansen Sass, B., Pagh Nielsen, K., Petersen, G. N, 2018:  IGB, the Upgrade to the Joint Operational HARMONIE by DMI and IMO in 2018, ALADIN-HIRLAM Newsletter, No. 11, 93-96, 2018.

Yuan, H., Dai, Y., Xiao, Z., Ji, D. and Shangguan, W., 2011. Reprocessing the MODIS Leaf Area Index products for land surface and climate modelling. Remote Sensing of Environment, doi:10.1016/j.rse.2011.01.001, 115(5), 1171-1187.

6. Appendix: Input data for CARRA

1 Orography and Physiographic Database (PGD)

Orography: Baseline dataset with GMTED2010 in ca 300m resolution + ArcticDem v7.0 over Greenland (2 m) with corrections on gaps using dataset of ArcticDem v6.0 , AsterDem and TanDem X + consistency correction about coast lines in Greenland with land-sea mask data from Danish Map Supply.

PGD data: Harmonie 40h1.1.1 PGD data as baseline (ECOCLIMAP-II, 1 km), with corrections on glacier extent, land-sea masks, new surface cover types for Icelandic surface with volcanic origin, the updated ECOCLIMAP lake database, the Leaf Area Index (LAI) climatology based on MODIS MCD15A2H C6 multi-year mean, tree height correction for Sweden, Norway and Finland.

Soil database (sand/clay): SoilGrids except for Iceland.
Daily gap-filled 500 m resolution albedo data, MOD10A1 C6, for period 2000 to 2019, provided by the Geological Survey of Denmark and Greenland (GEUS) (Box et al. 2017).

2 Lateral boundary

ERA5 4DVAR hourly analyses for key atmospheric parameters (wind, temperature, geopotential height, specific humidity), 31 km resolution.

For deriving structure functions, the ERA5 EDA ensemble data for the first 10 members are used as lateral boundary. 62 km grid resolution.

3 Synoptic and other surface observation

Baseline data from ERA5, plus non-GTS data with

  • national weather services in Denmark, Iceland, Finland, Norway and Sweden
  • surface observation from Greenland ASIAQ, GCnet and PROMICE
  • Quality controlled surface station data over Greenland by DMI
  • Radiosonde data from Summit station in Greenland between 2011 and 2016 (DMI)

4 Ocean and sea ice

  • Specially made high resolution data sets for Sea Surface Temperature (SST) and Sea Ice concentration (SIC), derived from several reprocessed satellite products with both regional (the Baltic Sea) and global coverage. The satellite products are gap-filled Level 4 (L4) interpolated fields, and interpolated onto a 0.05° regular latitude-longitude grid from 56N to 86N and 110W to 90E, covering both model domains in CARRA.

Two different SIC products have been used: the ESA CCI SIC product (SICCI) and the EUMETSAT OSISAF SIC product, OSI-450 , with the former used whenever available (/- 5 days), else the latter is used to fill the gap (/- 5 days).

The baseline SST product (used outside the Baltic region) is the ESA SST CCI Analysis Long Term Product (Merchant et al., 2016), which covers the period 1991 to 2010. From 2011 and onwards the Operational Sea Surface Temperature and Sea Ice Analysis (OSTIA; Donlon et al., 2012) is used for SST input outside the Baltic region.

5 Satellite snow cover data

  • satellite snow extent by the CryoClim project, 5 km resolution based on historical AVHRR GAC data (A2 FCDR)

6 Upper air observations

  • ERA5 data as baseline including those with microwave and infrared radiance, polar AMV
  • Reprocessed scatterometer data by EUMETSAT/OSISAF
  • ROM-SAF reprocessed RO bending angle


This document has been produced in the context of the Copernicus Climate Change Service (C3S).

The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.

The users thereof use the information at their sole risk and liability. For the avoidance of all doubt , the European Commission and the European Centre for Medium - Range Weather Forecasts have no liability in respect of this document, which is merely representing the author's view

7. Related articles