...
Reanalysis is a method for reconstructing past atmospheric states by integrating historical observations with a weather forecasting model. The Copernicus Arctic Regional Reanalysis (CARRA) is a high-resolution climate data product that assimilates an extensive time series of observations with the HARMONIE model and its 3D-Var data assimilation system to provide the most accurate estimate of the atmospheric state. An important requirement for the CARRA reanalysis is the computation of ensemble-based uncertainty estimates for essential climate variables. Numerical models inherently contain various uncertainties and are often run in ensemble mode to enhance forecast accuracy and evaluate uncertainty. During CARRA1, the CARRA team developed an approach that utilizes a limited number of high-resolution ensembles generated over several short time intervals, in conjunction with the derivation of background error statistics (Bojarova et al., 2020). This methodology offers a relatively straightforward estimation of uncertainty for key prognostic variables, employing a scaling method that compares ensemble spread with observation error variances at observation locations. The uncertainty estimates provided are static; however, they do vary with height, season, and between the CARRA-West and CARRA-East domains. The tables specifying the fields for which uncertainties are provided, as well as the data themselves, can be obtained from the Copernicus Arctic Regional Reanalysis (CARRA): Data User Guide (https://confluence.ecmwf.int/display/CKB/Copernicus+Arctic+Regional+Reanalysis+%28CARRA%29%3A+Data+User+Guide, section "What are the uncertainties of the data fields?") and the Copernicus Arctic Regional Reanalysis (CARRA): known issues and uncertainty information documentation page (https://confluence.ecmwf.int/display/CKB/Copernicus+Arctic+Regional+Reanalysis+%28CARRA%29%3A+known+issues+and+uncertainty+information, section "Uncertainty information").
For CARRA2, we face a challenge: the requirement for a high-resolution reanalysis dataset that is extended in both domain and, potentially, time range, while simultaneously providing uncertainty information when an ensemble system is computationally not feasible. Our proposed approach builds on the experience gained during CARRA1 and draws inspiration from the generation of time-varying uncertainty information described in Olesen et al. (2013), where deterministic regional-scale information is supplemented with uncertainty estimates derived from global ensembles for projecting regional climate change. In this work, we again utilize the ensemble dataset generated in connection with the derivation of background error statistics, which is driven by the ERA5-EDA ensemble components. This dataset is employed to produce a coincident ensemble in the limited-area model by introducing perturbations using the "BRAND" field perturbation approach (see section 2.2.2 of Yang et al., 2021, and the CARRA1 system documentation for details on the BRAND approach). The objective is to establish an empirical relationship, in the form of a nonlinear regression or a machine learning model, that predicts the high-resolution regional spread of Essential Climate Variables (ECVs) (a scalar for each ECV) using both the high-resolution deterministic CARRA2 and the low-resolution ERA5 reanalysis EDA components as predictors (the full fields, in an implicit multivariate approach). By taking the high-resolution CARRA2 spread as a proxy for uncertainty, we will be able to predict the CARRA2 uncertainty using the information available during the reanalysis production, even when a corresponding high-resolution ensemble is not available.
The statistical model will be trained on this collected dataset to predict the high-resolution spread in the space of ECVs (as a proxy for uncertainty), which will be used to estimate the reanalysis uncertainty for the entire reanalysis period, including periods outside the time slices used for background error statistics derivation. This method is expected to capture variations in uncertainty with height, the presence of orography, observation network density, and other factors. The uncertainty is computed in model space: we first calculate the ensemble mean and the ensemble uncertainty in terms of standard deviation (SD) from both CARRA2 and ERA5-EDA. This approach also has the potential to provide weather-situation-dependent uncertainties and will offer a more detailed description of the actual variations in uncertainty across space and time than was achievable with the method used in CARRA1.
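To make the first processing step concrete, the sketch below computes the ensemble mean and spread (SD) fields from a stack of ensemble members. It is a minimal example rather than the operational CARRA2 code; the member count and grid dimensions are illustrative.

```python
import numpy as np

def ensemble_mean_and_spread(members: np.ndarray):
    """Compute the ensemble mean and spread (standard deviation).

    members: array of shape (n_members, ny, nx) holding one field
    (e.g. 2 m temperature) for each ensemble member on a common grid.
    """
    ens_mean = members.mean(axis=0)
    # ddof=1: sample standard deviation over the ensemble dimension
    ens_spread = members.std(axis=0, ddof=1)
    return ens_mean, ens_spread

# Illustrative example: 10 EDA-like members on a coarse 114 x 130 grid
rng = np.random.default_rng(0)
members = 270.0 + rng.normal(scale=1.5, size=(10, 114, 130))
mean, spread = ensemble_mean_and_spread(members)
print(mean.shape, spread.shape)  # (114, 130) (114, 130)
```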
...
To train the machine learning model, one of our main data sources is the ECMWF ERA5 reanalysis (Hersbach et al., 2020). The ERA5 reanalysis datasets are generated by continuously integrating observations using 4D-Var data assimilation with the Integrated Forecasting System (IFS) model cycle CY41R2, at 31 km horizontal resolution and with 137 hybrid sigma/pressure model levels in the vertical. The ERA5 dataset includes a ten-member ensemble (EDA) with lower spatial and temporal resolution (approximately 60 km horizontal and 3-hourly) than the main ERA5 product (around 30 km horizontal and hourly). This lower-resolution analysis dataset is used for estimating uncertainty in ERA5. More details on the ERA5-EDA component can be found in the ERA5: data documentation (https://confluence.ecmwf.int/display/CKB/ERA5%3A+data+documentation), which offers a comprehensive overview of the various products and lists all available geophysical parameters. The ERA5-EDA can be downloaded directly from MARS or through the Copernicus Climate Data Store (CDS).
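As an illustration of CDS access, the sketch below uses the cdsapi client to request the 10-member EDA 2 m temperature for a single day. The dataset and request keys follow the CDS ERA5 single-levels catalogue, but the exact names and the chosen date, times, and variable here are assumptions to be checked against the current CDS documentation.

```python
import cdsapi

# Hypothetical request for ERA5-EDA (ensemble members) 2 m temperature;
# verify dataset and key names against the current CDS catalogue.
c = cdsapi.Client()
c.retrieve(
    "reanalysis-era5-single-levels",
    {
        "product_type": "ensemble_members",  # EDA members, not "reanalysis"
        "variable": "2m_temperature",
        "year": "2022",
        "month": "01",
        "day": "20",
        "time": ["00:00", "06:00", "12:00", "18:00"],
        "format": "grib",
    },
    "era5_eda_t2m_20220120.grib",
)
```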
...
Figure 4: The ensemble spread of 2-meter temperature (in Kelvin) for the 10 ensemble members of the ERA5-EDA reanalysis data (left panel) and the CARRA2 ensemble members (right panel), valid on 20 January 2022 for all four analysis cycles.
Table 1: Near-surface input parameters for uncertainty estimation in model space with the machine learning method, for both the ERA5 and CARRA2 ensembles.
Variable | Level | CARRA2 grid points | ERA5 grid points |
2 m temperature (K) | Near surface | Y = 2869; X = 2869 | Y = 114; X = 130 |
10 m zonal wind, u (m/s) | Near surface | Y = 2869; X = 2869 | Y = 114; X = 130 |
10 m meridional wind, v (m/s) | Near surface | Y = 2869; X = 2869 | Y = 114; X = 130 |
Surface pressure (Pa) | Surface | Y = 2869; X = 2869 | Y = 114; X = 130 |
Table 1 lists the near-surface input parameters for uncertainty estimation in model space with the ML approach, for both the ERA5 and CARRA2 ensembles. Precipitation is excluded as a variable in the diffusion-based ML method because the approach requires gridded spread fields computed across ensemble members: when precipitation is zero or missing for some ensemble members, the spread becomes excessively high and unrealistic, so the ML model would likely perform poorly over most of the CARRA2 domain.
...
Bojarova, J., et al. (2020). Uncertainty estimation method. C3S deliverable report C3S_D322_Lot2.1.1.2-202002. https://confluence.ecmwf.int/display/CKB/Copernicus+Arctic+Regional+Reanalysis+%28CARRA%29%3A+Uncertainty+estimation+method
Hersbach, H., Bell, B., Berrisford, P., et al. (2020). The ERA5 global reanalysis. Quarterly Journal of the Royal Meteorological Society, 146, 1999–2049. https://doi.org/10.1002/qj.3803
...
Yang, X., et al. (2020). C3S Arctic regional reanalysis – Full system documentation. C3S deliverable report C3S_D311_Lot2.1.2.2-201910. https://confluence.ecmwf.int/display/CKB/Copernicus+Arctic+Regional+Reanalysis+%28CARRA%29%3A+Full+system+documentation
Appendix 1: Supplementary Figures
...
During the DDPM-ML training process, periodic checkpoints (model states) are saved to monitor model performance and track the gradual improvement in output quality. This strategy helps to identify the optimal training duration while ensuring efficient use of computational resources. At each training step, the weighted mean squared error (MSE) is calculated between the model's predicted noise and the true Gaussian noise added at that step (Figure S4). The MSE curve shows a sharp decrease within the first 1,000 steps, after which the error stabilizes and gradually converges. This behavior indicates that the model effectively learns to minimize noise prediction errors as training progresses. However, to produce high-quality outputs (samples), the model must be trained for up to 20,000 steps. In practice, the most accurate and stable results were typically achieved after around 10,000 steps. This extended training requirement is primarily due to the use of high-resolution, large-domain CARRA2 data, which demand longer training for proper convergence and accurate reconstruction. To preserve model progress, checkpoints are saved at 10,000, 12,000, 14,000 steps, and beyond. Each checkpoint file (e.g., model010000.pt) stores the trained model weights and biases, along with diffusion process hyperparameters used in both the forward and reverse processes. These files ensure reproducibility and allow further fine-tuning or analysis if required.
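The training objective described above can be summarized in a few lines of PyTorch. The sketch below is a minimal, generic DDPM training step assuming the standard noise-prediction parameterization and a precomputed cumulative noise schedule (alphas_cumprod); it illustrates the MSE between predicted and true Gaussian noise and the checkpointing pattern, and is not the CARRA2 training code itself.

```python
import torch
import torch.nn.functional as F

def diffusion_training_step(model, x0, alphas_cumprod):
    """One generic DDPM training step (noise-prediction parameterization).

    model          : network mapping (noisy field, timestep) -> predicted noise
    x0             : clean high-resolution fields, shape (B, C, H, W)
    alphas_cumprod : 1-D tensor of cumulative products of the noise schedule
    """
    b = x0.shape[0]
    # Sample a random diffusion timestep for each field in the batch
    t = torch.randint(0, len(alphas_cumprod), (b,), device=x0.device)
    noise = torch.randn_like(x0)  # true Gaussian noise
    a_bar = alphas_cumprod.to(x0.device)[t].view(b, 1, 1, 1)
    x_t = a_bar.sqrt() * x0 + (1.0 - a_bar).sqrt() * noise  # forward process
    predicted_noise = model(x_t, t)  # reverse-process estimate
    return F.mse_loss(predicted_noise, noise)  # noise-prediction MSE

def save_checkpoint(model, step, diffusion_hyperparams):
    """Save weights plus diffusion hyperparameters, e.g. model010000.pt."""
    torch.save({"model": model.state_dict(),
                "step": step,
                "diffusion_hyperparams": diffusion_hyperparams},
               f"model{step:06d}.pt")
```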
...
Table 1: List of Python Scripts and Job Files
Category / Folder | File Name | Description / Purpose |
Main Scripts | Train_Main.py | Main script for training the diffusion model. |
Evaluation Scripts | evaluate.py | Script for evaluating model performance across datasets (for UQ and each variable). |
 | evaluate_FIELD.py | Field-specific evaluation script, likely used for variable-based assessment (t2m, sp, u10, v10, …). |
Job Submission Scripts | Run_Training.job | Job submission script for launching training on ATOS. |
 | Run_evaluation.job | Job submission script for running evaluation tasks. |
Source Folder: src_diffusion/ | diffusion_dist.py | Handles distributed training setup for parallel computation. |
 | diffusion_fp16.py | Manages mixed-precision (FP16) computation for efficiency. |
 | diffusion_gaussian.py | Implements Gaussian diffusion processes and noise modeling. |
 | diffusion_train.py | Core training logic for the diffusion model. |
 | image_datasets.py | Dataset loader and pre-processing utilities for image inputs. |
 | logger.py | Logging utility for training and evaluation progress. |
 | losses.py | Defines and computes loss functions used during training. |
 | nn.py | Neural network components and layer definitions. |
 | resample.py | Implements resampling strategies in the diffusion process. |
 | respace.py | Defines timestep spacing and schedule adjustment functions. |
 | unet.py | Contains the U-Net model architecture used for diffusion and super-resolution tasks. |
The Python script Train_Main.py will be executed via the batch job Run_Training.job to perform the training process. The model configuration parameters are detailed in Table 2. To monitor error statistics, it is necessary to extract values from the log file, which is located in the same directory as the output files.
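A minimal sketch for pulling error statistics out of the training logs is shown below. It assumes the progress.csv file described in Table 3 and column names such as "step" and "loss"; these depend on the logger configuration and should be checked against the actual file header.

```python
import csv

# Minimal sketch: extract a loss curve from progress.csv (see Table 3).
# Column names ("step", "loss") are assumptions; check the file header.
steps, losses = [], []
with open("progress.csv", newline="") as f:
    for row in csv.DictReader(f):
        steps.append(int(float(row["step"])))
        losses.append(float(row["loss"]))

print(f"last step: {steps[-1]}, last loss: {losses[-1]:.4f}")
```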
Table 2: Summary of the model configuration parameters used for diffusion-based training and generation (evaluation).
Parameter | Description | Value / Setting |
--diffusion_steps | Number of diffusion/denoising iterations each image undergoes during training. | 4000 |
--image_size | Maximum image dimension used during training. | 256 |
--noise_schedule | Type of noise schedule; defines how noise levels change during diffusion. Can be modified during tuning. | linear |
--lr | Learning rate used for model optimization. | 1e-4 |
--batch_size | Number of images processed in each training batch. | 8 |
--microbatch | Subdivision of the batch for memory efficiency; typically set according to available GPU memory. | 4 |
--class_cond | Enables supervised learning by conditioning on class labels. | True |
--steps | Total number of training iterations. | 20,000 |
— | Model checkpoint saving frequency. | Every 2,000 steps |
— | Empirical performance note: model accuracy tends to improve notably after this point (varies with dataset size). | ~10,000 steps |
It is important to note that the ERA5-EDA dataset has a much coarser spatial resolution, with grid dimensions of 130 by 114, whereas the CARRA2 dataset has a significantly finer resolution of 2880 x 2880 grid points. To reconcile these differences and generate outputs at the CARRA2 resolution, a dedicated sampling step was integrated within the training loop of the super-resolution model, a U-Net conditioned on the low-resolution ERA5-EDA input maps. The model contains the essential components for both training and deployment of diffusion-based super-resolution frameworks conditioned on low-resolution inputs: sampling techniques tailored to diffusion training, a U-Net-inspired architecture with residual and attention blocks, cross-attention conditioning, and optional mixed-precision training for computational efficiency.
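A common way to implement this conditioning is sketched below, under the assumption that the coarse field is upsampled and concatenated with the noisy high-resolution input along the channel dimension (the recipe used in standard super-resolution diffusion models; the actual CARRA2 implementation may differ in detail).

```python
import torch
import torch.nn.functional as F

def build_sr_model_input(x_t: torch.Tensor, low_res: torch.Tensor) -> torch.Tensor:
    """Condition the denoising U-Net on the coarse ERA5-EDA field.

    x_t     : noisy high-resolution field, shape (B, C, H, W), e.g. H = W = 2880
    low_res : coarse ERA5-EDA field, shape (B, C, h, w), e.g. h = 114, w = 130

    Upsample the coarse field to the target grid and concatenate it with
    the noisy input along the channel axis, so the U-Net sees the
    low-resolution conditioning information at every denoising step.
    """
    upsampled = F.interpolate(low_res, size=x_t.shape[-2:],
                              mode="bilinear", align_corners=False)
    return torch.cat([x_t, upsampled], dim=1)  # (B, 2C, H, W)
```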
...
Table 3: The key files generated during model training, including checkpoint files, log outputs, and progress tracking data.
File Name | File Type | Description |
log.txt | Log File | Contains training logs, including losses, metrics, and system information. |
progress.csv | Progress File | Records performance metrics over training steps for plotting or analysis. |
model000000.pt | Model Checkpoint | Initial model state before training begins. |
model002000.pt | Model Checkpoint | Saved after 2,000 training steps. |
model004000.pt | Model Checkpoint | Saved after 4,000 training steps. |
model006000.pt | Model Checkpoint | Saved after 6,000 training steps. |
model008000.pt | Model Checkpoint | Saved after 8,000 training steps. |
model010000.pt | Model Checkpoint | Saved after 10,000 training steps. |
model012000.pt | Model Checkpoint | Saved after 12,000 training steps. |
model014000.pt | Model Checkpoint | Saved after 14,000 training steps. |
model016000.pt | Model Checkpoint | Saved after 16,000 training steps. |
model018000.pt | Model Checkpoint | Saved after 18,000 training steps. |
model020000.pt | Model Checkpoint | Final trained model after 20,000 steps. |
d. Diffusion Sampling and Evaluation Overview
...
Table 4: UQ Evaluation Output Files
Category | Files |
Evaluation Type | Uncertainty Quantification |
Logs | rank_0.log, rank_1.log, rank_2.log, rank_3.log |
Model Outputs (SD Evaluations) | UQ_ckpt_model012000.pt.png |
Main Output Image | UQ.png |
Table 5: Field Evaluation Output Files
Category | Files |
Evaluation Type | Field Evaluation |
Logs | rank_0.log, rank_1.log, rank_2.log, rank_3.log |
Model Outputs (SD Evaluations) | FIELD_ckpt_model012000.pt.png |
Main Output Image | TARGET_CARRA2.png |
Figure 2: a) Uncertainty estimation associated with the higher-resolution uncertainty quantification, as depicted in the file UQ.png. b) The corresponding 2-meter temperature (K) field for a single UTC time. More information is available in the deliverable report C3S2_D361a.1.4.1_UncertaintyEstimation_v1.
In summary, the final result of the higher-resolution uncertainty quantification is presented in the file UQ.png, while the file TARGET_CARRA2.png serves as a reference for comparison or evaluation against the actual field data.
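For a quantitative comparison between the predicted and reference fields, a sketch along the following lines could be used; the file names and array layout here are hypothetical, assuming the spread fields have been saved as NumPy arrays rather than only rendered as PNG images.

```python
import numpy as np

# Illustrative comparison of a predicted spread field against the
# CARRA2 ensemble-derived reference; file names are hypothetical.
predicted = np.load("predicted_spread.npy")   # ML-estimated spread field
reference = np.load("carra2_spread.npy")      # ensemble-derived spread field

rmse = np.sqrt(np.mean((predicted - reference) ** 2))
bias = np.mean(predicted - reference)
corr = np.corrcoef(predicted.ravel(), reference.ravel())[0, 1]
print(f"RMSE = {rmse:.3f} K, bias = {bias:.3f} K, corr = {corr:.3f}")
```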
...