
...


History of modifications 

Issue | Date | Description of modification | Author
v1.0 | 16/06/2025 | Final version | C3S





List of datasets covered by this document 

Deliverable ID | Product title | Product type (CDR, ICDR) | C3S Version Number | Public Version Number | Delivery date
 | Climate and energy related variables from the Pan-European Climate Database derived from reanalysis and climate projections v4.2 | CDR | v1.0 | v1.0 | 16/06/2025


Acronyms and abbreviations
Acronym/abbreviation | Definition
AM | Annual Maxima
AOI | Angle Of Incidence
API | Application Programming Interface
AR6 | Sixth Assessment Report
ASCII | American Standard Code for Information Interchange
AWCM | AWI-CM-1-1-MR
BCCS | BCC-CSM2-MR
BHI | Beam Horizontal Irradiance
BIAS | Data that have been bias-adjusted
C3S | Copernicus Climate Change Service
CDFt | Cumulative Distribution Function transfer
CDO | Climate Data Operators
CDS | Climate Data Store
CITY | Level of data aggregation corresponding to specific city coordinates
CMIP6 | Coupled Model Intercomparison Project (sixth phase)
CMR5 | CMCC-CM2-SR5
CSP | Concentrated Solar Power
DHI | Diffuse Horizontal Irradiance
DMP | Data Management Plan
DNI | Direct Normal Irradiance
DTU | Technical University of Denmark
ECE3 | EC-Earth3
ENTSO-E | European Network of Transmission System Operators for Electricity
ERAA | European Resource Adequacy Assessment
ESGF | Earth System Grid Federation
ESRI | Environmental Systems Research Institute
GCM | Global Climate Model
GHI | Global Horizontal Irradiance, corresponding to the surface solar radiation downwards in reanalysis and climate models
GMC | General Climate Model
GPU | Generation Per Unit
GTI | Global Tilted Irradiance
GWA2 | Global Wind Atlas version 2
HOL | Hydropower open-loop pumped storage inflow energy
HP | Hydro Power
HPI | Hydropower run-of-river with pondage inflow energy
HPO | Hydropower run-of-river with pondage generation energy
HPS | Hydro Pumped Storage
HRG | Hydropower reservoirs generation energy
HRI | Hydropower reservoirs inflow energy
HRO | Hydropower run-of-river generation energy
HRR | Hydropower run-of-river inflow energy
HWS | High Wind Speed
IC | Installed Capacity
IPCC | Intergovernmental Panel on Climate Change
Kt | Clearness index
LOYO | Leave-One-Year-Out
MAE | Mean Absolute Error
MEHR | MPI-ESM1-2-HR
MRM2 | MRI-ESM2-0
NDA | Non-Disclosure Agreement
NNSE | Normalized Nash-Sutcliffe Efficiency
NSE | Nash-Sutcliffe Efficiency
NUT0 | Country level of aggregation
NUT2 | Sub Country/Provinces level of aggregation
ORIG | Data that have not been bias-adjusted
PECD | Pan-European Climate Database
PEOF | Pan-European Bidding Zones Offshore level of aggregation
PEON | Pan-European Bidding Zones Onshore level of aggregation
POA | Plane Of Array
PV | Photovoltaic
QGIS | Quantum Geographic Information System
RF | Random Forest
ReGr | Resource Grade
SEDAC | Socioeconomic Data and Applications Center
SFOE | Swiss Federal Office of Energy
SPV | Solar Photovoltaic
SSPs | Shared Socio-economic Pathways
SZA | Solar Zenith Angle
SZOF | Pan-European Zones Offshore level of aggregation
SZON | Pan-European Zones Onshore level of aggregation
TA | 2m temperature
TAW | Population-weighted temperature
TOA | Top Of the Atmosphere
TP | Total precipitation
TSO | Transmission System Operator
UTC | Coordinated Universal Time
VM | Virtual Machine
WMO | World Meteorological Organization
WOF | Wind power offshore
WON | Wind power onshore
WPP | Wind Power Plant
WS10 | 10m wind speed
WS100 | 100m wind speed


Introduction

This document presents the technical methodologies and implementation details of the climate and energy indicators included in the Pan-European Climate Database version 4.2 (PECDv4.2). Developed under the Copernicus Climate Change Service (C3S) Energy service, PECDv4.2 has been produced in close collaboration with the European Network of Transmission System Operators for Electricity (ENTSO-E).

...

Files are provided in both NetCDF and CSV formats. For details on the format used for each variable, refer to Table 2.2, Table 2.12, Table 3.4 and Table 3.6.

Descriptions of file naming conventions can be found in Table 4.1, while Table 4.2 and Table 4.3 detail the ancillary NetCDF datasets available via the "Weights and masks" widget.

Note

Please note: this documentation refers exclusively to PECDv4.2. The previous version, PECDv4.1, has been discontinued and will not be extended beyond 2021, as its datasets were frozen ahead of the 2023 European Resource Adequacy Assessment (ERAA), in agreement with ENTSO-E.

Key updates introduced in PECDv4.2

An overview of all the changes and updates implemented in PECDv4.2 compared to PECDv4.1 can be found on the following page:
Climate and energy related variables from the Pan-European Climate Database versions comparison

Workflows

The workflows form the backbone of the PECDv4.2 system, integrating all key components of the data processing chain. Separate workflows have been developed for the two data streams – historical (Figure 1.1) and projections (Figure 1.2). Each workflow covers the generation of both climate and energy indicators, which serve as the foundation for data production, monitoring, and delivery.

...

Figure 1.1: Workflow for the historical stream. All acronyms used in the workflow are listed in a dedicated section entitled "Acronyms and abbreviations" and located at the beginning of this documentation.

...

Figure 1.2: Workflow for the projection stream. All acronyms used in the workflow are listed in a dedicated section entitled "Acronyms and abbreviations" and located at the beginning of this documentation.

 

Historical stream

Data retrieval

The workflow illustrating the historical stream is shown in Figure 1.1. ERA5 data from the Copernicus Climate Data Store (CDS) is retrieved via the CDS API (Application Programming Interface), which requires prior installation of Python and the CDS API client. Data is downloaded in monthly chunks by specifying the desired variables and time period.
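As an illustration of the monthly chunking described above, the following sketch builds one request dictionary per calendar month. The helper name `monthly_requests` is hypothetical, the request keys follow the commonly used CDS API layout for the `reanalysis-era5-single-levels` dataset, and the output-format key is deliberately omitted because its name depends on the CDS API version; none of this is taken from the PECD production scripts.

```python
import calendar


def monthly_requests(variables, start_year, end_year):
    """Build one CDS API request dictionary per calendar month (hypothetical helper)."""
    requests = []
    for year in range(start_year, end_year + 1):
        for month in range(1, 13):
            # Number of days in this month, accounting for leap years.
            n_days = calendar.monthrange(year, month)[1]
            requests.append({
                "product_type": "reanalysis",
                "variable": list(variables),
                "year": str(year),
                "month": f"{month:02d}",
                "day": [f"{d:02d}" for d in range(1, n_days + 1)],
                "time": [f"{h:02d}:00" for h in range(24)],
            })
    return requests


reqs = monthly_requests(["2m_temperature", "total_precipitation"], 2020, 2020)

# Each request could then be submitted with the CDS API client, for example:
#   import cdsapi
#   cdsapi.Client().retrieve("reanalysis-era5-single-levels", reqs[0], "era5_2020_01.nc")
```

Keeping the request construction separate from the download call makes it possible to inspect or log the requests before any data transfer starts.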

...

The historical stream of PECDv4.2 includes the following climate indicators: 2 m temperature (TA), population-weighted temperature (TAW), total precipitation (TP), surface solar radiation downwards, 10 m wind speed (WS10) and 100 m wind speed (WS100). Detailed descriptions of these indicators are provided in Section 2.6. Notably, the surface solar radiation downwards corresponds to the global horizontal irradiance (GHI) and is downloaded in hourly values in J m⁻² and converted to W m⁻² by dividing by 3600 seconds.
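The unit conversion mentioned above can be sketched in a few lines. The function name is hypothetical and the input values are made up for illustration; the only element taken from the text is the division of hourly accumulated energy (J m⁻²) by 3600 seconds.

```python
SECONDS_PER_HOUR = 3600


def accumulated_to_mean_flux(energy_j_m2):
    """Convert hourly accumulated energy (J m-2) to a mean flux (W m-2)."""
    return energy_j_m2 / SECONDS_PER_HOUR


# Illustrative hourly accumulations in J m-2 (not real ERA5 values).
hourly_energy = [0.0, 1_800_000.0, 3_600_000.0]
ghi = [accumulated_to_mean_flux(value) for value in hourly_energy]
```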

...

This calculation is implemented using a Python script. Additional guidance is available via the CDS documentation, e.g., ERA5: How to calculate wind speed and wind direction from u and v components of the wind?.
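A minimal sketch of the wind speed calculation from the zonal (u) and meridional (v) components is given below. It uses only the Python standard library and a hypothetical function name; the production script is not reproduced in this documentation.

```python
import math


def wind_speed(u, v):
    """Return the wind speed magnitude (m s-1) from the u and v components."""
    return math.hypot(u, v)


# Illustrative components in m s-1.
speed = wind_speed(3.0, -4.0)
```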

The power law for wind vertical extrapolation
 

Wind speed outputs from numerical weather prediction models and climate simulations are typically available at fixed vertical levels, most commonly at 10 m above ground level. For example, the CMIP6 climate projections only provide near-surface wind data. To estimate wind speeds at turbine-relevant heights (e.g., 100 m), vertical extrapolation is necessary. This is achieved using a power law, which expresses wind shear through a dimensionless coefficient known as Alpha (α).

This coefficient enables the conversion of 10 m wind speeds to other heights by accounting for localised vertical wind profiles, as represented in models like ERA5. Temporal variability in wind shear is also considered by stratifying Alpha values by time of day and month, ensuring more accurate height scaling for energy applications.
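The power law can be written as ws(h) = ws(h_ref) · (h / h_ref)^α. The sketch below applies it to a 10 m wind speed; the function name and the example values of Alpha are illustrative assumptions, not values from the PECD Alpha dataset.

```python
def extrapolate_wind_speed(ws_ref, alpha, height=100.0, ref_height=10.0):
    """Scale a wind speed from ref_height to height using the power law."""
    return ws_ref * (height / ref_height) ** alpha


# Illustrative case: 5 m s-1 at 10 m with a shear coefficient of 0.14.
ws100 = extrapolate_wind_speed(5.0, 0.14)

# A negative Alpha yields a lower wind speed at 100 m than at 10 m.
ws100_negative_shear = extrapolate_wind_speed(5.0, -0.1)
```

In practice, the Alpha value passed to such a function would be looked up according to the hour of the day and the month of the timestamp being converted.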

Alpha computation

The Alpha coefficient was derived using ERA5 wind data from the CDS, specifically the zonal (u) and meridional (v) wind components at both 10 m and 100 m heights, as described in Section 2.1.

The data span an 11-year period from 2011 to 2021, at hourly resolution. This time window was selected as it reflects the most recent and reliable observations assimilated in ERA5, and provides a statistically robust basis for calculating vertical wind shear.

...

The result is a set of Alpha values stratified across 24 hourly intervals and 12 months, capturing diurnal and seasonal variations in vertical wind shear. The final Alpha dataset is stored in NetCDF format and made available via the CDS. For more information, please refer to Table 4.2 and Table 4.3.
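The stratification by hour and month can be illustrated as follows. Inverting the power law between 10 m and 100 m gives α = ln(ws100 / ws10) / ln(100 / 10). The sketch assumes that Alpha is derived from the mean wind speeds of each (month, hour) group; this averaging choice, the function names and the sample values are assumptions made for illustration and may differ from the actual PECD computation.

```python
import math
from collections import defaultdict


def alpha_from_pair(ws10, ws100):
    """Invert the power law between 10 m and 100 m."""
    return math.log(ws100 / ws10) / math.log(100.0 / 10.0)


def stratified_alpha(samples):
    """Compute one Alpha per (month, hour) from (month, hour, ws10, ws100) samples.

    Assumption: Alpha is derived from the group-mean wind speeds. The ratio of
    the sums equals the ratio of the means, so sums are accumulated directly.
    """
    sums = defaultdict(lambda: [0.0, 0.0])
    for month, hour, ws10, ws100 in samples:
        sums[(month, hour)][0] += ws10
        sums[(month, hour)][1] += ws100
    return {key: alpha_from_pair(s10, s100) for key, (s10, s100) in sums.items()}


# Illustrative samples: (month, hour UTC, ws10, ws100) in m s-1.
samples = [
    (1, 0, 4.0, 8.0),
    (1, 0, 6.0, 12.0),
    (7, 12, 8.0, 8.0),
    (7, 2, 5.0, 4.0),
]
alpha = stratified_alpha(samples)
```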

Alpha characterization

The diurnal and seasonal variability of the Alpha coefficient across the PECD domain is illustrated in Figure 2.1, which shows the mean Alpha value calculated for each hour and month. These results align with known atmospheric behaviour and prior studies: higher Alpha values occur during the colder, more stable night-time hours, whereas lower values are observed during the daytime, when the atmospheric boundary layer is typically well mixed. Similarly, during winter months, Alpha tends to be higher than in summer, particularly in the central (and warmer) hours of the day.

However, a more nuanced picture emerges when examining the spatial and temporal distribution of Alpha across the domain. Figure 2.2 presents box plots of Alpha values for each hour, aggregated over all grid points in the PECD domain. The plots highlight a wider interquartile range during night-time hours, indicating greater variability in wind shear under stable atmospheric conditions. Notably, the Alpha coefficient can also reach negative values (as low as -0.4), particularly at night, reflecting instances where wind speed decreases with height — a phenomenon associated with specific meteorological conditions.

...

Figure 2.2: Hourly distribution of the Alpha wind shear coefficient across the PECD domain, represented as box plots for each hour (UTC). Boxes show the interquartile range (25th–75th percentile), while whiskers and outliers highlight the spatial variability. Larger spreads at night reflect more variability under stable conditions.

The bias adjustment of the ERA5 wind speed

Bias adjustment refers to the process of statistically transforming climate model data to reduce systematic differences between a simulated climate and a reference dataset, usually based on observations, over the historical period. Bias adjustment has become a standard pre-processing step for climate impact studies to adjust climate model output that will drive application models, such as energy models. This is the case for wind speed, a key variable to derive wind power. Specifically, wind power computation depends non-linearly on wind speed (precisely, on its cube). Therefore, significant biases in wind speed can markedly affect the wind energy indicator.

...

Previous evaluations of ERA5 wind speed showed that ERA5 tends to underestimate the intensity of wind speed in most land areas in Europe, except in the North East, while it overestimates wind speed over the sea, particularly in the North Sea and along certain coastlines, such as Southern Norway or Portugal. For this reason, a bias adjustment of ERA5 wind fields is needed. Compared to PECDv4.1, in PECDv4.2 a new methodology was designed to bias-adjust ERA5 wind speeds using the Global Wind Atlas (Davis et al., 2023) as the reference dataset, which is presented in Section 2.3.1, and by applying the Delta Adjustment method, which is described in Section 2.3.2.

Before applying bias adjustment, preliminary corrections (see Section 2.3.3, pre-processing) are also performed on the ERA5 wind speed dataset to address known issues. The effectiveness of these corrections is monitored using four control boxes located in representative regions, as illustrated in Figure 2.3.

Figure 2.3: Location of the control boxes used to check some known issues in ERA5 wind speed. Blue: France (latitude: 45-47°N; longitude: 5-8°E); magenta: Germany (latitude: 50-53°N; longitude: 6-10°E); orange: Sweden (latitude: 57-61°N; longitude: 13-16°E); green: Finland (latitude: 60.5-63.5°N; longitude: 22.5-26.5°E).


The Global Wind Atlas
The Global Wind Atlas (https://globalwindatlas.info/en/)

...

The Global Wind Atlas dataset is created through a downscaling process that begins with large-scale wind climate data (for example, reanalysis) and ends with microscale wind climate data. The dataset combines information from mesoscale and microscale models, as well as from in situ observational sites, to provide refined and verified estimates of mean wind speed at relevant hub heights and at a high horizontal resolution. The WAsP software (Floors and Nielsen, 2019) performs the downscaling and lastly computes local wind climates every 250 m at five heights (10 m, 50 m, 100 m, 150 m, and 200 m) all over the globe, excluding the North and South Poles and offshore areas beyond 20 km. In PECDv4.2, the Global Wind Atlas version 2 (GWA2), which relies on the ERA-Interim reanalysis as input data, was used to bias-adjust the ERA5 wind speeds.

The Delta Adjustment method

Several bias-adjustment methodologies exist to reduce biases in climate models. The Delta Adjustment method was selected to adjust the ERA5 wind speed at 10 m and 100 m height. This method, one of the simplest and least computationally demanding, applies a constant correction based on the difference between the mean values of the model output (source) and the reference data (target) over a defined historical period (Navarro-Racines et al., 2020). By only accounting for changes in the mean of the quantity of interest, the Delta Adjustment method inherently assumes that the only relevant bias is related to the mean of the distribution. For this reason, the Delta Adjustment is typically used for variables that do not exhibit a strong climate-change-related trend, which is, in general, the case for wind speed.

The Delta Adjustment method was applied to ERA5 wind speeds using the GWA2-derived Delta change factors that correspond, in each grid cell, to the ratio between the mean GWA2 and ERA5 wind speeds over the selected reference period (2006-2018). This scaling ensures that ERA5 wind speed mirrors terrain effects captured by GWA2, while maintaining its spatial-temporal consistency. Specifically, since GWA2 only provides the mean wind speed at each grid cell, the bias adjustment does not modify the diurnal cycle of the original ERA5 data. The resulting bias-adjusted wind speed dataset was then extended backwards and forward to cover the whole ERA5 period, 1950–near present.
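The multiplicative scaling described above can be sketched for a single grid cell as follows. The function names and numerical values are illustrative assumptions; the only elements taken from the text are the definition of the Delta factor as the ratio between the mean GWA2 and mean ERA5 wind speeds, and its application as a constant factor to the hourly series.

```python
def delta_factor(gwa2_mean, era5_mean):
    """Ratio between the reference (GWA2) and source (ERA5) mean wind speeds."""
    return gwa2_mean / era5_mean


def apply_delta(hourly_series, factor):
    """Scale every hourly value by the same constant factor."""
    return [factor * value for value in hourly_series]


# Illustrative values for one grid cell (m s-1).
era5_reference_mean = 5.0   # mean ERA5 wind speed over the reference period
gwa2_reference_mean = 6.0   # mean GWA2 wind speed at the same grid cell
era5_hourly = [4.0, 6.0, 5.0]

factor = delta_factor(gwa2_reference_mean, era5_reference_mean)
era5_adjusted = apply_delta(era5_hourly, factor)
```

Because the factor is constant in time, the relative shape of the diurnal cycle is unchanged: the ratio between any two hourly values is the same before and after the adjustment.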

Bias-adjustment procedure

The bias-adjustment procedure applied to the ERA5 WS10 and WS100 above-ground involves two steps, detailed below and summarised in Figure 2.4 and Figure 2.5.


...

that was fixed in PECD at the grid-point level by re-computing the 10:00 UTC value through linear interpolation between the 9:00 UTC and the 11:00 UTC values (equivalent to a temporal average) using the 'interp' function (method = 'linear') included in the xarray Python library. Considering the four geographical control boxes illustrated in Figure 2.3, Figure 2.6 shows the original and corrected mean diurnal cycles of the 10 m wind speed computed over the period 2009-2018.
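The 10:00 UTC correction can be illustrated with a pure-Python stand-in for the xarray interpolation. Since 10:00 lies exactly halfway between 9:00 and 11:00, linear interpolation reduces to the average of the two neighbouring values. The function name and the sample values are hypothetical.

```python
def fix_10utc(day_values):
    """Replace the 10:00 UTC value with the mean of the 9:00 and 11:00 values.

    day_values: list of 24 hourly values, where the index is the UTC hour.
    """
    fixed = list(day_values)
    fixed[10] = 0.5 * (day_values[9] + day_values[11])
    return fixed


# Illustrative day with an artificial drop at 10:00 UTC (m s-1).
ws10_day = [3.0] * 24
ws10_day[9], ws10_day[10], ws10_day[11] = 4.0, 2.0, 5.0
ws10_fixed = fix_10utc(ws10_day)
```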

...

Figure 2.6: Effect of the correction of the 10:00 UTC drop in WS10 in each of the four geographical control boxes shown in Figure 2.3. Blue line: original dataset, orange line: corrected dataset.

Regarding GWA2, the Global Wind Atlas wind speeds were selected at the same heights as the ERA5 wind speeds (namely, 10 m and 100 m), then averaged over the period 2006-2018, and finally interpolated from their original resolution (250 m) to the ERA5 horizontal resolution (0.25°) using the 'coarsen' function included in the xarray Python library. The resulting NetCDF files, containing the mean wind speed of GWA2 and ERA5 at both 10 m and 100 m, are described in Table 4.3 and are available for download on the CDS.

2) Bias Adjustment: Following the methodology described in Section 2.3.2 and summarised in Figure 2.5, the ERA5 WS10 and WS100 were corrected using GWA2 as the reference (also called target) dataset. 
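The change of resolution applied to GWA2 in the pre-processing step can be pictured as a block aggregation, which is what a coarsening operation does. The sketch below is a pure-Python illustration that assumes a block mean as the reducer (the text does not state which reducer was used) and trims incomplete blocks at the edges; the function name and the grid are hypothetical.

```python
def coarsen_mean(grid, factor):
    """Aggregate a 2-D grid by averaging non-overlapping factor x factor blocks.

    Rows and columns that do not fill a complete block are trimmed.
    """
    n_rows, n_cols = len(grid), len(grid[0])
    coarse = []
    for r in range(0, n_rows - n_rows % factor, factor):
        coarse_row = []
        for c in range(0, n_cols - n_cols % factor, factor):
            block = [
                grid[i][j]
                for i in range(r, r + factor)
                for j in range(c, c + factor)
            ]
            coarse_row.append(sum(block) / len(block))
        coarse.append(coarse_row)
    return coarse


# Illustrative fine-resolution mean wind speeds (m s-1) on a 4 x 4 grid.
fine = [
    [1.0, 3.0, 5.0, 7.0],
    [1.0, 3.0, 5.0, 7.0],
    [2.0, 2.0, 6.0, 6.0],
    [4.0, 4.0, 8.0, 8.0],
]
coarse = coarsen_mean(fine, 2)
```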

Validation of the ERA5 bias-adjusted near-surface wind speeds

Despite the limited availability of long-term and homogeneous wind observations to assess wind fields (Davidson and Millstein, 2022), over Europe the E-OBS dataset

...

offers land-only, station-based, daily means of near-surface (at 10 m above ground) wind speed, at the same horizontal resolution as ERA5 (0.25°) and over the period 1980-2022 (de Baar et al., 2023). The domain of the E-OBS dataset partly overlaps the PECD domain, providing a common area that stretches between the following coordinates: latitudes from 30°N to 72°N, and longitudes from 12°W to 40°E. Using the E-OBS observational gridded dataset as a reference for assessment, the ERA5 bias-adjusted near-surface wind speeds were evaluated.

Figure 2.7 illustrates the spatial distribution of the absolute bias in global means of near-surface (10 m) wind speed computed over the period 1995-2014. The absolute bias corresponds to the difference between the ERA5 reanalysis (before, ERA5_ORIG, and after, ERA5_BA, the bias adjustment) and the E-OBS dataset. The bias adjustment reduces the mean bias between E-OBS and ERA5 wind speeds, with the mean absolute bias moving from 0.55 m s-1 (mean relative bias: 23.76%) to 0.41 m s-1 (19.41%). The effect of bias adjustment is stronger over north-eastern Europe, where the bias reduces by nearly 1 m s-1, with a final bias lower than 0.5 m s-1 (first vs. second plot in Figure 2.7). By contrast, over mountainous regions (for example, the Alps or the Carpathian Mountains) and areas with complex terrain mixing steep slopes and coasts (for example, Norway or the Balkans), the bias increases once wind fields have been bias-adjusted (first vs. second plot in Figure 2.7). Over these regions, the bias in ERA5 bias-adjusted wind speeds shows a similar pattern to the difference between GWA2 and E-OBS, suggesting that over complex terrains, ERA5 inherits the micro-scale information from GWA2 that E-OBS does not provide (second vs. third plot in Figure 2.7).


...

Looking at the temporal correspondence between ERA5 and E-OBS, Figure 2.8 shows the time series of monthly mean wind speeds computed over the period 1995-2014 and over the European domain presented in Figure 2.7. The original ERA5 already captures the temporal variations in wind speed, including the succession of high and low values. The bias adjustment improves the temporal correlation, with the square of Pearson's correlation coefficient (R2) increasing from 0.67 to 0.72

...

, and brings ERA5 closer to E-OBS, with the mean bias decreasing from 0.51 to 0.41 m s-1 (Figure 2.8). Moving to the regional scales, Figure 2.9 shows the time series of monthly means and confirms that the bias adjustment has a stronger effect over north-eastern Europe. Over Germany, the mean bias between E-OBS and ERA5 shows a similar absolute value before and after the bias adjustment (0.17 m s-1 before and -0.19 m s-1 after), while over Finland the mean bias decreases from 0.86 m s-1 to 0.01 m s-1. Moreover, over Germany ERA5 shows a higher temporal correlation with E-OBS (R2 = 0.98) compared to Finland, where some years perform worse than others (R2 = 0.35). This is the case for the year 2010, which is highlighted in yellow on Figure 2.9. For this year, Figure 2.10 shows the tight temporal correspondence between E-OBS and ERA5 daily means over Germany, while some discrepancies appear over Finland.
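The two statistics quoted above can be computed as in the following sketch. It assumes that the mean bias is the average of the differences (model minus reference) and that R2 is the square of Pearson's correlation coefficient between the two time series; the function names and sample values are hypothetical and do not reproduce the figures reported in this section.

```python
def mean_bias(model, reference):
    """Average of the differences between model and reference values."""
    return sum(m - r for m, r in zip(model, reference)) / len(reference)


def r_squared(model, reference):
    """Square of Pearson's correlation coefficient between two series."""
    n = len(reference)
    mean_m = sum(model) / n
    mean_r = sum(reference) / n
    cov = sum((m - mean_m) * (r - mean_r) for m, r in zip(model, reference))
    var_m = sum((m - mean_m) ** 2 for m in model)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    return cov * cov / (var_m * var_r)


# Illustrative monthly mean wind speeds (m s-1).
eobs = [3.0, 4.0, 5.0, 4.0]
era5 = [2.5, 3.5, 4.5, 3.5]

bias = mean_bias(era5, eobs)
r2 = r_squared(era5, eobs)
```

A constant offset between the two series, as in this example, changes the mean bias but leaves R2 unaffected, which is why both statistics are reported side by side.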

...

Figure 2.8: Time series of monthly means of near-surface wind speeds (units: m s-1) computed over the period 1995-2014 and over the European domain illustrated in Figure 2.7. The five solid lines show: (a) the E-OBS dataset (EOBS, green line), (b) the original ERA5 reanalysis (ERA5_ORIG, red), (c) the bias-adjusted ERA5 reanalysis (ERA5_BA, blue), (d) the difference between ERA5_ORIG and EOBS (grey), and (e) the difference between ERA5_BA and EOBS (pink).

Figure 2.9: As Figure 2.8 for two regions located in: (a) Germany (latitudes [50°N-53°N] and longitudes [8°E-12°E], purple box on Figure 2.7; left plot) and (b) Finland (latitudes [64°N-68°N] and longitudes [26°E-30°E]; blue box on Figure 2.7; right plot). The year 2010 is highlighted with a yellow stripe and has been chosen to illustrate the time series of daily means.

...

Figure 2.10: Time series of daily means of near-surface wind speeds (units: m s-1) for the year 2010 computed over two regions located in: (a) Germany (latitudes [50°N-53°N] and longitudes [8°E-12°E], purple box on Figure 2.7; top plot) and (b) Finland (latitudes [64°N-68°N] and longitudes [26°E-30°E]; blue box on Figure 2.7; bottom plot). The five solid lines show: (a) the E-OBS dataset (EOBS, green line), (b) the original ERA5 reanalysis (ERA5_ORIG, red), (c) the bias-adjusted ERA5 reanalysis (ERA5_BA, blue), (d) the difference between ERA5_ORIG and EOBS (grey), and (e) the difference between ERA5_BA and EOBS (pink).
 

Population-weighted Temperature

Population-weighted temperature (TAW) is an important climate indicator included in the PECDv4.2 database. It is particularly relevant for energy conversion and demand modelling, as it provides a temperature metric that reflects the conditions most likely experienced by the population. Rather than averaging temperature uniformly across a region, TAW gives greater weight to areas with higher population density, offering a more realistic estimate of population exposure to temperature variations.

In the PECD framework, TAW is calculated exclusively at the SZON (onshore bidding zones) aggregation level (see Table 2.1 for a full list of spatial aggregation levels and their acronyms). This approach allows for a consistent integration of TAW into energy-related applications, such as forecasting demand peaks during heatwaves or cold spells, assessing vulnerability, or planning adaptive infrastructure and policy interventions.

Population mask

To calculate TAW, a high-resolution population mask is required. For PECDv4.2, gridded population data at 0.25° spatial resolution were sourced from the NASA Socioeconomic Data and Applications Center (SEDAC)

...

The population raster was clipped to the PECD domain and converted to NetCDF format (Figure 2.11), using QGIS-GRASS GIS (Geographic Information System, Open-Source Geospatial Foundation Project

...

). Sea and ocean areas were assigned missing values in accordance with the ESRI ASCII specification. The resulting NetCDF population mask is used throughout the modelling chain and is available for download via the CDS (please refer to Table 4.2 and Table 4.3 for more details).


...

Figure 2.11: Population distribution across the PECD domain based on NASA SEDAC data (2020), mapped at 0.25° resolution. Values represent the number of inhabitants per grid cell.


Computation of Population-weighted temperature

TAW [°C] is computed by applying the population mask to the gridded TA, both at 0.25° resolution. The calculation is carried out independently for each onshore bidding zone (referring to the aggregation level SZON, as detailed in Table 2.1), using the following equation:

...

 is the population in the i-th grid cell of zone z, and n is the number of grid cells in the zone. This results in a weighted average temperature for each zone, reflecting human exposure rather than geographic extent alone.
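A population-weighted average of this kind can be sketched as follows. The sketch assumes the standard weighted-mean form, in which the temperature of each grid cell is multiplied by its population and the sum is divided by the total population of the zone. Grid cells without population data (for example, sea cells with missing values) are skipped. The function name and sample values are hypothetical.

```python
def population_weighted_temperature(temperatures, populations):
    """Weighted mean temperature of one zone, using population as the weight.

    Cells whose population is None (missing value) are ignored.
    """
    weighted_sum = 0.0
    total_population = 0.0
    for temperature, population in zip(temperatures, populations):
        if population is None:
            continue
        weighted_sum += temperature * population
        total_population += population
    return weighted_sum / total_population


# Illustrative zone with three grid cells: temperatures in °C, population in inhabitants.
ta_cells = [10.0, 20.0, 5.0]
pop_cells = [1000.0, 3000.0, None]
taw = population_weighted_temperature(ta_cells, pop_cells)
```

In this example the densely populated cell dominates the result, so TAW is closer to 20 °C than the unweighted mean of the populated cells (15 °C).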

Figure 2.12 shows the difference between the mean TAW and the mean TA over the period 1991-2020 across the SZON regions.

...

Figure 2.12: Difference between the mean TAW and TA over the climatology 1991-2020 for SZON regions.

Spatial aggregation

Spatial aggregation is the procedure used to compute regionally averaged indicators from gridded climate and energy data. It enables the transformation of high-resolution outputs into meaningful statistics for specific administrative or market-related regions, such as countries, provinces, or bidding zones. This process is systematically applied to all gridded indicators in PECD to produce corresponding aggregated versions.

Note

Please note that on CDS, the sub-region selection is only available for gridded datasets. When downloading aggregated time series from CDS, the sub-regional extraction is not supported.

Required spatial aggregation level for PECDv4.2

The PECD database supports multiple levels of spatial aggregation, depending on the needs of climate and energy modelling. Table 2.1 below summarises these levels, along with their codes and source definitions.

...

and pan-European regions, official shapefiles were provided by ENTSO-E. Figure 2.13 shows some of the shapefiles used to create the masks.

...

Code | Description of the aggregation level | Source
ORIG | Not aggregated | Gridded data
BIAS | Not aggregated | Gridded data bias-adjusted (see Section 2.3)
NUT0 | Country | NUTS0+ADMIN0
NUT2 | Sub Country/Provinces | NUTS2+ADMIN1
SZON | Onshore Bidding Zones | Shapefile provided by ENTSO-E*
SZOF | Offshore Bidding Zones | Shapefile provided by ENTSO-E*
PEON | Pan-European Onshore Zones | Shapefile provided by ENTSO-E*
PEOF** | Pan-European Offshore Zones | Shapefile provided by ENTSO-E*
CITY | Not aggregated - List of selected cities (only for TA) | List provided by ENTSO-E

*These shapefiles are not publicly available, but the corresponding NetCDF masks are provided in the CDS under the widget "Weights and masks". Please see Table 4.2 and Table 4.3 for more details.

**In PECDv4.2, the PEOF zones were updated from previous versions by considering a new version of the shapefile (2024/09/19).

...

Figure 2.13: Example of original polygon geometries used to derive float masks for spatial aggregation.



Generation of Region Masks for Spatial Aggregation

To perform spatial aggregation, floating-point NetCDF masks were generated from the shapefiles listed in Table 2.1. One mask was created for each aggregation level, resulting in six region masks: NUT0, NUT2, PEON, PEOF, SZON, and SZOF.

...

These masks allow for accurate area-weighted aggregation, especially near borders and coastlines. An example for Italy (NUT0 level) is shown in Figure 2.14.

All regional masks are available for download from the CDS under the widget “Weights and masks”. Additional details about filenames and structure can be found in Table 4.2 and Table 4.3.


...

Figure 2.14: Example of a float mask, for the Italian NUT0 administrative region, showing the fractions of land around the border and coastlines.

Spatial Aggregation Procedure
The spatial aggregation of climate and energy indicators is implemented via a Python-based tool, following this workflow:

    1. Input loading:

      • Load the NetCDF file containing the variable(s) to be aggregated.

      • Load the corresponding region mask (NetCDF format).

    2. Grid iteration:

      • Iterate over the spatial coordinates defined in the region mask.

      • For each region:

        • Apply the region mask to the gridded data, applying a cosine latitude weighting to account for spatial distortion (see Table 4.2 and Table 4.3 for more details).

        • Compute the weighted average over the masked area.

    3. Result formatting:
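The weighted average at the core of the workflow above can be sketched for a single region and a single time step. The sketch assumes that the weight of each grid cell is the product of its float-mask fraction and the cosine of its latitude; the function name, grid and mask values are hypothetical and the actual PECD tool is not reproduced here.

```python
import math


def aggregate_region(values, mask, latitudes):
    """Area-weighted mean of a gridded field over one region.

    values:    2-D list [lat][lon] with the gridded indicator
    mask:      2-D list [lat][lon] with the fraction of each cell in the region (0 to 1)
    latitudes: list of cell-centre latitudes in degrees, one per row
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for i, latitude in enumerate(latitudes):
        # Cosine-latitude weighting accounts for cells shrinking towards the poles.
        lat_weight = math.cos(math.radians(latitude))
        for j, fraction in enumerate(mask[i]):
            weight = fraction * lat_weight
            weighted_sum += weight * values[i][j]
            weight_total += weight
    return weighted_sum / weight_total


# Illustrative 2 x 2 grid: the second column lies only half inside the region.
field = [[10.0, 10.0], [20.0, 20.0]]
region_mask = [[1.0, 0.5], [1.0, 0.5]]
lats = [0.0, 60.0]
regional_mean = aggregate_region(field, region_mask, lats)
```

With these values the row at 60°N receives half the weight of the row at the equator, so the regional mean is pulled towards the equatorial value.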

Climate indicators

This section describes the climate indicators provided in PECDv4.2 for the historical stream. These indicators are derived from the ERA5 reanalysis and are used as inputs for energy modelling and climate analysis across the Pan-European domain.

...

The indicators are available as both gridded products (NetCDF format, spatial resolution of 0.25° × 0.25°) and spatially aggregated time series (CSV format), depending on the level of aggregation.

Table 2.2 summarises the available climate indicators, including their temporal coverage, data source, domain and spatial resolution, temporal resolution, aggregation levels (see Table 2.1), and units. Notably, PECDv4.2 introduces a new reference level — CITY — which provides temperature time-series for a predefined list of cities (see Section 2.5).


Table 2.2: Climate indicators provided in the PECDv4.2 for the historical stream.
Gridded data (ORIG and BIAS levels) are provided in NetCDF format. All other aggregation levels are delivered in CSV format. Changes that were implemented in PECDv4.2 are highlighted in bold (extended time period and the CITY level).

Variable | Period | Source | Domain / Spatial Resolution | Temporal Resolution | Spatial Aggregation | Units
2m temperature (TA) | 1950 - near present | ERA5 reanalysis | PECD / 0.25° x 0.25° | hourly | ORIG, NUT0, NUT2, SZON, SZOF, PEON, PEOF, CITY | K (gridded); °C (aggregated)
Population-weighted temperature (TAW) | 1950 - near present | ERA5 reanalysis | PECD / 0.25° x 0.25° | hourly | SZON | °C
Total precipitation (TP) | 1950 - near present | ERA5 reanalysis | PECD / 0.25° x 0.25° | hourly | ORIG, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m
Surface solar radiation downwards (GHI) | 1950 - near present | ERA5 reanalysis | PECD / 0.25° x 0.25° | hourly | ORIG, NUT0, NUT2, SZON, SZOF, PEON, PEOF | W m-2
10m wind speed (WS10) | 1950 - near present | ERA5 reanalysis | PECD / 0.25° x 0.25° | hourly | ORIG, BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m s-1
100m wind speed (WS100) | 1950 - near present | ERA5 reanalysis | PECD / 0.25° x 0.25° | hourly | ORIG, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m s-1

Energy data

In collaboration with ENTSO-E, extensive efforts have been made in PECDv4.2 to collect and integrate the widest possible range of energy-related data, which serve both for validating energy models and for training the hydro statistical model. The following sources have been used:

...

5) TSO-provided inflow data: Several Transmission System Operators (TSOs) have provided confidential data under non-disclosure agreements (NDAs). These data include high-resolution generation and storage series, and are not detailed in this documentation.


Exclusion areas

To ensure that wind and solar energy potential is realistically assessed, several exclusion layers have been applied in PECDv4.2. These masks identify areas unsuitable for energy production due to geographical, environmental, or legal constraints.

...

All exclusion layers have been processed into NetCDF format and are available for download in the Climate Data Store (CDS) under the widget "Weights and masks". These include the "Wind power regions mask" and "Solar PV mask" used in PECDv4.2. For details on file naming conventions and characteristics, refer to Table 4.2 and Table 4.3.

Table 2.3 below provides a full overview of each exclusion criterion, including data sources and variable identifiers.

...

Criteria | Description | Source | Variable Name
Protected areas | Identifies legally protected regions, such as national parks or nature reserves. Values are binary: 1 indicates a restricted pixel. | World database on protected areas from the United Nations Environment Programme | prot_a
Polar areas | Identifies polar and subpolar regions based on global land cover classification. Values are binary: 1 indicates a restricted pixel. | Land Cover Classification System from the United Nations Food and Agriculture Organization | polar_a
Urban areas | Flags areas with urban coverage ≥ 45%. Values are binary: 1 indicates a restricted grid cell. | Land Cover Classification System from the United Nations Food and Agriculture Organization | urban_a
Water and continental waters area | Classifies water bodies using a three-value system: 0 = land, 1 = ocean, 2 = inland waters. Used to exclude non-land areas. | ERA5 Land-Sea Mask (ECMWF) | watr_a
High slope area | Identifies steep terrain where slope ≥ 60%. Values are binary: 1 indicates a restricted grid cell. | ETOPO1 Global Relief Model from National Oceanic and Atmospheric Administration (NOAA) | halo_a
High elevation areas | Identifies regions of high altitude. Values are binary: 1 indicates a restricted pixel. | ETOPO1 Global Relief Model from National Oceanic and Atmospheric Administration (NOAA) | hele_a
Distance to shore areas | Identifies areas beyond a given distance from the coastline (used primarily for offshore applications). Values are binary: 1 indicates a restricted pixel. | ERA5 Land-Sea Mask (ECMWF) | dist_s


Energy Conversion Models

This section outlines the methodologies and implementation details of the energy conversion models used in PECDv4.2, which replaces the previous PECDv4.1 version. These models convert meteorological inputs into power generation time series for four technologies: wind power, solar photovoltaic (SPV), concentrated solar power (CSP), and hydropower. The first three are physical models, while the hydropower module is based on a statistical machine learning approach.

For each energy model, we describe the input data sources, the modelling framework, and the calibration and validation methods used.

Wind Power Conversion Model

The wind power conversion model simulates generation at the wind power plant (WPP) level and aggregates results to the regional level. The conversion process differs between existing and future wind power installations, reflecting the evolution of wind technologies over time. Existing installations are modelled based on location, capacity, and technology data from WindPowerNet

...

In PECDv4.2, the model has undergone several updates compared to PECDv4.1, including the use of higher-resolution wind climatology data, a more flexible turbine modelling framework, and improved treatment of unavailability. These improvements enhance the realism and accuracy of the simulated wind generation time series.

Climate Data Handling

Wind speed data is sourced from the ERA5 reanalysis and CMIP6 climate model outputs, provided at 0.25° x 0.25° horizontal resolution and, when available, at two vertical levels: 10 m and 100 m above ground.

...

Interpolation is carried out at each time step and for each wind power plant (WPP) individually.

Wind bias adjustment in PECDv4.2

A key methodological improvement in PECDv4.2 is the use of the Global Wind Atlas version 2 (GWA2) for wind speed bias adjustment, replacing the COSMO-REA6 dataset used in PECDv4.1.

...

This two-step process—interpolating ERA5/projection data and adjusting with GWA2 climatology—produces more realistic site-specific wind speed time series and improves alignment with observed power generation, in line with the findings of Murcia et al. (2022).

Conversion to Wind Power Generation
Existing installations

A power curve is estimated for each wind power plant (WPP) using a surrogate model, as detailed in Simutis et al. (2024). The model first constructs a turbine-level power curve from plant-level characteristics and then accounts for intra-farm wake effects to generate a plant-level curve for use in simulations.

This method enables the derivation of a specific power curve for each WPP. Comparisons with turbine-level data from the WindPowerNet database show good agreement, although the generic model excludes the storm shutdown regime (Figure 2.15), which is handled separately.

Figure 2.16 illustrates the surrogate modelling process and its supported parameter space, which covers current European installations and allows for a wide range of future configurations. Air density is fixed at 1.225 kg/m³. Turbulence intensity is set at 10% for onshore and 5% for offshore simulations.

...

Figure 2.16: Overview of the methodology for estimating a plant-level power curve for each WPP, and finally simulating the power generation time series (here for the historical period). Figure is taken from Simutis et al., 2024.

Future Installations

For future onshore wind installations, turbines with specific powers ranging from 198 to 335 W/m² are used, as indicated in Swisher et al. (2022). For offshore wind, turbines with specific powers of 316 and 370 W/m² are simulated. An overview of the simulated future wind technologies is given in Table 2.4 and Table 2.5, which also list the corresponding options found in the widget "Technological specification" in the download form. Each wind technology option is labelled with a number representing a specific combination of hub height (HH) and specific power (SP). For example, "21 (SP316 HH155)" refers to offshore wind power with a specific power of 316 W/m² and a hub height of 155 m. These labels allow users to easily select the desired wind turbine specification from the dataset.

...

The power curve model, as presented in the previous section, is made available in the GitLab repository mentioned above. This allows users to generate plant-level power curves for any combination of specific power, hub height, and plant size, provided they fall within the supported range shown in Figure 2.16.


Table 2.4: Future technology of onshore wind turbines.

...

Specific Power [W/m2] | Rotor Diameter [m] | Hub Height [m] | Rated Power [MW] | Corresponding codes in the download form on CDS
316 | 269 | 155 | 18 | 21 (SP316 HH155)
370 | 249 | 155 | 18 | 22 (SP370 HH155)
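As a quick consistency check on the tabulated turbine parameters, specific power can be recomputed as rated power divided by rotor swept area. The sketch below is illustrative only; the function name `specific_power` is hypothetical and not part of the PECD tooling.

```python
import math


def specific_power(rated_power_mw: float, rotor_diameter_m: float) -> float:
    """Return specific power in W/m2: rated power divided by rotor swept area."""
    swept_area_m2 = math.pi * (rotor_diameter_m / 2.0) ** 2
    return rated_power_mw * 1e6 / swept_area_m2


# Rows taken from the table above: (label, rated power [MW], rotor diameter [m]).
for label, rated_mw, diameter_m in [("SP316 HH155", 18, 269), ("SP370 HH155", 18, 249)]:
    print(label, round(specific_power(rated_mw, diameter_m), 1), "W/m2")
```

The recomputed values fall within about 1 W/m² of the nominal 316 and 370 W/m² labels, which suggests that the labels are rounded.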


Storm Shutdown

Storm shutdown behaviour is modelled as described in Murcia et al. (2021), applying a direct (non-controlled) shutdown for all existing wind power plants (WPPs), using data from the WindPowerNet WPP installation database for the shutdown wind speeds. For future wind technologies, a 25 m/s cut-off is assumed for onshore wind installations, and the HWS (High Wind Speed) Deep type from Murcia et al. (2021) is used for future offshore wind installations (as in the PECD 2021 update). The shutdown procedure is modelled as a 'hysteresis,' where a restart occurs only after the wind speed has dropped to a sufficiently low value for a restart to take place (see Figure 2.17). The storm shutdown is a dynamic model that captures three aspects:

  1. Individual turbine shutdown and restart: each turbine experiences wind speed fluctuations that can exceed 25 m/s (10-minute mean cut-off wind speed), and whether it shuts down depends on how long the limits are exceeded, as illustrated in Figure 2.17.
  2. Plant-level shutdown: a plant does not shut down in the same manner as an individual turbine; not all turbines shut down simultaneously, as each experiences slightly different wind speeds at a given time.
  3. Restart hysteresis: restart happens only at a somewhat lower wind speed than shutdown, to prevent cycling between shutdown and restart when the wind speed hovers around the shutdown wind speed (e.g., 25 m/s). More details are provided in Murcia et al. (2021).
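The hysteresis behaviour of a single turbine can be illustrated with a minimal sketch. The 25 m/s cut-off is taken from the text; the 22 m/s restart threshold and the function name `storm_shutdown_state` are assumptions made purely for illustration, and the sketch ignores the duration-dependent limits and the plant-level smoothing described in Murcia et al. (2021).

```python
def storm_shutdown_state(wind_speeds, cut_out=25.0, restart=22.0):
    """Return a list of booleans, True where the turbine is operating.

    cut_out: wind speed (m/s) at or above which a running turbine shuts down.
    restart: hypothetical lower wind speed (m/s) at or below which a stopped
             turbine restarts; the gap between the two creates the hysteresis.
    """
    operating = True
    states = []
    for ws in wind_speeds:
        if operating and ws >= cut_out:
            operating = False  # storm shutdown
        elif not operating and ws <= restart:
            operating = True  # wind has dropped far enough to restart
        states.append(operating)
    return states


series = [20.0, 24.0, 26.0, 24.0, 23.0, 21.5, 24.0]
print(storm_shutdown_state(series))
# [True, True, False, False, False, True, True]
```

Note that the turbine stays off at 24 and 23 m/s after the shutdown, even though these speeds are below the cut-off, and resumes only once the wind falls to the restart threshold.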

...

Figure 2.17: Single-turbine storm shutdown for two storm shutdown technologies. The different shutdown limits (up to 1 s) have been considered in detailed simulations, but a simplified plant-level behaviour (Murcia et al., 2021) is used for the simulations in this service. Figure taken from (Murcia et al., 2021).



Simulated locations and wind technologies

The simulated locations and wind technologies depend on the type of run. An overview of the runs is given in Table 2.6.


Table 2.6: Wind run types.

Run type | ERA5 simulated years | Climate projection simulated years | WPP locations | WPP technology | Losses
Validation (for validation only, not delivered) | 2015 - 2022 | Not simulated | Changed every year to match changing WPP installations (based on WindPowerNet data) | Existing WPP parameters based on WindPowerNet data (changed every year), applied in the generic power curve model | Wakes as part of the generic power curve. Other losses (incl. unavailability) are applied as a simple multiplication for onshore, but as a stochastic process for offshore wind.
Existing | 1950 - near present | 2015 - 2100 | All years with 2020 WPP locations (based on WindPowerNet data) | Existing WPP parameters based on WindPowerNet data (always 2020 fleet), applied in the generic power curve model | Wakes as part of the generic power curve. Other losses (incl. unavailability) are applied as a simple multiplication for onshore, but as a stochastic process for offshore wind.
Future wind technologies | 1950 - near present | 2015 - 2100 | The best 10-50 % locations (ReGrB) of the unmasked points within each PECD region (in terms of mean wind speed in the bias-adjusted ERA5 data, based on ERA5 grid). A separate run considering only the best 10 % locations (ReGrA) is also provided. | Onshore wind: 3 hub heights and 3 turbine types, so in total 9 wind technologies. A plant of 50 MW with ten 5 MW turbines modelled for each technology. Offshore wind: 1 hub height and 2 turbine types, so in total 2 wind technologies. A plant of 500 MW with 28 turbines of 18 MW modelled for each technology. | Wakes as part of the generic power curve. Other losses (incl. unavailability) are applied as a simple multiplication for onshore, but as a stochastic process for offshore wind.


Some notes on Table 2.6:

  1. All wake modelling considers only intra-farm wake effects (no interaction between separate wind plants).
  2. Literature suggests a range of 5 % to 10 % for the other losses (Mortensen, 2018). The existing installations cover historical installations over tens of years with older technology, whereas the future installations are new installations (no wear-and-tear considered) with modern technology; it was thus considered fair to place them at the opposite sides of the loss range (5 % for new technologies and on average 10 % for existing installations).
  3. A mask is used to find the potential points for the Future wind technologies runs. This mask ("wind power regions mask") is available for download in the CDS.  Please refer to Table 4.2 and Table 4.3 for more info.
  4. Locations of existing wind power plants are not considered in the assessment of the 10-50 % best locations for each region. This is done because the decommissioning of old turbines is expected to free up more space for new installations in the future.
  5. The assumed locations of wind power plant installations within a region significantly impact the expected capacity factor on the aggregate level (Swisher et al., 2022). PECDv4.1 accounted for a single ‘resource grade’ (ReGr), which considers the 10–50% best locations; therefore, no selection option appeared on the CDS download page. In contrast, PECDv4.2 offers two separate simulations to select from, covering the 10% best locations (ReGrA) and the 10–50% best locations (ReGrB). Later versions of PECD could, in consultation with ENTSO-E, also provide simulations for the 50% worst locations or, in principle, any other distribution split between 0 and 100%. However, this would multiply the number of future wind technology time series.

In addition to the plant-level power curves, information on the existing wind power installations is required to simulate generation from the existing fleet. Data from WindPowerNet are used, with the missing technical parameters (turbine type and hub height) estimated based on the machine learning approach from Koivisto et al. (2021). Wind power plants without location or installed capacity information are removed. An overview of the installed capacities (2020 fleet) and key WPP technical parameters for onshore and offshore installations is shown in Figures 2.18 to 2.21.


...

For future wind installations, the starting point is the ERA5 grid points. Based on the exclusion layers presented in Section 2.8 (specifically, the "wind power regions mask"; please refer to Table 4.2 and Table 4.3 for more info), masking is then applied to these points to select potential future WPP locations. The potential points are shown in Figure 2.22. After selecting the 10-50% best points (referred to as ReGrB, based on 100 m mean wind speeds), the resulting final future installation simulation points can be seen for onshore wind in Figure 2.23 and for offshore wind in Figure 2.24. The selection of 10-50% best points is the average ‘resource grade’ selection following from the work done in Swisher et al. (2022), whereas the best 10 % of locations (ReGrA) represents the best wind sites.
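The resource-grade selection can be sketched as a ranking of the unmasked points of a region by mean wind speed. The sketch below assumes that ReGrA corresponds to the top 10 % of ranked points and ReGrB to the points ranked between the top 10 % and the top 50 %; this reading of "10-50 % best", the rounding of the cut points, and the function name `resource_grade_indices` are assumptions, not a description of the operational code.

```python
import numpy as np


def resource_grade_indices(mean_wind_speed, lower_frac, upper_frac):
    """Return sorted indices of points ranked between two fractions (best first).

    mean_wind_speed: 1-D array of mean wind speeds for the unmasked points of one region.
    lower_frac, upper_frac: rank fractions, e.g. (0.0, 0.1) or (0.1, 0.5).
    """
    ws = np.asarray(mean_wind_speed, dtype=float)
    order = np.argsort(-ws, kind="stable")  # indices sorted from windiest to calmest
    start = int(round(lower_frac * ws.size))
    stop = int(round(upper_frac * ws.size))
    return np.sort(order[start:stop])


speeds = np.arange(1.0, 11.0)  # ten illustrative points; index 9 is the windiest
print(resource_grade_indices(speeds, 0.0, 0.1))  # ReGrA under the stated assumption
print(resource_grade_indices(speeds, 0.1, 0.5))  # ReGrB under the stated assumption
```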

...

Figure 2.24: Offshore locations for the future technology runs for the average resource grade (10-50 % best locations, ReGrB) (left) and the best 10 % of locations (ReGrA) (right). Colouring shows the mean wind speed (m/s) of each location. The locations of existing wind power plants and the repowering of existing plants are not considered.

Aggregation to the regional level

After simulating the hourly generation for each WPP, the results are aggregated at the regional level. For existing installations, regional aggregation is weighted by the installed capacity of each WPP. For future technologies, the same weight is used for each location. From a processing point of view, temporary NetCDF files are used, but the final regional results are saved as CSV files. In addition to power generation, similar weighted regional wind speed averages are saved.
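The two weighting schemes can be expressed with a single weighted average. This is a minimal sketch with made-up numbers; the function name `regional_capacity_factor` and the array layout (plants by hours) are assumptions for illustration.

```python
import numpy as np


def regional_capacity_factor(plant_cf, weights=None):
    """Aggregate plant-level capacity factors to one regional time series.

    plant_cf: array of shape (n_plants, n_hours).
    weights:  installed capacity per plant (existing fleet), or None for
              equal weights (future technologies).
    """
    cf = np.asarray(plant_cf, dtype=float)
    if weights is None:
        weights = np.ones(cf.shape[0])
    w = np.asarray(weights, dtype=float)
    return (w[:, None] * cf).sum(axis=0) / w.sum()


cf = [[0.2, 0.4], [0.6, 0.8]]  # two plants, two hours (illustrative values)
print(regional_capacity_factor(cf, weights=[30.0, 10.0]))  # capacity-weighted
print(regional_capacity_factor(cf))                        # equal weights
```

With capacities of 30 MW and 10 MW, the larger plant dominates the regional value; with equal weights, the result is the plain mean across locations.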

Input data modifications based on measured data and TSO feedback

There is always uncertainty related to the technical wind power plant input data. While the data from www.thewindpower.net is extensive, most countries have significant missing data (most importantly hub height and turbine specific power), so it was considered reasonable to modify the inputs to some extent. For example, hub height data were missing for around 60 % of the wind power plants in Portugal, 80 % in Spain and 70 % in Italy. For specific power, the respective missing data shares were 10 %, 20 %, and 20 %. The modifications are shown below. They were presented to and agreed with ENTSO-E and the respective TSOs. The modifications lead to a better fit to measured data. They are applied only to the Validation and Existing runs.

...

Wind installation density (MW/km2) data could not be found for the individual European countries. Installation density is expected to vary based on land availability (Murcia et al., 2022), with countries with a lot of existing wind installations and relatively high population density (e.g., Germany) showing higher density of installations due to land availability constraints. The density is assumed to vary from 15 MW/km2 in Germany (by far the most existing wind installations per km2 in Europe), to 10 MW/km2 in countries with significant wind installations and generally high population density (Belgium, France, Ireland, Luxembourg, Netherlands and the UK) and to 4 or 7 MW/km2 for the other countries. The 4 MW/km2 installation density is assumed for countries with more limited wind installations or lower population density and thus significant available land (e.g., Bulgaria, Estonia and Finland), with 7 MW/km2 assumed for the rest (e.g., Austria, Denmark and Italy). These installation densities are assumed for both existing and future installations.


Photovoltaic Solar Power Conversion Model

To estimate photovoltaic (PV) capacity factors at the regional level, a flexible yet robust modelling workflow has been implemented (Saint-Drenan et al., 2018). The method starts by modelling PV output at the individual system level, taking into account the specific location and the orientation and tilt of the module’s plane-of-array (POA). This system-level output is then scaled up to regional values by averaging over a representative set of possible POA configurations, weighted according to metadata from real-world PV installations.

Temporal downscaling

The PV modelling workflow applies a two-step temporal downscaling process to solar global horizontal irradiance (GHI) and 2-metre temperature (TA), ensuring compatibility with the high temporal resolution required for accurate PV capacity factor calculations.

...

A second downscaling step is applied to both historical and projected datasets: the hourly GHI and TA values are further interpolated to a 15-minute resolution. PV capacity factors are calculated at this finer time step and then averaged back to hourly values. This process allows for better representation of diurnal solar variability and reduces artefacts that might result from simpler interpolation methods.

Inferring plane-of-array irradiance: decomposition and transposition

To estimate the irradiance on a photovoltaic module’s POA, it is necessary to convert GHI, which is measured on a horizontal surface, into irradiance on a tilted surface. This requires separating GHI into its two components—direct and diffuse solar radiation—using a decomposition model.

...

This decomposition and transposition process ensures a more realistic estimation of solar radiation on tilted surfaces, which is essential for modelling PV system performance.

PV modelling: optical losses, conversion efficiency, temperature and inverter losses

Before modelling the PV energy conversion process, optical losses due to reflection on the module’s surface must be accounted for. These are modelled using the Martin-Ruiz model (Martin & Ruiz, 2001, 2013), which relates surface reflectance to the angle of incidence and the properties of the PV module’s glazing.
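For orientation, the incidence angle modifier of the Martin-Ruiz model has a compact closed form, sketched below. The angular loss coefficient `a_r = 0.16` is a value commonly quoted for clean glass-covered modules and is used here only as an assumption; the coefficient actually applied in PECDv4.2 is not stated in this section.

```python
import math


def martin_ruiz_iam(aoi_deg: float, a_r: float = 0.16) -> float:
    """Incidence angle modifier (0 to 1) for a given angle of incidence in degrees.

    a_r is the angular loss coefficient; 0.16 is an assumed, commonly quoted value.
    """
    if aoi_deg >= 90.0:
        return 0.0  # sun at or behind the module plane: no direct transmission
    cos_aoi = math.cos(math.radians(aoi_deg))
    return (1.0 - math.exp(-cos_aoi / a_r)) / (1.0 - math.exp(-1.0 / a_r))


for aoi in (0, 30, 60, 80):
    print(aoi, round(martin_ruiz_iam(aoi), 3))
```

The modifier equals 1 at normal incidence and falls off increasingly steeply at grazing angles, which is why reflection losses matter most at low sun elevations and for steeply tilted modules.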

...

This model does not include the effects of module degradation, inverter clipping, or variability in module characteristics (e.g., efficiency, temperature coefficient). However, these simplifications are acceptable given the lack of detailed data on the PV systems installed across the PECD region. Moreover, uncertainties introduced at the plant level are significantly mitigated by the spatial averaging applied during regional aggregation.

Upscaling to Regional PV Aggregation
Modelling PV Typologies

Photovoltaics is a flexible technology that can be installed in many different contexts, which then define the characteristics of a PV installation. Compared to PECDv4.1, which did not consider different PV technologies, PECDv4.2 comprises four different implementations, here designated as typologies. Sharing the same physical modelling framework, each typology is characterised by specific module tilt, azimuth, and thermal losses.

Residential rooftops

By exploring various PV databases, it was possible to infer a first correlation between the latitude of a given location and the mean tilt of installations smaller than 9 kW (here assumed as a proxy for residential PV), which was parametrised as a set of linear functions (Figure 2.25). As it was not possible to collect data for latitudes below 42° and above 54°, the tilt is assumed to remain at the minimum and maximum values found within the covered latitudes when outside this range.

...

Rooftop PV is commonly installed directly on the roof and shares its tilt, which minimises aesthetic impact and infrastructure costs but leaves the back side of the modules with little convective cooling. This is parametrised as a Ross coefficient of 0.34 when calculating PV module temperature (Skoplaki et al., 2009), which for a sunny, hot day (ambient temperature of 30 °C and incident solar radiation of 1000 W m-2) leads to 1.8% higher thermal losses than in PECDv4.1.

Industrial Rooftops

Unlike residential rooftops, industrial rooftops are often flat, providing greater flexibility in engineering design. However, their typically high electricity demand favours installations with very low tilt angles, since it reduces the shading between rows of modules and enables a higher energy density (kWh/m2) at the expense of a lower capacity factor (kWh/kWp).

An exploratory analysis of collected empirical data, along with discussions with PV companies, suggests that these installations are typically mounted with a 10° tilt and exhibit a more varied azimuth than residential rooftops. Consequently, this typology assumes that 50% of installations are oriented southward, while 25% face east and 25% west. These installations are assumed to have lower convective cooling and, thus, the same thermal modelling as for residential rooftop installations.

Utility-scale Fixed

Module tilt and orientation data for utility-scale PV were inferred from the metadata of hundreds of installations located in France and Germany, for which we can access quality data. Although the PECD spatial coverage is considerably larger, this kind of information is deemed highly sensitive by the industry, making it uncommon to find as open data.

To circumvent the already mentioned limited geographical coverage of the collected data, the tilt was normalised by its theoretical optimum tilt (which maximises the incident irradiation), so that the resulting distribution is more generalizable in space. To calculate the optimum tilt for the PECD region, the PV generation is estimated for different tilts, always south-oriented, using ERA5 data between 2015-2020; the tilt resulting in the highest overall irradiation is selected (Figure 2.26).


...

Figure 2.26: Optimal tilt angle, which maximises annual yield, over the PECD domain calculated considering a 5-year period of ERA5 data.


Based on Figure 2.27,  which compares the optimal tilt estimated from ERA5 data with the tilt from actual installations, the utility-scale fixed PV in PECDv4.2 assumes, in each grid cell, a tilt equal to 75% of its theoretical optimum. This discrepancy is most likely due to a common engineering practice: by reducing the shadowing between rows of modules, a higher energy density – i.e., a higher kWh generated per unit of area – and lower land requirements can be achieved. The tilt ratio and orientation are visualised as a 2D histogram, which shows that both parameters can be well described by two normal distributions (Figure 2.28).


...

Figure 2.28: Empirical distribution of the PV installations' tilt and azimuth angles, with the first being relative to the local optimal tilt angle, as well as inferred normal distributions.


Utility-scale Tracking

Among the various tracking configurations, PECDv4.2 includes only horizontal single-axis tracking (HSAT), as it is currently the most widely adopted. In this setup, PV modules are mounted on a single horizontal axis aligned North-South and rotate from East to West as the day progresses. This configuration increases capacity factors, particularly in summer and when the Sun is higher in the sky. The tracking modelling, implemented using the pvlib Python library, also accounts for back-tracking, a widely used strategy that adjusts module positioning when the Sun is low to minimise shading between rows, even if it deviates from the optimal angle for energy capture. Back-tracking is computed with a ratio between the PV array area and the corresponding land use (the ground coverage ratio or, equivalently, the inverse of the axis spacing) of 0.35.

While the movement of the modules may impact their thermal performance throughout the day, for PECDv4.2 this typology was assumed to share the same assumptions as the Utility-scale fixed case.

Application of Exclusion Areas and Spatial Aggregation

Once the PV capacity factor product is generated for the PECD-constrained ERA5 grid, regional estimates for bidding and study zones are calculated by means of a spatial average. However, it is important to note that particular (restricted) areas were masked in both the grid-like and regional-based products to produce more accurate results. Specifically, sea and ocean areas (thus, offshore PV), polar and protected areas, as well as locations with high elevation (above 2000 m a.s.l.) or slope (higher than 10%) were excluded from the computation (please refer to Table 4.2 and Table 4.3 for more details). While high elevation may be unsuitable as an exclusion criterion at a global scale (notably for Chile), we found that for the PECD area, this does not pose issues in terms of final PV estimates. Figure 2.29 shows the composite exclusion mask considered for the computation of the solar photovoltaic technologies. The information to identify such regions was obtained from a range of sources: the ERA5 Land-sea mask, the Copernicus Land cover classification gridded maps, the World Database on Protected Areas (WDPA) and Other Effective Area-based Conservation Measures (OECM), and the ETOPO1 bathymetric and topographic digital elevation model.
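The masking and averaging step can be sketched as follows, using the elevation and slope thresholds quoted above. The function names, the tiny 2 x 2 grid, and the use of a plain (unweighted) mean over the retained grid cells are assumptions for illustration; the operational processing may weight cells differently.

```python
import numpy as np


def composite_exclusion(is_sea, is_polar, is_protected, elevation_m, slope_pct,
                        max_elevation_m=2000.0, max_slope_pct=10.0):
    """Return a boolean grid, True where a cell is excluded from the PV computation."""
    return (
        np.asarray(is_sea, dtype=bool)
        | np.asarray(is_polar, dtype=bool)
        | np.asarray(is_protected, dtype=bool)
        | (np.asarray(elevation_m, dtype=float) > max_elevation_m)
        | (np.asarray(slope_pct, dtype=float) > max_slope_pct)
    )


def regional_mean(cf_grid, excluded, in_region):
    """Average the capacity factor over the non-excluded cells of one region."""
    keep = np.asarray(in_region, dtype=bool) & ~np.asarray(excluded, dtype=bool)
    if not keep.any():
        return float("nan")  # region fully masked
    return float(np.asarray(cf_grid, dtype=float)[keep].mean())


no = np.zeros((2, 2), dtype=bool)
excluded = composite_exclusion(no, no, no,
                               elevation_m=[[100, 2500], [500, 800]],
                               slope_pct=[[2, 5], [12, 3]])
cf = [[0.1, 0.2], [0.3, 0.4]]
print(regional_mean(cf, excluded, np.ones((2, 2), dtype=bool)))
```

In this example one cell is removed for elevation and one for slope, so the regional value is the mean of the two remaining cells.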

...

Figure 2.29: Composite exclusion mask considered for the solar photovoltaic technologies.


Making use of typology-level data

ENTSO-E’s adequacy studies are based on the integration of the capacity factor data from the PECD with the structural data of the European power system – such as installed capacities by technology – submitted by TSOs to the Pan-European Market Modelling Database (PEMMDB). Thus, to align with the increased granularity of PECD version 4.2, TSOs began, in 2024, reporting installed PV capacity per typology. While the first data collection indicated a generally positive adoption of the new framework, it may still present challenges for TSOs, as it requires more detailed technological roadmaps. At the same time, it offers an opportunity to not only simulate pre-defined energy systems, but also to test and compare alternative technological scenarios. From a broader perspective on the potential end users of this data, complementing these typology-level timeseries with cost assumptions could support more detailed and realistic energy optimization studies.

...

This challenge has been clearly identified with the release of PECDv4.2 and has been under investigation. Future efforts will focus on a more extensive data collection and validation, both at the typology- and aggregated-level.

Improvements over Previous Methodology

PECDv4.2 introduces a significant shift in the modelling of regional-level PV timeseries. While previous versions relied on a single model, the current version acknowledges the diversity of PV implementations. It establishes a base modelling framework that incorporates specific parameterisations tailored to different installation types, ensuring a more accurate representation of PV technology variations.

In particular, it accounts for variations in tilt and azimuth angles, which affect both the daily and seasonal generation profile, as well as optical losses from reflection. Additionally, it considers ventilation conditions, which influence module temperature and, consequently, thermal losses.

Concentrated Solar Power Conversion Model

As specified in the work plan, the concentrated solar power (CSP) model developed by DTU and used in the previous version of PECD is also employed in this release. A brief description of the model is provided below.

...

  1. If the solar field generates more energy than required to operate at rated power, the surplus is stored.

  2. If the solar field generates less, the storage discharges energy to maintain rated power (see Figure 2.30).

This strategy does not require knowledge of market prices. The relationship between the solar multiple and the thermal energy storage size remains consistent with the previous PECD version (see Table 2.7). The model has been recalibrated using updated climate data.
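The dispatch rule described above can be illustrated with a simple storage balance. This is a sketch under strong simplifying assumptions that are not taken from the DTU model: hourly steps, no thermal or conversion losses, storage that starts empty, and all quantities normalised to rated power (so storage capacity is expressed in hours at rated power). The function name `dispatch_csp` is hypothetical.

```python
def dispatch_csp(field_output, rated_power=1.0, storage_hours=6.0):
    """Return hourly generation for a price-independent CSP dispatch.

    field_output:  hourly solar field output, in units of rated power.
    storage_hours: thermal energy storage size, in hours at rated power.
    """
    stored = 0.0
    generation = []
    for available in field_output:
        if available >= rated_power:
            # Surplus goes to storage; anything beyond capacity is discarded.
            stored = min(storage_hours, stored + (available - rated_power))
            generation.append(rated_power)
        else:
            # Storage discharges to keep output as close to rated power as possible.
            discharge = min(stored, rated_power - available)
            stored -= discharge
            generation.append(available + discharge)
    return generation


print(dispatch_csp([2.0, 2.0, 0.5, 0.0, 0.0]))
# [1.0, 1.0, 1.0, 1.0, 0.5]
```

In the example, two hours of surplus are enough to sustain rated output for two and a half further hours once the solar field output drops.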

...

TES (hours) | SM
0 | 1.5
3 | 1.75
6 | 2.0
9 | 2.5
12 | 2.9
18 | 3.0



Hydro Power Conversion Model

For the historical stream, the goal for the Hydropower (HP) model is to reproduce the hydropower energy indicators starting from climate data, reconstructing their time series for the historical period (1979-2022).

...

The starting point of the work is the publicly available generation data (in MW) from the ENTSO-E Transparency Platform (TP), with which the model has been trained and validated to produce results up to December 2021. The data include hydropower generation time series (at a resolution of 15 min, 30 min, or 1 hour depending on the country), Installed Capacity time series (annual), and Stored Energy (SE) time series for reservoirs (also referred to as ‘Filling Rates’) and pumped storage (at weekly resolution). Since these data are not sufficient to yield a complete dataset for simulations, two additional sources have been employed: (1) data provided directly by TSOs and (2) inflow data from the previous PECDv3.1 (see Table 2.11 for more details). The three sources were ranked by data reliability in agreement with ENTSO-E: TSO data are considered the most reliable and are given the highest priority. These data include generation and pumping time series at hourly resolution and NUT0 or PECD granularity. Some TSOs provided weekly time series of stored energy for their countries for reservoir and open-loop pumped storage technologies, which were used to estimate inflows for these technologies (see countries citing ‘TSO’ as a source under inflow columns HRI and HOL, Table 2.11). Additionally, some countries provided monthly time series of Installed Capacity (IC), which were useful to account for significant changes in generation due to new installations throughout the historical time series (this information was used for countries citing ‘rescaled using monthly IC’, Table 2.11).

Where TSO timeseries are not complete, TP data are used, with some exceptions (see section Estimating Inflows). Finally, PECDv3.1 data have been employed where TSO and TP data are not sufficient. In particular, they help complete the open-loop pumped storage inflow data, since only a few TSOs are able to share stored energy timeseries for this technology.

...

The following sections describe the statistical model, the pre-processing of input data, the validation procedure, and the use of the model to reconstruct historical data and estimate future projections. Finally, the last section describes the adopted methodology to estimate the inflows starting from the available data.

The Statistical Model

The statistical model adopted here is the Random Forest Regression model (Pedregosa et al., 2011; hereafter, the RF model), a machine learning model based on ensemble learning, which has already proved to work well at such a resolution and over such a broad domain in a previous study by Ho et al. (2020). In a preliminary comparison at the first stages of the project, the model also showed performance over France, for both HRE (Reservoirs) and HRO (Run-of-river) technologies, comparable to that of a Neural Network fed by discharge data (a model employed in the current PECD).

The Random Forest takes as input the generation (or inflow) data, namely the target variable, and some climate datasets covering the same time period, the predictors, and trains a large number of decision trees to predict the target variable starting from the predictors. In the end, it averages the answers from all the trees to obtain the model prediction. The number of trees in the ‘forest’, and their characteristics, can be adjusted by tuning several parameters.

Energy data pre-processing

Regarding the pre-processing of energy data, the hydropower generation, Installed Capacity and Stored Energy time series are extracted from a larger database for each PECD country and re-organized in multiple CSV files. TSO and PECDv3.1 data are similarly organized into analogous CSV files. Where needed, the generation data is resampled to 1h. A weekly aggregation follows, consisting of a sum of the hourly values for those weeks where at least 80% of data are available; in that case, the gaps in hourly values are filled by a simple interpolation. If more than 20% of a week's values are missing, the whole week is set to NaN. Specific checks are also made for the first values of the timeseries, as they are often unphysical, in which case they are adjusted based on adjacent values or set to NaN.
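The 80 % availability rule for the weekly aggregation can be sketched as below. The sketch assumes that the hourly series starts on a week boundary, that "simple interpolation" means linear interpolation within the week (with edge gaps filled by the nearest valid value), and that incomplete trailing weeks are dropped; the function name `weekly_sum` is hypothetical.

```python
import numpy as np

HOURS_PER_WEEK = 168


def weekly_sum(hourly, min_valid_fraction=0.8):
    """Sum hourly values per week, gap-filling weeks with enough valid data.

    Weeks with less than min_valid_fraction valid hours are returned as NaN.
    """
    values = np.asarray(hourly, dtype=float)
    n_weeks = values.size // HOURS_PER_WEEK
    weeks = values[: n_weeks * HOURS_PER_WEEK].reshape(n_weeks, HOURS_PER_WEEK)
    totals = np.full(n_weeks, np.nan)
    hours = np.arange(HOURS_PER_WEEK)
    for i, week in enumerate(weeks):
        valid = ~np.isnan(week)
        if valid.mean() < min_valid_fraction:
            continue  # too many gaps: the whole week stays NaN
        filled = np.interp(hours, hours[valid], week[valid])  # linear gap filling
        totals[i] = filled.sum()
    return totals


hourly = np.ones(2 * HOURS_PER_WEEK)
hourly[5:15] = np.nan      # week 1: 10 missing hours (about 6 %), gap-filled
hourly[168:218] = np.nan   # week 2: 50 missing hours (about 30 %), rejected
print(weekly_sum(hourly))
```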

...

Finally, while the generation can be used directly as the target variable of the RF model, the inflows must first be estimated from the available data (see section Estimating Inflows) and then modelled.

Climate data preprocessing

For the purposes of this application, the most informative variables that can be found in all climate datasets are 2-m temperature (TA [K]) and total precipitation (TP [m])

...

, which are commonly fed to hydrological models to compute river discharge. In particular, the two variables are useful when averaged (for TA) and cumulated (for TP) over multiple weeks preceding the time for which the generation or inflow is estimated. It is important, for instance, to consider the time lag between a precipitation event over a given area and the corresponding discharge reaching the hydropower plants downstream. Therefore, precipitation is cumulated over up to 30 weeks, while temperature is averaged over up to 15 weeks. Following the example of Table 2.8, if the model is used to estimate the HP generation for the week of 2015-01-05, it takes as predictors the TA and TP for that same week, as well as the TA averaged over the previous 2, 3, 4, …, 15 weeks and the TP cumulated over the previous 2, 3, 4, …, 30 weeks.

...

The datasets are aggregated at weekly resolution (summing precipitation and averaging temperature) and then the lags up to 30 weeks are calculated, meaning that values are cumulated (summed/averaged) over multiple weeks to yield several more datasets, which will be used as predictors for the RF model. At the end of this pre-processing step, one CSV file per country and climate dataset is produced.
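The construction of the lagged predictors can be sketched as follows. The sketch assumes that a window of k weeks ends at, and includes, the week being estimated; the predictor names (TA_k, TP_k) are hypothetical and do not reflect the column names of the actual CSV files.

```python
def rolling(values, window, reducer):
    """Apply reducer to the `window` most recent weeks ending at each week.

    Weeks without a full window of history yield None.
    """
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)
        else:
            out.append(reducer(values[i + 1 - window:i + 1]))
    return out


def mean(xs):
    """Arithmetic mean of a non-empty list."""
    return sum(xs) / len(xs)


def build_predictors(ta_weekly, tp_weekly, max_ta_lag=15, max_tp_lag=30):
    """Return a dict mapping predictor name to its weekly series.

    TA is averaged over 2..max_ta_lag weeks, TP is summed over 2..max_tp_lag weeks.
    """
    predictors = {"TA_1": list(ta_weekly), "TP_1": list(tp_weekly)}
    for k in range(2, max_ta_lag + 1):
        predictors[f"TA_{k}"] = rolling(ta_weekly, k, mean)
    for k in range(2, max_tp_lag + 1):
        predictors[f"TP_{k}"] = rolling(tp_weekly, k, sum)
    return predictors
```

With the limits given in the text this yields 15 temperature and 30 precipitation predictors per week; the first weeks of each series lack a full window and would have to be dropped before training.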


Model validation: Leave-One-Year-Out Validation

The model is validated separately for each SZON region and indicator, over the period of energy data availability (within 2015-2022 for TP data, 2010-2022 for TSO data, and 2010-2017 for PECDv3.1 data). The validation procedure is Leave-One-Year-Out (LOYO): the model is trained on all N available years except one (the test year) and its performance is evaluated on that test year. This is repeated N times, each time holding out a different year, until the complete estimated time series can be assembled (see Figure 2.31).
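The LOYO loop can be sketched generically as below. The `fit` and `predict` callables are placeholders for the RF training and prediction steps, and the sample layout (year, features, target) is an assumption made for illustration only.

```python
def loyo_splits(years):
    """Yield (train_years, test_year) pairs, leaving each year out once."""
    unique_years = sorted(set(years))
    for test_year in unique_years:
        train_years = [y for y in unique_years if y != test_year]
        yield train_years, test_year


def loyo_predict(samples, fit, predict):
    """Assemble out-of-sample predictions for every sample.

    samples: list of (year, features, target) tuples.
    fit(train_samples) returns a model; predict(model, features) returns a float.
    """
    predictions = [None] * len(samples)
    for train_years, test_year in loyo_splits([s[0] for s in samples]):
        train = [s for s in samples if s[0] in train_years]
        model = fit(train)
        for i, (year, features, _) in enumerate(samples):
            if year == test_year:
                predictions[i] = predict(model, features)
    return predictions
```

Because every sample is predicted by a model that never saw its year, the assembled series can be scored against the observations as a whole.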


...

For instance, for the modelled time series in Figure 2.31, the NSE value is 0.59 (as also reported in the upper left corner of the figure). The metric is calculated as one minus the ratio between the sum of the squared differences between the modelled (m) and observed (o) values at each timestep (i) and the sum of the squared deviations of the observed values from their mean, that is, the ratio of the mean squared error to the variance of the observed timeseries. If there is no difference between the modelled and observed values at any timestep, the NSE is 1 (perfect fit), which is the maximum value that can be reached. On the other hand, if there are significant differences between the two timeseries, the NSE can become negative (with no lower bound). An NSE of 0 indicates that the model has the same predictive skill as the mean of the observed timeseries in terms of the sum of squared errors.
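As an illustration, the metric can be computed with a few lines of code. The sketch below uses the standard Nash-Sutcliffe formulation, NSE = 1 - Σ(o_i - m_i)² / Σ(o_i - ō)², where ō is the mean of the observations; the function name is hypothetical.

```python
def nse(observed, modelled):
    """Nash-Sutcliffe Efficiency of a modelled series against observations."""
    if len(observed) != len(modelled) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    # Sum of squared errors between observations and model
    sse = sum((o - m) ** 2 for o, m in zip(observed, modelled))
    # Sum of squared deviations of the observations from their mean
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined for a constant observed series")
    return 1.0 - sse / spread
```

A perfect model returns 1, a model that always predicts the observed mean returns 0, and a model that is worse than the mean returns a negative value.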

RF Model Parameters

As mentioned, the Random Forest can be built by specifying several parameters. The main parameters indicated in Table 2.9 have been tuned country by country and indicator by indicator. This has been done by sampling a hyperparameter space with the Latin Hypercube Sampling algorithm to find the set able to optimize a selected metric. The hyperparameter space has been defined by assigning a range of values to each of the main RF parameters. To efficiently sample this multidimensional domain, a Latin Hypercube Sampling of 1000 samples has been performed and each sampled set of parameters has been tested via LOYO procedure to yield the score of the chosen metric. Finally, the set of parameters yielding the best score was retained and used for that specific country and indicator. 
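The sampling-and-selection step can be sketched as follows. This is a minimal hand-rolled Latin Hypercube sampler built on the standard library, not the implementation used for PECDv4.2. The parameter names and ranges in the usage note are placeholders, since the actual ranges are those of Table 2.9, and the `score` callable stands in for the LOYO evaluation of the chosen metric.

```python
import random


def latin_hypercube(n_samples, bounds, seed=None):
    """Draw n_samples parameter sets from the ranges in `bounds`.

    Each range is split into n_samples equal strata and every stratum
    is used exactly once per dimension, in a random order.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns[name] = [
            low + (s + rng.random()) / n_samples * (high - low) for s in strata
        ]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]


def best_parameters(samples, score):
    """Return the sampled parameter set with the highest score."""
    return max(samples, key=score)
```

For example, `latin_hypercube(1000, {"n_estimators": (50, 500), "max_depth": (2, 20)})` would give 1000 candidate sets over two illustrative ranges; integer-valued parameters would need rounding before being passed to the model.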

...

. However, this metric requires longer computational times and, in a few cases, produces unphysical results. Therefore, the proposed results are obtained with RF parameters optimized using NSE.


Model Validation Results

To summarize the validation results, Figure 2.32 shows maps of the NSE scores obtained for each country for the generation from reservoirs, the inflow to reservoirs, the inflows to run-of-river, the inflows to pondage, and the inflows to open-loop pumped storage. Generally, over the PECD domain, the results are satisfactory, with fairly high NSE values for most countries. This is especially the case for the inflow to reservoirs indicator (panel b), which incorporates information on the reservoir filling rates (for the countries that provide it) and is hence able to reduce the human influence on the generation signal. The generation signal without this information can be harder to reproduce with a model based on temperature and precipitation alone (see panel a). High scores are also obtained for inflows to run-of-river and pondage (panels c and d), where the signal has a more distinct seasonality and is less influenced by human intervention. The scores are generally lower for inflows to open-loop pumped storage (panel e), which are largely based on PECDv3.1 data.

...

Figure 2.32: Maps of the LOYO validation results in terms of NSE over the period of available data, which depends on the source (TSO: 2010-2022, TP: 2015-2022, PECDv3.1: 2010-2017). The five panels (a to e) each refer to a different inflow (or generation) indicator, as reported in the panels’ titles.

Modelling Historical stream

Once the model is validated, it is trained (again for each country and indicator) on all available years of generation data using the tailored sets of parameters found during the optimization procedure. The same parameters are then used to extend the HP indicator back to 1950, to have long reconstructed time series, using the ERA5 temperature and precipitation data. Figure 2.33 shows an example of a historical time series of inflow to reservoirs as estimated by the RF model for France (in blue). It also shows the ‘observed’ inflow series in grey, estimated with TP data (see section Estimating Inflows).

...

Figure 2.33: RF-reconstructed time series of inflow to reservoirs (HRI) for France (FR). The estimated series is shown in blue, while the observations (2015-2022) are in grey, starting from the dashed line.

Estimating Inflows

The RF model can reproduce generation timeseries; however, artificial regulation can significantly alter these timeseries and their seasonality, undermining the ability of temperature and precipitation to reliably reproduce the signal. This issue affects technologies involving a reservoir, especially Reservoir and Open-Loop pumped storage systems, while the effect can generally be neglected for run-of-river plants and for pondage plants, which are run-of-river plants with a limited storage capacity of no more than 24 hours.

...

This roundtrip efficiency usually depends on the design of the plant. For older designs it may be lower than 60%, while for recent ones it can be up to 90%. The efficiency suggested by ENTSO-E is 0.75 (75%), which is assumed here as the reference value over Europe. As seen in Figure 2.35 for a French Closed-Loop unit, the balance holds as the production and pumping terms are cumulated over time and the natural inflow remains null.

Inflow to Open-loop Pumping

The situation for open-loop facilities, sketched in Figure 2.36, is different, since the natural inflow component is not null and therefore constitutes a third unknown, together with the two efficiencies. The assumption that can be made is to consider the pumping and production efficiencies as equal (

...

Figure 2.36: An approximated sketch of an Open-loop system.

Inflow to Reservoirs

As for reservoirs, the pumping component is null, so the equation reduces to:

...

It must be noted that TP data have been used with caution to compute the inflow to reservoirs, since the stored energy data on the platform refer to both reservoir and pumped storage technologies. Hence, the inflow results from the TP have been retained only in a few cases, generally where the reported installed capacity for reservoirs is much greater than that for pumped storage.

Inflow to Run-of-rivers and Pondage

For Run-of-river systems, the storage term is considered null, and considering that the storage capacity of a pondage is less than 24 hours, the same is assumed for run-of-river with pondage at weekly resolution, hence reducing the equation to:

...

When possible, the two technologies are kept separate, for instance for the bidding zones whose TSO provided distinct generation time series. Data from the TP, on the other hand, are used to model run-of-river technology only if no pondage was declared for that bidding zone by the TSO and no pondage was available in the PECDv3.1 dataset. This ensures that only run-of-river is being addressed, given that the TP generation data include both technologies (labelled ‘Run-of-river and pondage’). If only run-of-river data were provided by the TSO for a given bidding zone, the run-of-river inflow was calculated from these data, while the pondage inflow was calculated from the PECDv3.1 data. Comments on these particular cases are given in the Summary Table (Table 2.11).

Finally, the same production efficiency is assumed for all technologies (

...

), however, to align with the models used by ENTSO-E to ingest the energy data, the final inflow model outputs are multiplied back by the same efficiency coefficient to obtain an inflow at the electrical grid level. Although the balance equations should yield realistic estimates, it must be noted that, without access to actual inflow observations, the above methodology cannot be fully validated.

Use of PECDv3.1 inflow estimates

If TSO and TP data were not sufficient to derive the inflow for a specific bidding zone and technology, the PECDv3.1 inflow data were used directly as the target variable for training the RF model, as indicated in Figure 2.37. This approach was used especially to model inflows to open-loop pumped storage, as only a few stored energy time series were provided by the TSOs. Therefore, there are cases in which the generation is modelled from available TSO data, while the corresponding inflow is modelled from PECDv3.1 data (because stored energy data are unavailable), which sometimes leads to inconsistencies between the two datasets. The main ones are reported in the Summary Table (Table 2.11).


Figure 2.37: Sketch of the two different approaches to model inflows: approach 1 makes use of TSO and TP data, approach 2 makes use of PECDv3.1 data.

Post-hoc corrections following TSOs’ feedback

For the produced inflow datasets of some specific technologies and regions, a multiplicative correction factor was applied to the model outputs in agreement with the TSO of interest, after validation against a reference dataset. These correction factors were hence required due to the poor quality of the public data initially used for the model training and are to be regarded as temporary adjustments ahead of a more stable solution. See Table 2.10 for an overview of the explicit multiplicative values, and the regions to which these were applied for the PECDv4.2 delivery of data.

...

| Region | Technology | Correction Factor | Source |
|---|---|---|---|
| AT00 | HRI – inflows to reservoirs | 2404/5507 | Comparison of mean maximum generation with an internal APG data source with strict sharing limitations. |
| | HRR – inflows to run of river | 23082/17760 | |
| | HPI – inflows to pondage | 5607/4506 | |
| CH00 | HOL – inflows to open-loop pumped storage | 0.825 | Comparison of mean annual cumulated inflows with a reference monthly dataset derived from Swiss Federal Office of Energy (SFOE) data. |
| | HRR – inflows to run of river | 1.39 | Comparison of mean annual cumulated inflows with a reference monthly dataset (SFOE). Mind: this factor was applied directly to the model input TSO data in accordance with the Swiss TSO. |
| TR00 | HRR – inflows to run of river (and relative IC series) | 2.502 | Comparison of mean annual cumulated inflow with an internal series of annual cumulated generation for period 2019-2023 including all country plants. |
| | HRI – inflows to reservoirs (and relative IC series) | 1.850 | Comparison of mean annual cumulated inflow with an internal series of annual cumulated generation for period 2019-2023 including all country plants. |

Summary Table

Table 2.11 includes all addressed bidding zones and technologies (except for generation from run-of-river and pondage, which would be a repetition of the respective reported inflow columns) and can be used to check the availability of data, source of data used for the modelling, and comments on the results, mainly addressing inconsistencies found or considerations made for the source/modelling choices. As mentioned, the TSO generation data have always been given priority when available, followed by TP data and PECDv3.1 estimates. Given the different data sources and methodology used, the results can significantly differ from the ones of the previous PECD, therefore we strongly recommend checking with TSOs about the reliability of mean generation/inflow historical values.

...


Reservoirs Generation

Inflow to Reservoirs

Run-of-river Inflow

Inflow to Open Loop PS

Pondage Inflow

Bidding zone / Tech.

HRG

HRI

HRO

HOL

HPO

AL00

TSO rescaled using monthly IC

TSO rescaled using monthly IC

TSO rescaled using monthly IC



AT00

TSO

TP – the mean using PECDv3.1 data is too low with respect to TSO data, hence using TP data although SE is surely affected by HPS (Hydro Pumped Storage)

TSO

PECDv3.1

TSO

BA00

TSO

PECDv3.1

PECDv3.1 – TSO run-of-river data not provided – might already be accounted for in TSO pondage data

PECDv3.1

TSO

BE00



TSO



BG00

TSO

TSO

TSO

TSO


CH00

TSO

TSO – rescaled using monthly IC

TSO - rescaled using monthly IC – multiplication factor of 1.39 applied to generation input data in accordance with CH00 TSO

TSO - rescaled using monthly IC


CZ00

TSO

PECDv3.1

TP (since there’s no pondage) – can reproduce mean signal, can’t well reproduce the peaks – suspected anthropic factors influencing the production after 2019

PECDv3.1


DE00

TSO

PECDv3.1 – mean too low with respect to TSO generation, should be ca three times higher

TSO

PECDv3.1


ES00

TSO

TSO

TSO

TSO


FI00

TSO

TSO

TP (no TSO pondage data, no PECDv3.1 pondage data)



FR00

TP

TP – HPS (pumped storage) IC about 60% of HRE (reservoirs) IC in past 8 years (from TP data) + time series very close to PECDv3.1 inflow

TP (no TSO data for FR, no pondage in PECDv3.1 data)

GPU (Generation Per Unit) - (no PECDv3.1 data for FR) - low reliability: no HOL storage energy available (approximated inflow assuming negligible storage from one week to the other) + few production and pumping data (3 years)


GR00

TSO

TSO

TSO – model training on last 4 years (missing monthly IC data to rescale) – significant difference with PECDv3.1 inflow

TSO

PECDv3.1 – even though no pondage data from TSO nor TP

HR00

TSO – very close to TP generation

TP – HPS IC about 20% of HRE IC in the past 9 years (TP data)

TSO – could contain pondage

PECDv3.1

PECDv3.1 – even though no pondage data from TSO.

HU00



TSO rescaled using monthly IC



IE00



TSO



ITCA

TSO

PECDv3.1 – reasonable values with respect to TSO generation

TSO



ITCN

TSO

PECDv3.1 – inflow sometimes lower than TSO generation

TSO



ITCS

TSO

PECDv3.1 – inflow very close to TSO generation

TSO

PECDv3.1


ITN1

TSO

PECDv3.1 – inflow very close to TSO generation

TSO

 PECDv3.1


ITS1

TSO

PECDv3.1 – inflow close to generation (would expect it a bit higher)




ITSA

TSO

PECDv3.1 – high with respect to TSO generation

TSO



ITSI

TSO

PECDv3.1 – low peaks with respect to TSO generation

TSO

PECDv3.1


LT00



TSO – generation values exceptionally high for the year 2015 (something wrong in the data) -> left out of training



LV00





TSO

LU00



TSO



ME00

TSO – close to TP generation data, higher peaks

TP – no HPS IC

PECDv3.1



MK00

TSO

TSO




NL00



 PECDv3.1



NOM1

TSO

TP – small HPS production compared to HRE

TSO

PECDv3.1


NON1

TSO

TP - no HPS

TSO



NOS1

TSO

TP – no HPS

TSO

-


NOS2

TSO

TP – trying splitting PECDv3.1 NOS0 data obtained similar result + small HPS production

TSO

PECDv3.1 (splitting PECDv3.1 NOS0 data according to mean TSO generation data for NOS2)


NOS3

TSO

TP - trying splitting PECDv3.1 NOS0 data obtained similar result + small HPS production

TSO

PECDv3.1 (splitting PECDv3.1 NOS0 data according to mean TSO generation data for NOS3)


PL00

TSO

PECDv3.1 – mean inflow value is 3-4 times higher than TSO generation (also TP-calculated mean is 3-4 times higher)

TSO - rescaled using monthly IC

PECDv3.1 – inflow seems to be too low considering TSO generation and pumping series: ca 200 MWh of inflow against 1200 MWh of generation (mean weekly values)


PT00

TSO

TSO

TSO – values seem low, TP and PECDv3.1 data ca 10 times higher than TSO data of run-of-river and HPO together

TSO - rescaled using monthly IC

TSO

RO00

TSO

PECDv3.1

TSO

PECDv3.1


RS00

TSO

PECDv3.1 – TP data significantly impacted by HPS

TSO



SE01

TSO

PECDv3.1




SE02

TSO

PECDv3.1




SE03

TSO

PECDv3.1




SE04

TSO

PECDv3.1




SI00

TSO

-

TSO – could contain pondage


PECDv3.1 – no pondage generation data from TSO: keeping PECDv3.1 trained estimates. Pondage could be included in run-of-river TSO data? In this case PECDv3.1 estimates are off.

SK00

TSO

PECDv3.1 – although mean is considerably higher than TSO generation

TSO

PECDv3.1

TSO

TR00






UK00



TP – (no TSO data for GB, no pondage in PECDv3.1 data)





Energy indicators

Energy indicators included in the PECDv4.2 dataset for the historical stream are described in Table 2.12. This table provides information for each variable, including the typology, the time period covered, the source of the input data, the domain and spatial resolution, the temporal resolution, the spatial aggregation (as specified in Table 2.1), and, where applicable, the different technologies used to compute the final time series.

...

***Inflow data from ENTSO-E PECDv3.1

Known issues

There are no known issues.

Projection stream

Projection models

Choice of models

The projection dataset in PECDv4.2 has been designed to provide robust climate and energy indicators for the entire PECD domain, extending up to the year 2100. As a first step in building this dataset, a careful selection of climate projections was carried out to identify the most appropriate subset for energy-sector applications. 

...

Rather than relying on complex performance-based metrics to evaluate how well each model reproduces historical climate conditions, the selection was primarily guided by Equilibrium Climate Sensitivity (ECS) values. This criterion, also used in IPCC AR6, allows for selecting a representative ensemble that spans the range of projected climate sensitivity, including models with higher sensitivity to capture "low-likelihood, high-impact" futures. The selection also aimed to reduce redundancy by minimising model overlap (i.e., models developed with similar components or structures). The results are presented in Table 3.1.


Table 3.1: Models are colour-coded based on exclusion criteria: dark red indicates models that do not provide all required scenarios; orange highlights models with Equilibrium Climate Sensitivity (ECS) values outside the range assessed in the IPCC AR6; yellow marks models that share components with others in the ensemble. The models that were retained for PECDv4.2 are highlighted in bold.

...

The final selection of models and their characteristics is reported in Table 3.2.


Table 3.2: CMIP6 models used in the projections stream and their corresponding characteristics and nodes for downloading.
The models and scenarios indicated in bold are the ones that have been introduced in PECDv4.2, while the other ones were already present in PECDv4.1.

...

Note that the historical simulation period is chosen to ensure overlap between ERA5 and the CMIP6 models, enabling the computation of bias adjustment.

 


Data retrieval

CMIP6 variables (for each model) are downloaded from the ESGF nodes using a Python script that relies on a dedicated Python API. The script takes a single argument, a configuration file containing the desired tags for the download, and is used for both historical and projection data. Table 3.2 lists the nodes from which each model has been downloaded. The selected CMIP6 climate models are also available in the C3S catalogue; however, the high temporal resolution (namely, 3-hourly) needed to produce the PECD database was not available there. For this reason, the CMIP6 model outputs have been collected via the ESGF nodes.


Spatial interpolation

Starting from a common 100 km nominal spatial resolution and global domain, each model has its own grid, necessitating spatial interpolation to the PECD domain at 0.25° x 0.25°. This interpolation uses the bilinear method as implemented in the CDO

...

A Python script iterates over the files and, using the os library, calls the CDO command line for each file. Another Python script in the pre-processing pipeline checks the output files for missing (Not A Number, NaN) and anomalous values, and reformats them according to ERA5 conventions.
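The file loop can be sketched as below. The sketch uses the CDO operator `remapbil`, which performs bilinear remapping onto the grid described by a grid description file. It calls CDO through `subprocess` rather than the os library mentioned above, a substitution made here for safer argument handling. The directory layout, the grid file name and the function names are hypothetical.

```python
import subprocess
from pathlib import Path


def remapbil_command(grid_file, in_file, out_file):
    """Build the CDO call for bilinear remapping onto the grid in grid_file."""
    return ["cdo", f"remapbil,{grid_file}", str(in_file), str(out_file)]


def regrid_all(in_dir, out_dir, grid_file, run=subprocess.run):
    """Regrid every NetCDF file found in in_dir; return the output paths.

    `run` is injectable so that the loop can be exercised without CDO installed.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    outputs = []
    for in_file in sorted(Path(in_dir).glob("*.nc")):
        out_file = out_dir / in_file.name
        # check=True raises if CDO exits with an error
        run(remapbil_command(grid_file, in_file, out_file), check=True)
        outputs.append(out_file)
    return outputs
```

A call such as `regrid_all("raw", "regridded", "pecd_grid.txt")` would process all files in a hypothetical `raw` directory, assuming `pecd_grid.txt` describes the 0.25° x 0.25° PECD grid.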


Temporal aggregation and interpolation

As stated in Section 3.1, one of the selection criteria for projection models is the finest available temporal resolution (3 hours). However, it is necessary to apply temporal interpolation to achieve the required hourly resolution for the PECDv4.2 database. Table 3.3 shows the method used to temporally interpolate each variable.

...

Note that the SSP scenario files start at 03:00. To obtain files that follow the ERA5 conventions, with 00:00 as the first hour of the projections, it is therefore necessary to also use the last day of the historical scenario. Figure 3.1 contains a validation of this method considering the TA variable at a generic point of the PECD domain.

...

The SG2 library can be installed via "pip" in any Python environment. The detrended Kt time series is then downscaled to an hourly resolution using linear interpolation. The data is subsequently reconverted to GHI by multiplying it with an hourly-averaged TOA value. Figure 3.2 shows a validation plot for this procedure, computed at a generic point within the PECD domain.
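The interpolation of GHI through the clearness index can be sketched as follows. The sketch assumes that Kt is the ratio of GHI to TOA irradiance, that night-time values (TOA close to zero) are set to zero, and that the 3-hourly and hourly series are aligned on their first timestamp. The TOA series are taken as given inputs here, whereas in the procedure described above they are computed with the SG2 library. All function names are hypothetical.

```python
def clearness_index(ghi, toa, eps=1e-6):
    """Kt = GHI / TOA, set to 0 where TOA is (near) zero, i.e. at night."""
    return [g / t if t > eps else 0.0 for g, t in zip(ghi, toa)]


def interpolate_linear(values, factor):
    """Linearly interpolate between consecutive samples.

    `factor` - 1 points are inserted in each interval, so n samples
    become factor * (n - 1) + 1 samples.
    """
    out = []
    for a, b in zip(values[:-1], values[1:]):
        for step in range(factor):
            out.append(a + (b - a) * step / factor)
    out.append(values[-1])
    return out


def ghi_hourly(ghi_3h, toa_3h, toa_1h):
    """Downscale 3-hourly GHI to hourly values via the clearness index."""
    kt_1h = interpolate_linear(clearness_index(ghi_3h, toa_3h), 3)
    if len(kt_1h) != len(toa_1h):
        raise ValueError("hourly TOA series does not match the interpolated Kt")
    return [k * t for k, t in zip(kt_1h, toa_1h)]
```

Interpolating Kt rather than GHI itself keeps the diurnal cycle of the hourly TOA irradiance in the result, which a direct linear interpolation of GHI would flatten.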

...

The required variable for precipitation is total precipitation (TP), which has been derived from the precipitation flux (in kg m⁻² s⁻¹), the original data format for CMIP6 projection models. Since energy models require daily cumulative data, the downloaded precipitation flux data was first resampled to daily averages using the xarray.resample().mean() method. This daily average was then multiplied by 86.4 to convert the data into daily precipitation in meters. 
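The factor of 86.4 follows from the units: a flux of 1 kg m⁻² s⁻¹ of water corresponds to a 1 mm layer per second, so the daily mean flux multiplied by 86 400 s gives millimetres per day, and dividing by 1000 gives metres per day. A minimal sketch in plain Python is given below. It assumes 8 three-hourly values per day and is not the xarray-based production code; the function name is hypothetical.

```python
SECONDS_PER_DAY = 86_400
KG_M2_TO_M = 1.0e-3  # 1 kg m-2 of water is a 1 mm layer, i.e. 0.001 m
FLUX_TO_DAILY_M = SECONDS_PER_DAY * KG_M2_TO_M  # = 86.4


def daily_precipitation_m(flux_3h, steps_per_day=8):
    """Convert 3-hourly precipitation flux [kg m-2 s-1] to daily totals [m]."""
    if len(flux_3h) % steps_per_day:
        raise ValueError("series must contain whole days")
    totals = []
    for start in range(0, len(flux_3h), steps_per_day):
        day = flux_3h[start:start + steps_per_day]
        daily_mean_flux = sum(day) / len(day)
        totals.append(daily_mean_flux * FLUX_TO_DAILY_M)
    return totals
```

For instance, a constant flux of 1e-4 kg m⁻² s⁻¹ over one day gives 0.00864 m, i.e. 8.64 mm.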


Bias-adjustment procedure

For the projection stream, two bias adjustment methodologies have been implemented for the CMIP6 projection datasets. These methodologies are:

...

CDFt Method: This method is used for variables with a strong climate-change-related trend, such as temperature. To correctly account for the trend, a 20-year time series is considered for the calculation of the CDFs, with only the central 10-year window taken as the adjusted data. The 20-year timeframe is then moved forward, yielding a new 10-year central window that partially overlaps the window of the previous step. Despite wind speed and precipitation (WS10 and TP) not exhibiting a strong climate change trend, their correction is also based on the CDFt method. This is because the mean factors in the Delta method could potentially lead to negative (and therefore unphysical) values. For these variables, given the lack of a strong climatic trend, the CDFt considers a ‘static’ 20-year time series.

Figures 3.3, 3.4 and 3.5 illustrate the logic blocks of the bias-adjustment procedure applied to the 2 m temperature (TA), to the total daily precipitation (TP) and 10 m wind speed (WS10), and to the surface solar radiation (GHI), respectively.

...

 Figure 3.5: Details of the bias-adjustment logic block for the projection global horizontal irradiance (GHI) using the Delta Adjustment method.

Climate indicators

Table 3.4 lists the climate indicators for the projection stream. The final domain and spatial resolution, as well as the final temporal resolution, are obtained through preprocessing as described in Section 3.3 and Section 3.4, respectively. The bias adjustment has been applied using the procedures detailed in Section 3.5. Since wind speed at 100 m above the ground is not available for the CMIP6 projection models, and to maintain consistency between the wind speed at 100 m in the historical (ERA5) and the projection datasets, the wind speed at 100 m is calculated using the near-surface (10 m) wind speed of the CMIP6 projection models together with the Alpha Coefficient (or power law) derived from the ERA5 reanalysis (see Section 2.2 for more details). For the projection stream, the computation of TAW and the spatial aggregation follow the same methodologies described for the historical stream (see Sections 2.4 and 2.5, respectively). It is important to note that all variables are bias-adjusted except for TAW and WS100, because they are both derived from bias-adjusted variables (TA and WS10, respectively).
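The power-law extrapolation can be illustrated with a short sketch. It uses the standard form ws(z) = ws(z_ref) · (z / z_ref)^α. The way the Alpha Coefficient is derived from ERA5 is described in Section 2.2; the inverse relation shown here, based on a pair of 10 m and 100 m wind speeds, is an assumption for illustration. The function names are hypothetical.

```python
import math


def ws100_from_ws10(ws10, alpha, z_ref=10.0, z_hub=100.0):
    """Extrapolate wind speed from z_ref to z_hub with the power law."""
    return ws10 * (z_hub / z_ref) ** alpha


def alpha_from_speeds(ws10, ws100, z_ref=10.0, z_hub=100.0):
    """Power-law exponent implied by a pair of wind speeds (assumed derivation)."""
    return math.log(ws100 / ws10) / math.log(z_hub / z_ref)
```

With an illustrative exponent of 1/7, a 10 m wind speed of 5 m s⁻¹ corresponds to roughly 6.9 m s⁻¹ at 100 m.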


Table 3.4: Climate indicators provided in the PECDv4.2 for the projection stream. Files provided at the BIAS spatial aggregation level (specifically, bias-adjusted data; see Table 2.1 for further info) are gridded (NetCDF format), while all the other levels of aggregation are provided in a CSV format. Changes that were implemented in PECDv4.2 are highlighted in bold (extended time period, additional climate projection models and climate scenarios, new spatial aggregation over selected cities).

| Variable | Period | Source | Models | Scenario | Domain / spatial resolution | Temporal resolution | Spatial aggregation | Units |
|---|---|---|---|---|---|---|---|---|
| 2m temperature (TA) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD / 0.25° x 0.25° | hourly | BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF, CITY | K (gridded); °C (aggregated) |
| Population-weighted temperature (TAW) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD / 0.25° x 0.25° | hourly | SZON | °C |
| Total precipitation (TP) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD / 0.25° x 0.25° | daily | BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m |
| Surface solar radiation downwards (GHI) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD / 0.25° x 0.25° | hourly | BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF | W m⁻² |
| 10m wind speed (WS10) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD / 0.25° x 0.25° | hourly | BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m s⁻¹ |
| 100m wind speed (WS100) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD / 0.25° x 0.25° | hourly | BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m s⁻¹ |

  

Energy data

The same data illustrated in Section 2.7 are also used for the projection stream. 

Energy Conversion models

Wind Power Conversion Model

The climate data used as input are listed in Table 3.4, and the procedure is the same as described in Section 2.9.1.

The simulated locations and wind technologies depend on the type of run. An overview of the runs is given in Table 3.5.


Table 3.5: Wind run types for the projection stream. Changes that were implemented in PECDv4.2 are highlighted in bold (specifically, the extended time period).

| Run type | Climate projection simulated years | WPP locations | WPP technology | Losses |
|---|---|---|---|---|
| Existing | 2015-2100 | All years with 2020 WPP locations (based on WindPowerNet data) | Existing WPP parameters based on WindPowerNet data (always 2020 fleet), applied in the generic power curve model | Wakes as part of the generic power curve, plus 10 % for other losses (incl. unavailability), applied as a simple multiplication by 0.9 |
| Future wind technologies | 2015-2100 | The best 10-50 % locations of the unmasked points within each PECD region (in terms of mean wind speed in the bias-adjusted ERA5 data, based on ERA5 grid). | Onshore wind: 3 hub heights and 3 turbine types, so in total 9 wind technologies. A plant of 50 MW with ten 5 MW turbines modelled for each technology. Offshore wind: 1 hub height and 2 turbine types, so in total 2 wind technologies. A plant of 500 MW with 28 turbines of 18 MW modelled for each technology. | Wakes as part of power curves, plus 5 % for other losses (incl. unavailability), applied as a simple multiplication by 0.95 |


Photovoltaic Solar Power Conversion Model

The climate data used as input are listed in Table 3.4, and the procedure is the same as described in Section 2.9.2.

Concentrated Solar Power Conversion Model

The climate data used as input are listed in Table 3.4, and the procedure is the same as described in Section 2.9.3.

Hydro Power Conversion Model

The climate data used as input are listed in Table 3.4, and the procedure is the same as described in Section 2.9.4.

Energy indicators

For the projection stream, the same energy indicators described for the historical stream (see Section 2.10) were computed starting from the climate indicators listed in Table 3.4.

Table 3.6 summarizes the energy indicators and provides detailed information for each variable, including the type, the covered time period, the source of the input data, the domain and spatial resolution, the temporal resolution, the spatial aggregation (according to Table 2.1), and, where applicable, the different technologies used to compute the final time series.

...

**Inflow data from ENTSO-E PECDv3.1

Appendix

Filenames convention and characteristics

This paragraph aims to explain the filename convention of the PECD datasets. Table 4.1 details the structure and possible fields of the filenames. Specifically, the last column indicates the corresponding section of the CDS catalogue where users can personalize their choice. If "Not applicable" is indicated, it means that the user cannot modify this field, and the data are downloaded with fixed characteristics that are not customizable. Table 4.2 details the structure and filenames of the ancillary NetCDF files that have been used for PECDv4.2 and that are available in the CDS under the widget "Weights and masks".

...

| Filename | Variable | Grid | Description | Corresponding name in the widget "Weights and masks" |
|---|---|---|---|---|
| ANCI_CITY-coords_PECD4.2_fv1.csv | - | - | List of cities and their corresponding coordinates. See Section 2.5.1 for more details. | City coordinates |
| ANCI_LAT-mask_PECD4.2_fv1.nc | lat_weights(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the cosine of its latitude. See Section 2.5.3 for more details. | Latitude weights |
| ANCI_SZON-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude); level (region) | For each level (region in SZON), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell's area that lies within the region. See Section 2.5.2 for more details. | SZON regions mask |
| ANCI_SZOF-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude); level (region) | For each level (region in SZOF), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell's area that lies within the region. See Section 2.5.2 for more details. | SZOF regions mask |
| ANCI_PEON-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude); level (region) | For each level (region in PEON), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell's area that lies within the region. See Section 2.5.2 for more details. | PEON regions mask |
| ANCI_PEOF-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude); level (region) | For each level (region in PEOF), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell's area that lies within the region. See Section 2.5.2 for more details. | PEOF regions mask |
| ANCI_NUT0-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude); level (region) | For each level (region in NUT0), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell's area that lies within the region. See Section 2.5.2 for more details. | NUTS 0 regions mask |
| ANCI_NUT2-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude); level (region) | For each level (region in NUT2), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell's area that lies within the region. See Section 2.5.2 for more details. | NUTS 2 regions mask |
| ANCI_WPM-mask_PECD4.2_fv1.nc | m_rest(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains a boolean value: 1 indicates that the cell is unsuitable for potential future wind power installations, while 0 indicates that the cell could potentially be used as a site for such installations. See Section 2.8 for more details. | Wind power regions mask |
| ANCI_PVM-mask_PECD4.2_fv1.nc | PVmask(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains a boolean value: 1 indicates that the cell is unsuitable for potential future solar photovoltaic power installations, while 0 indicates that the cell could potentially be used as a site for such installations. See Section 2.8 for more details. | Solar PV mask |
| ANCI_ALP-coef_PECD4.2_fv1.nc | alpha(time, latitude, longitude) | PECD domain (latitude, longitude); levels (time) | For each level (time), every grid cell contains the power law's alpha coefficient. Each grid cell contains 12*24 alpha coefficients in total, one for each month of the year and each hour of the day. See Section 2.2.1 for more details. | Power law coefficients |
| ANCI_POP-mask_PECD4.2_fv1.nc | population_mask(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the number of people living in that area. See Section 2.4.1 for more details. | Population density mask |
| ANCI_WS10G2-mean_PECD4.2_fv1.nc | ws10(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the mean value of the 10 m wind speed from GWA2 computed over the period 2006-2018. This file is used in the bias adjustment procedure described in Section 2.3.3. | Climatology of GWA2 10 m wind speed |
| ANCI_WS10E5-mean_PECD4.2_fv1.nc | ws10(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the mean value of the 10 m wind speed from ERA5 computed over the period 2006-2018. This file is used in the bias adjustment procedure described in Section 2.3.3. | Climatology of ERA5 10 m wind speed |
| ANCI_WS100G2-mean_PECD4.2_fv1.nc | ws100(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the mean value of the 100 m wind speed from GWA2 computed over the period 2006-2018. This file is used in the bias adjustment procedure described in Section 2.3.3. | Climatology of GWA2 100 m wind speed |
| ANCI_WS100E5-mean_PECD4.2_fv1.nc | ws100(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the mean value of the 100 m wind speed from ERA5 computed over the period 2006-2018. This file is used in the bias adjustment procedure described in Section 2.3.3. | Climatology of ERA5 100 m wind speed |
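
To illustrate how the fractional region masks and the latitude weights in Table 4.2 might be used together, the sketch below computes an area-weighted mean of a gridded field over one region. It is a generic weighted average under stated assumptions, not necessarily the exact aggregation procedure used for PECD (see Sections 2.5.2 and 2.5.3). The arrays are small synthetic examples rather than values read from the ancillary files, and the function name `regional_mean` is hypothetical.

```python
import numpy as np


def regional_mean(field, region_fraction, lat_weights):
    """Area-weighted mean of `field` over one region.

    field           : 2-D array (latitude, longitude) of the variable
    region_fraction : 2-D array, fraction (0 to 1) of each cell inside the region
    lat_weights     : 2-D array, cosine of the latitude of each cell
    """
    weights = region_fraction * lat_weights
    total = weights.sum()
    if total == 0:
        raise ValueError("the region does not overlap the grid")
    return float((field * weights).sum() / total)


# Synthetic 2 x 3 grid (two latitude rows, three longitude columns).
latitudes = np.array([60.0, 30.0])
lat_weights = np.cos(np.deg2rad(latitudes))[:, None] * np.ones((2, 3))

# One slice mask[region, :, :] of a regions mask: 1 = fully inside,
# 0 = outside, anything in between = partial overlap.
region_fraction = np.array([[1.0, 0.5, 0.0],
                            [1.0, 0.0, 0.0]])

field = np.array([[10.0, 20.0, 30.0],
                  [40.0, 50.0, 60.0]])

print(regional_mean(field, region_fraction, lat_weights))
```

Cells with a mask value of 0 drop out of both the numerator and the denominator, so only cells that overlap the region contribute, each in proportion to its overlap fraction and to the cosine of its latitude.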


Metadata

The header of the CSV files contains the following metadata descriptors. An example is presented below for the 2 m air temperature variable:

...

### The original data sources are ECMWF ERA5 Reanalysis (available at: https://cds.climate.copernicus.eu
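
A minimal sketch of how such a file could be read is given below, separating the metadata header from the data rows. It assumes that every metadata line starts with the `#` character, as the example line above suggests. The sample text is entirely synthetic (its column names and values are placeholders, not real PECD content), and the function name `split_header_and_rows` is hypothetical.

```python
import csv


def split_header_and_rows(text, marker="#"):
    """Return (metadata_lines, data_rows) from the text of a CSV file.

    Lines starting with `marker` are treated as metadata (an assumption);
    all other non-empty lines are parsed as comma-separated data.
    """
    metadata, data_lines = [], []
    for line in text.splitlines():
        if line.startswith(marker):
            # Drop the leading marker characters and surrounding spaces.
            metadata.append(line.lstrip(marker).strip())
        elif line.strip():
            data_lines.append(line)
    rows = list(csv.reader(data_lines))
    return metadata, rows


# Synthetic sample, for illustration only.
sample = (
    "### Example metadata line\n"
    "### Another metadata line\n"
    "Date,AA\n"
    "2000-01-01 00:00:00,1.5\n"
)

metadata, rows = split_header_and_rows(sample)
print(metadata)
print(rows)
```

Keeping the metadata lines rather than discarding them preserves the provenance information stored in the header alongside the data.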

How to cite the data

Please refer to the "References" section on the catalogue entry page of this dataset in the Climate Data Store (CDS) as it provides the DOI number as well as details on dataset citation and attribution. 

References

Beyer, H. G., Heilscher, G., and Bofinger, S.: "A robust model for the MPP performance of different types of PV-modules applied for the performance check of grid connected systems", EuroSun 2004 conference, pp. 3064-3071, Germany, June 2004.

...


This document has been produced in the context of the Copernicus Climate Change Service (C3S).

The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.

The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which merely represents the author's view.
