...
History of modifications
List of datasets covered by this document
Acronyms and abbreviations
Introduction
This document presents the technical methodologies and implementation details of the climate and energy indicators included in the Pan-European Climate Database version 4.2 (PECDv4.2). Developed under the Copernicus Climate Change Service (C3S) Energy service, PECDv4.2 has been produced in close collaboration with the European Network of Transmission System Operators for Electricity (ENTSO-E).
...
Files are provided in both NetCDF and CSV formats. For details on the format used for each variable, refer to Table 2.2, Table 2.12, Table 3.4 and Table 3.6.
Descriptions of file naming conventions can be found in Table 4.1, while Table 4.2 and Table 4.3 detail the ancillary NetCDF datasets available via the "Weights and masks" widget.
Note |
---|
Please note: this documentation refers exclusively to PECDv4.2. The previous version, PECDv4.1, has been discontinued and will not be extended beyond 2021, as its datasets were frozen ahead of the 2023 European Resource Adequacy Assessment (ERAA), in agreement with ENTSO-E. |
Key updates introduced in PECDv4.2
An overview of all the changes and updates that have been implemented in PECDv4.2 compared to PECDv4.1 can be found at the following page:
Climate and energy related variables from the Pan-European Climate Database versions comparison
Workflows
The workflows form the backbone of the PECDv4.2 system, integrating all key components of the data processing chain. Separate workflows have been developed for the two data streams – historical (Figure 1.1) and projections (Figure 1.2). Each workflow covers the generation of both climate and energy indicators, which serve as the foundation for data production, monitoring, and delivery.
...
Figure 1.1: Workflow for the historical stream. All acronyms used in the workflow are listed in a dedicated section entitled "Acronyms and abbreviations" and located at the beginning of this documentation.
...
Figure 1.2: Workflow for the projection stream. All acronyms used in the workflow are listed in a dedicated section entitled "Acronyms and abbreviations" and located at the beginning of this documentation.
Historical stream
Data retrieval
The workflow illustrating the historical stream is shown in Figure 1.1. ERA5 data from the Copernicus Climate Data Store (CDS) is retrieved via the CDS API (Application Programming Interface), which requires prior installation of Python and the CDS API client. Data is downloaded in monthly chunks by specifying the desired variables and time period.
...
The historical stream of PECDv4.2 includes the following climate indicators: 2 m temperature (TA), population-weighted temperature (TAW), total precipitation (TP), surface solar radiation downwards, 10 m wind speed (WS10) and 100 m wind speed (WS100). Detailed descriptions of these indicators are provided in Section 2.6. Notably, the surface solar radiation downwards corresponds to the global horizontal irradiance (GHI) and is downloaded in hourly values in J m⁻² and converted to W m⁻² by dividing by 3600 seconds.
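The unit conversion described above can be sketched as follows. This is a minimal illustration, not the production script: the function name `jm2_to_wm2` and the sample values are hypothetical.

```python
import numpy as np

SECONDS_PER_HOUR = 3600.0

def jm2_to_wm2(ssrd_j_per_m2):
    """Convert hourly accumulated radiation (J m-2) to a mean flux (W m-2)."""
    return np.asarray(ssrd_j_per_m2, dtype=float) / SECONDS_PER_HOUR

# Hypothetical hourly accumulations in J m-2
ghi = jm2_to_wm2([0.0, 1.8e6, 3.6e6])
```

Because each ERA5 value is an accumulation over one hour, the result is the mean irradiance over that hour rather than an instantaneous value.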
...
This calculation is implemented using a Python script. Additional guidance is available via the CDS documentation, e.g., ERA5: How to calculate wind speed and wind direction from u and v components of the wind?.
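For illustration, the wind speed magnitude can be obtained from the zonal and meridional components as sqrt(u² + v²). The sketch below assumes NumPy arrays as input; the function name `wind_speed` and the sample values are hypothetical and do not reproduce the PECD script.

```python
import numpy as np

def wind_speed(u, v):
    """Wind speed magnitude (m s-1) from the u and v components (m s-1)."""
    return np.hypot(u, v)

# Hypothetical component values
ws = wind_speed(np.array([3.0, 0.0]), np.array([4.0, -2.0]))
```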
The power law for wind vertical extrapolation
Wind speed outputs from numerical weather prediction models and climate simulations are typically available at fixed vertical levels, most commonly at 10 m above ground level. For example, the CMIP6 climate projections only provide near-surface wind data. To estimate wind speeds at turbine-relevant heights (e.g., 100 m), vertical extrapolation is necessary. This is achieved using a power law, which expresses wind shear through a dimensionless coefficient known as Alpha (α).
This coefficient enables the conversion of 10 m wind speeds to other heights by accounting for localised vertical wind profiles, as represented in models like ERA5. Temporal variability in wind shear is also considered by stratifying Alpha values by time of day and month, ensuring more accurate height scaling for energy applications.
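A minimal sketch of the power-law extrapolation, assuming the standard form v(z) = v_ref · (z / z_ref)^α. The function name `extrapolate_wind`, its default heights and the sample values are hypothetical.

```python
import numpy as np

def extrapolate_wind(ws_ref, alpha, z_ref=10.0, z_target=100.0):
    """Scale wind speed from z_ref to z_target with the power law."""
    ws_ref = np.asarray(ws_ref, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    return ws_ref * (z_target / z_ref) ** alpha

# Hypothetical 10 m wind speeds (m s-1) and shear coefficients
ws100 = extrapolate_wind([5.0, 8.0], alpha=[0.0, 0.3])
```

With α = 0 the wind speed is unchanged with height, while a positive α increases it; in practice the α value would be looked up for the month and hour of each time step.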
Alpha computation
The Alpha coefficient was derived using ERA5 wind data from the CDS, specifically the zonal (u) and meridional (v) wind components at both 10 m and 100 m heights, as described in Section 2.1.
The data span an 11-year period from 2011 to 2021, at hourly resolution. This time window was selected as it reflects the most recent and reliable observations assimilated in ERA5, and provides a statistically robust basis for calculating vertical wind shear.
...
The result is a set of Alpha values stratified across 24 hourly intervals and 12 months, capturing diurnal and seasonal variations in vertical wind shear. The final Alpha dataset is stored in NetCDF format and made available via the CDS. For more information, please refer to Table 4.2 and Table 4.3.
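The stratification by hour and month can be sketched as below. The sketch inverts the power law, α = ln(WS100 / WS10) / ln(100 / 10), and then averages α per (month, hour) bin. Averaging the hourly α values is an assumption made here for illustration; the function names and the synthetic data are hypothetical.

```python
import numpy as np

def alpha_from_speeds(ws10, ws100, z_low=10.0, z_high=100.0):
    """Invert the power law to obtain the shear coefficient."""
    return np.log(ws100 / ws10) / np.log(z_high / z_low)

def stratify_by_month_hour(times, alpha):
    """Mean alpha per (month, hour): array of shape (12, 24), NaN if empty."""
    months = times.astype("datetime64[M]").astype(int) % 12   # 0 = January
    hours = times.astype("datetime64[h]").astype(int) % 24    # hour of day (UTC)
    total = np.zeros((12, 24))
    count = np.zeros((12, 24))
    np.add.at(total, (months, hours), alpha)
    np.add.at(count, (months, hours), 1)
    with np.errstate(invalid="ignore", divide="ignore"):
        return total / count

# Synthetic example: two days of hourly data with a constant shear of 0.2
times = np.arange("2011-01-01T00", "2011-01-03T00", dtype="datetime64[h]")
ws10 = np.full(times.size, 5.0)
ws100 = ws10 * 10 ** 0.2
table = stratify_by_month_hour(times, alpha_from_speeds(ws10, ws100))
```

The formula returns negative values whenever the 100 m wind speed is lower than the 10 m wind speed.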
Alpha characterization
The diurnal and seasonal variability of the Alpha coefficient across the PECD domain is illustrated in Figure 2.1, which shows the mean Alpha value calculated for each hour and month. These results align with known atmospheric behaviour and prior studies: higher Alpha values occur during the colder, more stable night-time hours, whereas lower values are observed during the daytime, when the atmospheric boundary layer is typically well mixed. Similarly, during winter months, Alpha tends to be higher than in summer, particularly in the central (and warmer) hours of the day.
However, a more nuanced picture emerges when examining the spatial and temporal distribution of Alpha across the domain. Figure 2.2 presents box plots of Alpha values for each hour, aggregated over all grid points in the PECD domain. The plots highlight a wider interquartile range during night-time hours, indicating greater variability in wind shear under stable atmospheric conditions. Notably, the Alpha coefficient can also reach negative values (as low as -0.4), particularly at night, reflecting instances where wind speed decreases with height — a phenomenon associated with specific meteorological conditions.
...
Figure 2.2: Hourly distribution of the Alpha wind shear coefficient across the PECD domain, represented as box plots for each hour (UTC). Boxes show the interquartile range (25th–75th percentile), while whiskers and outliers highlight the spatial variability. Larger spreads at night reflect more variability under stable conditions.
The bias adjustment of the ERA5 wind speed
Bias adjustment refers to the process of statistically transforming climate model data to reduce systematic differences between a simulated climate and a reference dataset, usually based on observations, over the historical period. Bias adjustment has become a standard pre-processing step for climate impact studies to adjust climate model output that will drive application models, such as energy models. This is the case for wind speed, a key variable to derive wind power. Specifically, wind power computation depends non-linearly on wind speed (precisely, on its cube). Therefore, significant biases in wind speed can markedly affect the wind energy indicator.
...
Previous evaluations of ERA5 wind speed showed that ERA5 tends to underestimate the intensity of wind speed in most land areas in Europe, except in the North East, while it overestimates wind speed over the sea, particularly in the North Sea and along certain coastlines, such as Southern Norway or Portugal. For this reason, a bias adjustment of ERA5 wind fields is needed. Compared to PECDv4.1, in PECDv4.2 a new methodology was designed to bias-adjust ERA5 wind speeds using the Global Wind Atlas (Davis et al., 2023) as the reference dataset, which is presented in Section 2.3.1, and by applying the Delta Adjustment method, which is described in Section 2.3.2.
Before applying bias adjustment, preliminary corrections (see Section 2.3.3, pre-processing) are also performed on the ERA5 wind speed dataset to address known issues. The effectiveness of these corrections is monitored using four control boxes located in representative regions, as illustrated in Figure 2.3.
The Global Wind Atlas
The Global Wind Atlas
...
The Global Wind Atlas dataset is created through a downscaling process that begins with large-scale wind climate data (for example, reanalysis) and ends with microscale wind climate data. The dataset combines information from mesoscale and microscale models, as well as from in situ observational sites, to provide refined and verified estimates of mean wind speed at relevant hub heights and at a high horizontal resolution. The WAsP software (Floors and Nielsen, 2019) performs the downscaling and lastly computes local wind climates every 250 m at five heights (10 m, 50 m, 100 m, 150 m, and 200 m) all over the globe, excluding the North and South Poles and offshore areas beyond 20 km. In PECDv4.2, the Global Wind Atlas version 2 (GWA2), which relies on the ERA-Interim reanalysis as input data, was used to bias-adjust the ERA5 wind speeds.
The Delta Adjustment method
To reduce biases in climate models, different bias-adjustment methodologies exist. To adjust the ERA5 wind speed at 10 m and 100 m height, the Delta Adjustment method was selected. It is one of the simplest and least computationally demanding methods: it applies a constant correction based on the difference between the mean values of the model output (source) and the reference data (target) over a defined historical period (Navarro-Racines et al., 2020). By only accounting for changes in the mean of the quantity of interest, the Delta Adjustment method inherently assumes that the only relevant bias is related to the mean of the distribution. For this reason, the Delta Adjustment is typically used for variables that do not exhibit a strong climate-change-related trend, which is, in general, the case for wind speed.
The Delta Adjustment method was applied to ERA5 wind speeds using the GWA2-derived Delta change factors that correspond, in each grid cell, to the ratio between the mean GWA2 and ERA5 wind speeds over the selected reference period (2006-2018). This scaling ensures that ERA5 wind speed mirrors terrain effects captured by GWA2, while maintaining its spatial-temporal consistency. Specifically, since GWA2 only provides the mean wind speed at each grid cell, the bias adjustment does not modify the diurnal cycle of the original ERA5 data. The resulting bias-adjusted wind speed dataset was then extended backwards and forward to cover the whole ERA5 period, 1950–near present.
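A minimal sketch of this multiplicative Delta Adjustment, assuming NumPy arrays with dimensions (time, lat, lon) for ERA5 and (lat, lon) for the GWA2 mean. The function names and the sample values are hypothetical; in the procedure described above the factors are computed over 2006-2018 and then applied to the whole ERA5 record.

```python
import numpy as np

def delta_factors(mean_reference, mean_source):
    """Per-grid-cell ratio between reference (GWA2) and source (ERA5) means."""
    return mean_reference / mean_source

def apply_delta(ws, factors):
    """Scale every time step of ws (time, lat, lon) by the (lat, lon) factors."""
    return ws * factors

# Hypothetical data: 2 time steps on a 1 x 2 grid
era5 = np.array([[[4.0, 6.0]], [[6.0, 10.0]]])
gwa_mean = np.array([[6.0, 6.0]])
factors = delta_factors(gwa_mean, era5.mean(axis=0))
adjusted = apply_delta(era5, factors)
```

Because a single factor is applied to all time steps of a grid cell, the mean of the adjusted series matches the reference mean while the relative shape of the diurnal cycle is left unchanged.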
Bias-adjustment procedure
The bias-adjustment procedure applied to the ERA5 wind speeds at 10 m and 100 m above ground (WS10 and WS100) involves two steps, detailed below and summarised in Figure 2.4 and Figure 2.5.
...
that was fixed in PECD at the grid-point level by re-computing the 10:00 UTC value through linear interpolation between the 9:00 UTC and the 11:00 UTC values (equivalent to a temporal average) using the 'interp' function (method = 'linear') included in the xarray Python library. Considering the four geographical control boxes illustrated in Figure 2.3, Figure 2.6 shows the original and corrected mean diurnal cycles of the 10 m wind speed computed over the period 2009-2018.
...
Figure 2.6: Effect of the correction of the 10:00 UTC drop in WS10 in each of the four geographical control boxes shown in Figure 2.3. Blue line: original dataset, orange line: corrected dataset.
Regarding GWA2, the Global Wind Atlas wind speeds were selected at the same heights as the ERA5 wind speeds (namely, 10 m and 100 m), then averaged over the period 2006-2018, and finally aggregated from their original resolution (250 m) to the ERA5 horizontal resolution (0.25°) using the 'coarsen' function included in the xarray Python library. The resulting NetCDF files, containing the mean wind speed of GWA2 and ERA5 at both 10 m and 100 m, are described in Table 4.3 and are available for download on the CDS.
2) Bias Adjustment: Following the methodology described in Section 2.3.2 and summarised in Figure 2.5, the ERA5 WS10 and WS100 were corrected using GWA2 as the reference (also called target) dataset.
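The two pre-processing operations of the first step (the 10:00 UTC correction and the aggregation of GWA2 to the ERA5 grid) can be sketched as follows. The documentation reports using xarray's 'interp' and 'coarsen' functions; the NumPy functions below are stand-ins written for illustration only, and their names, the block size and the sample values are hypothetical.

```python
import numpy as np

def fix_10utc(ws, hours):
    """Replace each 10:00 UTC value by the mean of its 09:00 and 11:00 neighbours.

    ws has time as its first axis; hours holds the hour of day of each step.
    """
    ws = np.array(ws, dtype=float)                      # work on a copy
    idx = np.flatnonzero(np.asarray(hours) == 10)
    idx = idx[(idx > 0) & (idx < ws.shape[0] - 1)]      # both neighbours needed
    ws[idx] = 0.5 * (ws[idx - 1] + ws[idx + 1])
    return ws

def block_mean(field, factor):
    """Average non-overlapping factor x factor blocks of a 2-D field."""
    ny, nx = field.shape
    ny, nx = ny - ny % factor, nx - nx % factor         # trim incomplete blocks
    trimmed = field[:ny, :nx]
    return trimmed.reshape(ny // factor, factor, nx // factor, factor).mean(axis=(1, 3))

# Hypothetical hourly series with a spurious drop at 10:00 UTC
hours = np.arange(8, 13)
fixed = fix_10utc([5.0, 5.2, 3.0, 5.6, 5.8], hours)

# Hypothetical fine-resolution field aggregated by a factor of 2
coarse = block_mean(np.arange(16.0).reshape(4, 4), 2)
```

For evenly spaced hourly data, the average of the two neighbouring values is identical to linear interpolation at the midpoint.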
Validation of the ERA5 bias-adjusted near-surface wind speeds
Despite the limited availability of long-term and homogeneous wind observations to assess wind fields (Davidson and Millstein, 2022), over Europe the E-OBS dataset
...
offers land-only, station-based, daily means of near-surface (at 10 m above ground) wind speed, at the same horizontal resolution as ERA5 (0.25°) and over the period 1980-2022 (de Baar et al., 2023). The domain of the E-OBS dataset partly overlaps the PECD domain, providing a common area that stretches between the following coordinates: latitudes from 30°N to 72°N, and longitudes from 12°W to 40°E. Using the E-OBS observational gridded dataset as a reference for assessment, the ERA5 bias-adjusted near-surface wind speeds were evaluated.
Figure 2.7 illustrates the spatial distribution of the absolute bias in global means of near-surface (10 m) wind speed computed over the period 1995-2014. The absolute bias corresponds to the difference between the ERA5 reanalysis (before, ERA5_ORIG, and after, ERA5_BA, the bias adjustment) and the E-OBS dataset. The bias adjustment reduces the mean bias between E-OBS and ERA5 wind speeds, with the mean absolute bias moving from 0.55 m s-1 (mean relative bias: 23.76%) to 0.41 m s-1 (19.41%). The effect of bias adjustment is stronger over north-eastern Europe, where the bias reduces by nearly 1 m s-1, with a final bias lower than 0.5 m s-1 (first vs. second plot in Figure 2.7). By contrast, over mountainous regions (for example, the Alps or the Carpathian Mountains) and areas with complex terrain mixing steep slopes and coasts (for example, Norway or the Balkans), the bias increases once wind fields have been bias-adjusted (first vs. second plot in Figure 2.7). Over these regions, the bias in ERA5 bias-adjusted wind speeds shows a similar pattern to the difference between GWA2 and E-OBS, suggesting that over complex terrains, ERA5 inherits the micro-scale information from GWA2 that E-OBS does not provide (second vs. third plot in Figure 2.7).
...
Looking at the temporal correspondence between ERA5 and E-OBS, Figure 2.8 shows the time series of monthly mean wind speeds computed over the period 1995-2014 and over the European domain presented in Figure 2.7. The original ERA5 already captures the temporal variations in wind speed, including the succession of high and low values. The bias adjustment improves the temporal correlation, with the square of the Pearson's coefficient (R2) increasing from 0.67 to 0.72
...
, and brings ERA5 closer to E-OBS, with the mean bias decreasing from 0.51 to 0.41 m s-1 (Figure 2.8). Moving to the regional scale, Figure 2.9 shows the time series of monthly means and confirms that the bias adjustment has a stronger effect over north-eastern Europe. Over Germany, the mean bias between E-OBS and ERA5 shows a similar absolute value before and after the bias adjustment (0.17 m s-1 before and -0.19 m s-1 after), while over Finland the mean bias decreases from 0.86 m s-1 to 0.01 m s-1. Moreover, over Germany ERA5 shows a higher temporal correlation with E-OBS (R2 = 0.98) compared to Finland, where some years perform worse than others (R2 = 0.35). This is the case for the year 2010, which is highlighted in yellow in Figure 2.9. For this year, Figure 2.10 shows the tight temporal correspondence between E-OBS and ERA5 daily means over Germany, while some discrepancies appear over Finland.
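The two scores quoted above (mean bias and squared Pearson correlation) can be computed as in the following sketch. It assumes two aligned one-dimensional series of monthly means; the function names and the sample values are hypothetical and are not the validation data.

```python
import numpy as np

def mean_bias(model, obs):
    """Mean difference model minus observations (m s-1)."""
    return float(np.mean(np.asarray(model) - np.asarray(obs)))

def r_squared(model, obs):
    """Square of the Pearson correlation coefficient."""
    r = np.corrcoef(model, obs)[0, 1]
    return float(r ** 2)

# Hypothetical monthly means: the model is the observations plus a constant offset
obs = np.array([3.0, 4.0, 5.0, 4.0])
model = obs + 0.5
bias = mean_bias(model, obs)
r2 = r_squared(model, obs)
```

A constant offset changes the bias but not the correlation, which is why both scores are reported side by side.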
...
Figure 2.8: Time series of monthly means of near-surface wind speeds (units: m s-1) computed over the period 1995-2014 and over the European domain illustrated in Figure 2.7. The five solid lines show: (a) the E-OBS dataset (EOBS, green line), (b) the original ERA5 reanalysis (ERA5_ORIG, red), (c) the bias-adjusted ERA5 reanalysis (ERA5_BA, blue), (d) the difference between ERA5_ORIG and EOBS (grey), and (e) the difference between ERA5_BA and EOBS (pink).
Figure 2.9: As Figure 2.8 for two regions located in: (a) Germany (latitudes [50°N-53°N] and longitudes [8°E-12°E], purple box on Figure 2.7; left plot) and (b) Finland (latitudes [64°N-68°N] and longitudes [26°E-30°E]; blue box on Figure 2.7; right plot). The year 2010 is highlighted with a yellow stripe and has been chosen to illustrate the time series of daily means.
...
Figure 2.10: Time series of daily means of near-surface wind speeds (units: m s-1) for the year 2010 computed over two regions located in: (a) Germany (latitudes [50°N-53°N] and longitudes [8°E-12°E], purple box on Figure 2.7; top plot) and (b) Finland (latitudes [64°N-68°N] and longitudes [26°E-30°E]; blue box on Figure 2.7; bottom plot). The five solid lines show: (a) the E-OBS dataset (EOBS, green line), (b) the original ERA5 reanalysis (ERA5_ORIG, red), (c) the bias-adjusted ERA5 reanalysis (ERA5_BA, blue), (d) the difference between ERA5_ORIG and EOBS (grey), and (e) the difference between ERA5_BA and EOBS (pink).
Population-weighted Temperature
Population-weighted temperature (TAW) is an important climate indicator included in the PECDv4.2 database. It is particularly relevant for energy conversion and demand modelling, as it provides a temperature metric that reflects the conditions most likely experienced by the population. Rather than averaging temperature uniformly across a region, TAW gives greater weight to areas with higher population density, offering a more realistic estimate of population exposure to temperature variations.
In the PECD framework, TAW is calculated exclusively at the SZON (onshore bidding zones) aggregation level (see Table 2.1 for a full list of spatial aggregation levels and their acronyms). This approach allows for a consistent integration of TAW into energy-related applications, such as forecasting demand peaks during heatwaves or cold spells, assessing vulnerability, or planning adaptive infrastructure and policy interventions.
Population mask
To calculate TAW, a high-resolution population mask is required. For PECDv4.2, gridded population data at 0.25° spatial resolution were sourced from the NASA Socioeconomic Data and Applications Center (SEDAC)
...
The population raster was clipped to the PECD domain and converted to NetCDF format (Figure 2.11), using QGIS-GRASS GIS (Geographic Information System, Open-Source Geospatial Foundation Project
...
). Sea and ocean areas were assigned missing values in accordance with the ESRI ASCII specification. The resulting NetCDF population mask is used throughout the modelling chain and is available for download via the CDS (please refer to Table 4.2 and Table 4.3 for more details).
...
Figure 2.11: Population distribution across the PECD domain based on NASA SEDAC data (2020), mapped at 0.25° resolution. Values represent the number of inhabitants per grid cell.
Computation of Population-weighted temperature
TAW [°C] is computed by applying the population mask to the gridded TA, both at 0.25° resolution. The calculation is carried out independently for each onshore bidding zone (referring to the aggregation level SZON, as detailed in Table 2.1), using the following equation:
...
is the population in the i-th grid cell of zone z, and n is the number of grid cells in the zone. This results in a weighted average temperature for each zone, reflecting human exposure rather than geographic extent alone.
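A minimal sketch of a population-weighted average for one zone, assuming the usual weighted-mean form (sum of population times temperature divided by total population). The function name, the boolean `zone_mask` and the sample values are hypothetical; grid cells with missing population (sea and ocean) receive zero weight.

```python
import numpy as np

def population_weighted_temperature(ta_kelvin, population, zone_mask):
    """Population-weighted temperature (degC) per time step for one zone.

    ta_kelvin: (time, lat, lon); population and zone_mask: (lat, lon).
    """
    weights = np.where(zone_mask & np.isfinite(population), population, 0.0)
    ta_celsius = ta_kelvin - 273.15
    return (ta_celsius * weights).sum(axis=(1, 2)) / weights.sum()

# Hypothetical zone with two populated cells and one sea cell (NaN population)
ta = np.array([[[283.15, 293.15, 300.0]]])
population = np.array([[1000.0, 3000.0, np.nan]])
zone_mask = np.array([[True, True, True]])
taw = population_weighted_temperature(ta, population, zone_mask)
```

In this example the more populated cell dominates, giving 17.5 °C instead of the unweighted land mean of 15 °C.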
Figure 2.12 shows the difference between the mean TAW and the mean TA over the period 1991-2020 across the SZON regions.
...
Figure 2.12: Difference between the mean TAW and TA over the climatology 1991-2020 for SZON regions.
Spatial aggregation
Spatial aggregation is the procedure used to compute regionally averaged indicators from gridded climate and energy data. It enables the transformation of high-resolution outputs into meaningful statistics for specific administrative or market-related regions, such as countries, provinces, or bidding zones. This process is systematically applied to all gridded indicators in PECD to produce corresponding aggregated versions.
Note |
---|
Please note that on CDS, the sub-region selection is only available for gridded datasets. When downloading aggregated time series from CDS, the sub-regional extraction is not supported. |
Required spatial aggregation level for PECDv4.2
The PECD database supports multiple levels of spatial aggregation, depending on the needs of climate and energy modelling. Table 2.1 below summarises these levels, along with their codes and source definitions.
...
and pan-European regions, official shapefiles were provided by ENTSO-E. Figure 2.13 shows some of the shapefiles used to create the masks.
...
Code | Description of the aggregation level | Source |
---|---|---|
ORIG | Not aggregated | Gridded data |
BIAS | Not aggregated | Gridded data bias-adjusted (see Section 2.3) |
NUT0 | Country | NUTS0+ADMIN0 |
NUT2 | Sub Country/Provinces | NUTS2+ADMIN1 |
SZON | Onshore Bidding Zones | Shapefile provided by ENTSO-E* |
SZOF | Offshore Bidding Zones | Shapefile provided by ENTSO-E* |
PEON | Pan-European Onshore Zones | Shapefile provided by ENTSO-E* |
PEOF** | Pan-European Offshore Zones | Shapefile provided by ENTSO-E* |
CITY | Not aggregated - List of selected cities (only for TA) | List provided by ENTSO-E |
*These shapefiles are not publicly available, but the corresponding NetCDF masks are provided in the CDS under the widget "Weights and masks". Please see Table 4.2 and Table 4.3 for more details.
**In PECDv4.2, the PEOF zones were updated from previous versions by considering a new version of the shapefile (2024/09/19).
...
Figure 2.13: Example of original polygon geometries used to derive float masks for spatial aggregation.
Generation of Region Masks for Spatial Aggregation
To perform spatial aggregation, floating-point NetCDF masks were generated from the shapefiles listed in Table 2.1. One mask was created for each aggregation level, resulting in six region masks: NUT0, NUT2, PEON, PEOF, SZON, and SZOF.
...
These masks allow for accurate area-weighted aggregation, especially near borders and coastlines. An example for Italy (NUT0 level) is shown in Figure 2.14.
All regional masks are available for download from the CDS under the widget “Weights and masks”. Additional details about filenames and structure can be found in Table 4.2 and Table 4.3.
...
Figure 2.14: Example of a float mask, for the Italian NUT0 administrative region, showing the fractions of land around the border and coastlines.
Spatial Aggregation Procedure
The spatial aggregation of climate and energy indicators is implemented via a Python-based tool, following this workflow:
Input loading:
Load the NetCDF file containing the variable(s) to be aggregated.
Load the corresponding region mask (NetCDF format).
Grid iteration:
Result formatting:
Store the aggregated values for each region in a column of a Pandas DataFrame.
Add the associated time axis from the NetCDF file to the DataFrame.
Export the final DataFrame as a CSV file.
Metadata is attached to the CSV file according to the conventions outlined in the Climate and energy related variables from the Pan-European Climate Database derived from reanalysis and climate projections v4.2: Product user guide Appendix.
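The workflow above can be sketched as follows, assuming the gridded variable and the float masks have already been read into NumPy arrays. The function name `aggregate`, the region labels and the sample values are hypothetical; the weights are the mask fractions only, so differences in grid-cell area with latitude and the CSV metadata are not handled here.

```python
import numpy as np
import pandas as pd

def aggregate(field, masks, times):
    """Mask-weighted regional means.

    field: (time, lat, lon); masks: {region: (lat, lon) fractions in [0, 1]}.
    Returns a DataFrame with one column per region and time as the index.
    """
    columns = {}
    for region, weights in masks.items():
        columns[region] = (field * weights).sum(axis=(1, 2)) / weights.sum()
    return pd.DataFrame(columns, index=pd.Index(times, name="time"))

# Hypothetical data: 2 time steps on a 1 x 2 grid and two regions
field = np.array([[[1.0, 3.0]], [[2.0, 4.0]]])
masks = {"R1": np.array([[1.0, 0.5]]), "R2": np.array([[0.0, 1.0]])}
times = pd.to_datetime(["2020-01-01 00:00", "2020-01-01 01:00"])
df = aggregate(field, masks, times)
csv_text = df.to_csv()   # pass a file path instead to write the CSV to disk
```

A grid cell that is only half inside a region (fraction 0.5) contributes half the weight of a fully covered cell, which is what makes float masks useful along borders and coastlines.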
Climate indicators
This section describes the climate indicators provided in PECDv4.2 for the historical stream. These indicators are derived from the ERA5 reanalysis and are used as inputs for energy modelling and climate analysis across the Pan-European domain.
...
The indicators are available as both gridded products (NetCDF format, spatial resolution of 0.25° × 0.25°) and spatially aggregated time series (CSV format), depending on the level of aggregation.
Table 2.2 summarises the available climate indicators, including their temporal coverage, data source, domain and spatial resolution, temporal resolution, aggregation levels (see Table 2.1), and units. Notably, PECDv4.2 introduces a new reference level, CITY, which provides temperature time series for a predefined list of cities (see Section 2.5).
Table 2.2: Climate indicators provided in the PECDv4.2 for the historical stream.
Gridded data (ORIG and BIAS levels) are provided in NetCDF format. All other aggregation levels are delivered in CSV format. Changes that were implemented in PECDv4.2 are highlighted in bold (extended time period and the CITY level).
Variable | Period | Source | Domain / Spatial Resolution | Temporal Resolution | Spatial Aggregation | Units |
---|---|---|---|---|---|---|
2m temperature (TA) | 1950 - near present | ERA5 reanalysis | PECD/0.25° x 0.25° | hourly | ORIG, NUT0, NUT2, SZON, SZOF, PEON, PEOF, CITY | K (gridded), °C (aggregated) |
Population-weighted temperature (TAW) | 1950 - near present | ERA5 reanalysis | PECD/0.25° x 0.25° | hourly | SZON | °C |
Total precipitation (TP) | 1950 - near present | ERA5 reanalysis | PECD/0.25° x 0.25° | hourly | ORIG, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m |
Surface solar radiation downwards (GHI) | 1950 - near present | ERA5 reanalysis | PECD/0.25° x 0.25° | hourly | ORIG, NUT0, NUT2, SZON, SZOF, PEON, PEOF | W m-2 |
10m wind speed (WS10) | 1950 - near present | ERA5 reanalysis | PECD/0.25° x 0.25° | hourly | ORIG, BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m s-1 |
100m wind speed (WS100) | 1950 - near present | ERA5 reanalysis | PECD/0.25° x 0.25° | hourly | ORIG, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m s-1 |
Energy data
In collaboration with ENTSO-E, extensive efforts have been made in PECDv4.2 to collect and integrate the widest possible range of energy-related data, which serve both for validating energy models and for training the hydro statistical model. The following sources have been used:
...
5) TSO-provided inflow data: Several Transmission System Operators (TSOs) have provided confidential data under non-disclosure agreements (NDAs). These data include high-resolution generation and storage series, and are not detailed in this documentation.
Exclusion areas
To ensure that wind and solar energy potential is realistically assessed, several exclusion layers have been applied in PECDv4.2. These masks identify areas unsuitable for energy production due to geographical, environmental, or legal constraints.
...
All exclusion layers have been processed into NetCDF format and are available for download in the Climate Data Store (CDS) under the widget "Weights and masks". These include the "Wind power regions mask" and "Solar PV mask" used in PECDv4.2. For details on file naming conventions and characteristics, refer to Table 4.2 and Table 4.3.
Table 2.3 below provides a full overview of each exclusion criterion, including data sources and variable identifiers.
...
Criteria | Description | Source | Variable Name |
---|---|---|---|
Protected areas | Identifies legally protected regions, such as national parks or nature reserves. Values are binary: 1 indicates a restricted pixel. | World database on protected areas from the United Nations Environment Programme | prot_a |
Polar areas | Identifies polar and subpolar regions based on global land cover classification. Values are binary: 1 indicates a restricted pixel. | Land Cover Classification System from the United Nations Food and Agriculture Organization | polar_a |
Urban areas | Flags areas with urban coverage ≥ 45%. Values are binary: 1 indicates a restricted grid cell. | Land Cover Classification System from the United Nations Food and Agriculture Organization | urban_a |
Water and continental waters area | Classifies water bodies using a three-value system: 0 = land, 1 = ocean, 2 = inland waters. Used to exclude non-land areas. | ERA5 Land-Sea Mask (ECMWF) | watr_a |
High slope area | Identifies steep terrain where slope ≥ 60%. Values are binary: 1 indicates a restricted grid cell. | ETOPO1 Global Relief Model from National Oceanic and Atmospheric Administration (NOAA) | halo_a |
High elevation areas | Identifies regions of high altitude. Values are binary: 1 indicates a restricted pixel. | ETOPO1 Global Relief Model from National Oceanic and Atmospheric Administration (NOAA) | hele_a |
Distance to shore areas | Identifies areas beyond a given distance from the coastline (used primarily for offshore applications). Values are binary: 1 indicates a restricted pixel. | ERA5 Land-Sea Mask (ECMWF) | dist_s |
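
As an illustration of how such layers might be combined into a single exclusion mask, the sketch below flags a grid cell as excluded when any selected binary layer equals 1 and, optionally, when the cell is not land according to `watr_a`. The variable names come from Table 2.3, but the combination rule, the function name `combine_exclusions` and the sample arrays are assumptions made here, not the documented PECD procedure.

```python
import numpy as np

def combine_exclusions(layers, binary_names, land_only=True):
    """Return a boolean array that is True where a grid cell is excluded."""
    first = next(iter(layers.values()))
    excluded = np.zeros_like(first, dtype=bool)
    for name in binary_names:
        excluded |= layers[name] == 1          # 1 marks a restricted cell
    if land_only:
        excluded |= layers["watr_a"] != 0      # 1 = ocean, 2 = inland waters
    return excluded

# Hypothetical 1 x 3 grid: free land cell, protected cell, ocean cell
layers = {
    "prot_a": np.array([[0, 1, 0]]),
    "urban_a": np.array([[0, 0, 0]]),
    "watr_a": np.array([[0, 0, 1]]),
}
excluded = combine_exclusions(layers, ["prot_a", "urban_a"])
```

For an offshore application the `land_only` switch would be turned off and a different set of layers (for example `dist_s`) selected.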
Energy Conversion Models
This section outlines the methodologies and implementation details of the energy conversion models used in PECDv4.2, which replaces the previous PECDv4.1 version. These models convert meteorological inputs into power generation time series for four technologies: wind power, solar photovoltaic (SPV), concentrated solar power (CSP), and hydropower. The first three are physical models, while the hydropower module is based on a statistical machine learning approach.
For each energy model, we describe the input data sources, the modelling framework, and the calibration and validation methods used.
Wind Power Conversion Model
The wind power conversion model simulates generation at the wind power plant (WPP) level and aggregates results to the regional level. The conversion process differs between existing and future wind power installations, reflecting the evolution of wind technologies over time. Existing installations are modelled based on location, capacity, and technology data from WindPowerNet
...
In PECDv4.2, the model has undergone several updates compared to PECDv4.1, including the use of higher-resolution wind climatology data, a more flexible turbine modelling framework, and improved treatment of unavailability. These improvements enhance the realism and accuracy of the simulated wind generation time series.
Climate Data Handling
Wind speed data is sourced from the ERA5 reanalysis and CMIP6 climate model outputs, provided at 0.25° x 0.25° horizontal resolution and, when available, at two vertical levels: 10 m and 100 m above ground.
...
Interpolation is carried out at each time step and for each wind power plant (WPP) individually.
Wind bias adjustment in PECDv4.2
A key methodological improvement in PECDv4.2 is the use of the Global Wind Atlas version 2 (GWA2) for wind speed bias adjustment, replacing the COSMO-REA6 dataset used in PECDv4.1.
...
This two-step process—interpolating ERA5/projection data and adjusting with GWA2 climatology—produces more realistic site-specific wind speed time series and improves alignment with observed power generation, in line with the findings of Murcia et al. (2022).
Conversion to Wind Power Generation
Existing installations
A power curve is estimated for each wind power plant (WPP) using a surrogate model, as detailed in Simutis et al. (2024). The model first constructs a turbine-level power curve from plant-level characteristics and then accounts for intra-farm wake effects to generate a plant-level curve for use in simulations.
This method enables the derivation of a specific power curve for each WPP. Comparisons with turbine-level data from the WindPowerNet database show good agreement, although the generic model excludes the storm shutdown regime (Figure 2.15), which is handled separately.
Figure 2.16 illustrates the surrogate modelling process and its supported parameter space, which covers current European installations and allows for a wide range of future configurations. Air density is fixed at 1.225 kg/m³. Turbulence intensity is set at 10% for onshore and 5% for offshore simulations.
...
Figure 2.16: Overview of the methodology for estimating a plant-level power curve for each WPP, and finally simulating the power generation time series (here for the historical period). Figure is taken from Simutis et al., 2024.
Future Installations
For future onshore wind installations, turbines with specific powers ranging from 198 to 335 W/m² are used, as indicated in Swisher et al. (2022). For offshore wind, turbines with specific powers of 316 and 370 W/m² are simulated. An overview of the simulated future wind technologies is given in Table 2.4 and Table 2.5, which also list the corresponding options found in the widget "Technological specification" in the download form. Each wind technology option is labelled with a number representing a specific combination of hub height (HH) and specific power (SP). For example, "21 (SP316 HH155)" refers to offshore wind power with a specific power of 316 W/m² and a hub height of 155 m. These labels allow users to easily select the desired wind turbine specification from the dataset.
...
The power curve model, as presented in the previous section, is made available in the GitLab repository mentioned above. This allows users to generate plant-level power curves for any combination of specific power, hub height, and plant size, provided they fall within the supported range shown in Figure 2.16.
Table 2.4: Future technology of onshore wind turbines.
...
Specific Power [W/m²] | Rotor Diameter [m] | Hub Height [m] | Rated Power [MW] | Corresponding codes in the download form on CDS |
---|---|---|---|---|
316 | 269 | 155 | 18 | 21 (SP316 HH155) |
370 | 249 | 155 | 18 | 22 (SP370 HH155) |
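As a quick consistency check on the table above, specific power is the rated power divided by the rotor swept area. The sketch below approximately reproduces the tabulated specific powers from the listed rotor diameters and rated power; the function name is illustrative and not part of the PECD tooling.

```python
import math

def specific_power_w_per_m2(rated_power_mw: float, rotor_diameter_m: float) -> float:
    """Rated power divided by rotor swept area, in W/m2."""
    swept_area_m2 = math.pi * (rotor_diameter_m / 2.0) ** 2
    return rated_power_mw * 1e6 / swept_area_m2

# Rows of the table above: 18 MW rated power with 269 m and 249 m rotors
sp_21 = specific_power_w_per_m2(18, 269)  # option 21, listed as 316 W/m2
sp_22 = specific_power_w_per_m2(18, 249)  # option 22, listed as 370 W/m2
```

The computed values agree with the table to within about 1 W/m², which is consistent with the tabulated figures being rounded.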
Storm Shutdown
Storm shutdown behaviour is modelled as described in Murcia et al. (2021). A direct (non-controlled) shutdown is applied to all existing wind power plants (WPPs), with shutdown wind speeds taken from the WindPowerNet WPP installation database. For future wind technologies, a 25 m/s cut-off is assumed for onshore installations, and the HWS (High Wind Speed) Deep type from Murcia et al. (2021) is used for offshore installations (as in the PECD 2021 update). The shutdown procedure is modelled with hysteresis: once a turbine has shut down, it restarts only after the wind speed has dropped below a lower restart threshold (see Figure 2.17). The storm shutdown is a dynamic model that captures three aspects:
- Individual turbines shut down and restart as they experience wind speed fluctuations that can exceed 25 m/s (10-minute mean cut-off wind speed), depending on how long the limits are exceeded, as illustrated in Figure 2.17.
- A plant does not shut down in the same way as an individual turbine: its turbines do not all shut down simultaneously, because each turbine experiences slightly different wind speeds at a given time.
- Restart occurs only at a somewhat lower wind speed than shutdown, which prevents cycling between shutdown and restart when the wind speed hovers around the shutdown wind speed (e.g., 25 m/s). More details are provided in Murcia et al. (2021).
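The hysteresis behaviour described above can be illustrated for a single turbine with a minimal sketch. It is not the plant-level model of Murcia et al. (2021): it ignores the duration of exceedance and the spread of wind speeds across turbines, and the 22 m/s restart threshold is a hypothetical value chosen only for illustration.

```python
def storm_shutdown_states(wind_speeds, cut_out=25.0, restart=22.0):
    """Return a list of booleans: True while the turbine is running.

    cut_out: shutdown wind speed in m/s (25 m/s as in the text).
    restart: hypothetical restart threshold in m/s (assumed value).
    """
    running = True
    states = []
    for ws in wind_speeds:
        if running and ws > cut_out:
            running = False   # storm shutdown
        elif not running and ws < restart:
            running = True    # restart only once the wind has dropped enough
        states.append(running)
    return states

speeds = [20.0, 24.0, 26.0, 24.0, 23.0, 21.0, 24.0]
states = storm_shutdown_states(speeds)
```

In this example the turbine stays off at 24 and 23 m/s after the shutdown, even though these speeds are below the cut-off, and resumes only once the wind falls below the restart threshold.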
...
Figure 2.17: Single-turbine storm shutdown for two storm shutdown technologies. The different shutdown limits (up to 1 s) have been considered in detailed simulations, but a simplified plant-level behaviour (Murcia et al., 2021) is used for the simulations in this service. Figure taken from (Murcia et al., 2021).
Simulated locations and wind technologies
The simulated locations and wind technologies depend on the type of run. An overview of the runs is given in Table 2.6.
Table 2.6: Wind run types.
Run type | ERA5 simulated years | Climate projection simulated years | WPP locations | WPP technology | Losses |
---|---|---|---|---|---|
Validation | 2015 - 2022 | Not simulated | Changed every year to match changing WPP installations (based on WindPowerNet data) | Existing WPP parameters based on WindPowerNet data (changed every year), applied in the generic power curve model | Wakes as part of the generic power curve. Other losses (incl. unavailability) are applied as a simple multiplication for onshore, but as a stochastic process for offshore wind. |
Existing | 1950 - near present | 2015 - 2100 | All years with 2020 WPP locations (based on WindPowerNet data) | Existing WPP parameters based on WindPowerNet data (always 2020 fleet), applied in the generic power curve model | Wakes as part of the generic power curve. Other losses (incl. unavailability) are applied as a simple multiplication for onshore, but as a stochastic process for offshore wind. |
Future wind technologies | 1950 - near present | 2015 - 2100 | The best 10-50 % locations (ReGrB) of the unmasked points within each PECD region (in terms of mean wind speed in the bias-adjusted ERA5 data, based on ERA5 grid). A separate run considering only the best 10 % locations (ReGrA) is also provided. | Onshore wind: 3 hub heights and 3 turbine types, so in total 9 wind technologies. A plant of 50 MW with ten 5 MW turbines is modelled for each technology. Offshore wind: 1 hub height and 2 turbine types, so in total 2 wind technologies. A plant of 500 MW with 28 turbines of 18 MW each is modelled for each technology. | Wakes as part of the generic power curve. Other losses (incl. unavailability) are applied as a simple multiplication for onshore, but as a stochastic process for offshore wind. |
Some notes on Table 2.6:
- All wake modelling considers only intra-farm wake effects (no interaction between separate wind plants).
- Literature suggests a range of 5 % to 10 % for the other losses (Mortensen, 2018). The existing installations cover historical installations over tens of years with older technology, whereas the future installations are new installations (no wear-and-tear considered) with modern technology; it was thus considered fair to place them at the opposite sides of the loss range (5 % for new technologies and on average 10 % for existing installations).
- A mask is used to find the potential points for the Future wind technologies runs. This mask ("wind power regions mask") is available for download in the CDS. Please refer to Table 4.2 and Table 4.3 for more info.
- Locations of existing wind power plants are not considered in the assessment of the 10-50 % best locations for each region. This is done because the decommissioning of old turbines is expected to free up more space for new installations in the future.
- The assumed locations of wind power plant installations within a region significantly impact the expected capacity factor at the aggregate level (Swisher et al., 2022). PECDv4.1 accounted for a single ‘resource grade’ (ReGr), covering the 10–50% best locations, so no selection option appeared on the CDS download page. In contrast, PECDv4.2 offers a choice between two separate simulations covering the 10% best locations (ReGrA) and the 10–50% best locations (ReGrB). Later versions of PECD could, in consultation with ENTSO-E, also provide simulations for the 50% worst locations or, in principle, any other split of the distribution between 0 and 100%. However, this would multiply the number of future wind technology time series.
In addition to the plant-level power curves, information on the existing wind power installations is required to simulate generation from the existing fleet. Data from WindPowerNet are used, with the missing technical parameters (turbine type and hub height) estimated based on the machine learning approach from Koivisto et al. (2021). Wind power plants without location or installed capacity information are removed. An overview of the installed capacities (2020 fleet) and key WPP technical parameters for onshore and offshore installations is shown in Figures 2.18 to 2.21.
...
For future wind installations, the starting point is the ERA5 grid points. Based on the exclusion layers presented in Section 2.8 (specifically, the "wind power regions mask"; please refer to Table 4.2 and Table 4.3 for more info), masking is then applied to these points to select potential future WPP locations. The potential points are shown in Figure 2.22. After selecting the 10-50% best points (referred to as ReGrB, based on 100 m mean wind speeds), the resulting final future installation simulation points can be seen for onshore wind in Figure 2.23 and for offshore wind in Figure 2.24. The selection of 10-50% best points is the average ‘resource grade’ selection following from the work done in Swisher et al. (2022), whereas the best 10 % of locations (ReGrA) represents the best wind sites.
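The resource grade selection can be sketched as a simple ranking of the unmasked points by mean wind speed. This is an illustration only: it assumes that "10-50% best" means the points ranked between the top 10% and the top 50%, and the rounding rule and function name are hypothetical.

```python
def select_resource_grades(mean_wind_speeds):
    """Split location ids into ReGrA (best 10 %) and ReGrB (10-50 % best).

    mean_wind_speeds: dict mapping a location id to its mean wind speed (m/s).
    """
    ranked = sorted(mean_wind_speeds, key=mean_wind_speeds.get, reverse=True)
    n = len(ranked)
    n_a = round(0.10 * n)  # size of the best-10 % group (assumed rounding)
    n_b = round(0.50 * n)  # rank position closing the 10-50 % band
    return ranked[:n_a], ranked[n_a:n_b]

speeds = {f"p{i}": 5.0 + i for i in range(10)}  # p9 is the windiest point
regr_a, regr_b = select_resource_grades(speeds)
```

With ten candidate points, ReGrA contains the single windiest point and ReGrB the next four.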
...
Figure 2.24: Offshore locations for the future technology runs for the average resource grade (10-50 % best locations, ReGrB) (left) and the best 10 % of locations (ReGrA) (right). Colouring shows the mean wind speed (m/s) of each location. The locations of existing wind power plants and the repowering of existing plants are not considered.
Aggregation to the regional level
After simulating the hourly generation for each WPP, the results are aggregated at the regional level. For existing installations, regional aggregation is weighted by the installed capacity of each WPP. For future technologies, the same weight is used for each location. From a processing point of view, temporary NetCDF files are used, but the final regional results are saved as CSV files. In addition to power generation, similar weighted regional wind speed averages are saved.
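A minimal sketch of this aggregation step, assuming per-plant hourly capacity factors are already available as plain lists (the function and variable names are illustrative):

```python
def regional_capacity_factor(cf_series, weights):
    """Weighted average of per-plant hourly capacity factors.

    cf_series: list of per-plant lists, all of equal length (one value per hour).
    weights: one weight per plant (installed capacity for existing plants,
             equal weights for future technology locations).
    """
    total = sum(weights)
    n_hours = len(cf_series[0])
    return [
        sum(w * cf[t] for cf, w in zip(cf_series, weights)) / total
        for t in range(n_hours)
    ]

plants = [[0.2, 0.4], [0.6, 0.8]]  # two plants, two hours
existing = regional_capacity_factor(plants, [30.0, 10.0])  # capacity weights in MW
future = regional_capacity_factor(plants, [1.0, 1.0])      # equal weights
```

The same function covers both cases: capacity weighting pulls the regional value towards the larger plant, while equal weights give a plain mean.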
Input data modifications based on measured data and TSO feedback
There is always uncertainty related to the technical wind power plant input data. While the data from www.thewindpower.net is extensive, most countries have significant missing data (most importantly hub height and turbine specific power), so it was considered reasonable to modify the inputs to some extent. For example, hub height data were missing for around 60 % of the wind power plants in Portugal, 80 % in Spain and 70 % in Italy. For specific power, the respective missing data shares were 10 %, 20 %, and 20 %. The modifications are shown below. They were presented to and agreed with ENTSO-E and the respective TSOs. The modifications lead to a better fit to measured data and are applied only to the Validation and Existing runs.
...
Wind installation density (MW/km²) data were not found for the different European countries. Installation density is expected to vary based on land availability (Murcia et al., 2022), with countries with many existing wind installations and relatively high population density (e.g., Germany) showing a higher density of installations due to land availability constraints. The density is assumed to vary from 15 MW/km² in Germany (by far the most existing wind installations per km² in Europe), to 10 MW/km² in countries with significant wind installations and generally high population density (Belgium, France, Ireland, Luxembourg, the Netherlands and the UK), and to 4 or 7 MW/km² for the other countries. The 4 MW/km² installation density is assumed for countries with more limited wind installations or lower population density and thus significant available land (e.g., Bulgaria, Estonia and Finland), with 7 MW/km² assumed for the rest (e.g., Austria, Denmark and Italy). These installation densities are assumed for both existing and future installations.
Photovoltaic Solar Power Conversion Model
To estimate photovoltaic (PV) capacity factors at the regional level, a flexible yet robust modelling workflow has been implemented (Saint-Drenan et al., 2018). The method starts by modelling PV output at the individual system level, taking into account the specific location and the orientation and tilt of the module’s plane-of-array (POA). This system-level output is then scaled up to regional values by averaging over a representative set of possible POA configurations, weighted according to metadata from real-world PV installations.
Temporal downscaling
The PV modelling workflow applies a two-step temporal downscaling process to solar global horizontal irradiance (GHI) and 2-metre temperature (TA), ensuring compatibility with the high temporal resolution required for accurate PV capacity factor calculations.
...
A second downscaling step is applied to both historical and projected datasets: the hourly GHI and TA values are further interpolated to a 15-minute resolution. PV capacity factors are calculated at this finer time step and then averaged back to hourly values. This process allows for better representation of diurnal solar variability and reduces artefacts that might result from simpler interpolation methods.
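The reason for computing capacity factors at 15-minute resolution before averaging can be shown with a toy example. The interpolation scheme used in PECDv4.2 is not specified in this section, so plain linear interpolation is used here as a stand-in, and the capacity factor function is a deliberately simple non-linear placeholder rather than the PECD PV model.

```python
def interp_to_15min(hourly):
    """Linearly interpolate hourly values to 15-minute steps (4 per hour)."""
    out = []
    for a, b in zip(hourly[:-1], hourly[1:]):
        out.extend(a + (b - a) * k / 4.0 for k in range(4))
    return out

def hourly_mean(values_15min):
    """Average consecutive groups of four 15-minute values."""
    return [sum(values_15min[i:i + 4]) / 4.0
            for i in range(0, len(values_15min), 4)]

def toy_capacity_factor(ghi):
    """Placeholder non-linear response, NOT the PECD PV model."""
    return min(ghi / 1000.0, 0.8)

ghi_hourly = [0.0, 400.0, 1200.0]     # W/m2 at three consecutive hour marks
ghi_15 = interp_to_15min(ghi_hourly)  # 8 values covering 2 hours
cf_hourly = hourly_mean([toy_capacity_factor(g) for g in ghi_15])
```

Because the response is non-linear (here, it saturates), averaging the 15-minute capacity factors differs from applying the model once to an hourly mean irradiance.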
Inferring plane-of-array irradiance: decomposition and transposition
To estimate the irradiance on a photovoltaic module’s POA, it is necessary to convert GHI, which is measured on a horizontal surface, into irradiance on a tilted surface. This requires separating GHI into its two components—direct and diffuse solar radiation—using a decomposition model.
...
This decomposition and transposition process ensures a more realistic estimation of solar radiation on tilted surfaces, which is essential for modelling PV system performance.
PV modelling: optical losses, conversion efficiency, temperature and inverter losses
Before modelling the PV energy conversion process, optical losses due to reflection on the module’s surface must be accounted for. These are modelled using the Martin-Ruiz model (Martin & Ruiz, 2001, 2013), which relates surface reflectance to the angle of incidence and the properties of the PV module’s glazing.
...
This model does not include the effects of module degradation, inverter clipping, or variability in module characteristics (e.g., efficiency, temperature coefficient). However, these simplifications are acceptable given the lack of detailed data on the PV systems installed across the PECD region. Moreover, uncertainties introduced at the plant level are significantly mitigated by the spatial averaging applied during regional aggregation.
Upscaling to Regional PV Aggregation
Modelling PV Typologies
Photovoltaics is a flexible technology that can be installed in many different contexts, which then define the characteristics of a PV installation. Compared to PECDv4.1, which did not consider different PV technologies, PECDv4.2 comprises four different implementations, here designated as typologies. Sharing the same physical modelling framework, each typology is characterised by specific module tilt, azimuth, and thermal losses.
Residential rooftops
By exploring various PV databases, it was possible to infer a first correlation between the latitude of a given location and the mean tilt of installations smaller than 9 kW (here assumed as a proxy for residential PV), which was parametrised as a set of linear functions (Figure 2.25). As no data could be collected for latitudes below 42° and above 54°, the tilt is assumed to remain constant outside this range, at the minimum and maximum values defined for the covered latitudes.
...
Because rooftop PV is commonly mounted directly on the roof and shares its tilt, which minimises aesthetic impact and infrastructure costs, the modules lack convective cooling on their back side. This is parametrised as a Ross coefficient of 0.34 when calculating PV module temperature (Skoplaki et al., 2009), which for a sunny, hot day (ambient temperature of 30 °C and incident solar radiation of 1000 W/m²) leads to 1.8% higher thermal losses than in PECDv4.1.
Industrial Rooftops
Unlike residential rooftops, industrial rooftops are often flat, providing greater flexibility in engineering design. However, their typically high electricity demand favours installations with very low tilt angles, since it reduces the shading between rows of modules and enables a higher energy density (kWh/m2) at the expense of a lower capacity factor (kWh/kWp).
An exploratory analysis of collected empirical data, along with discussions with PV companies, suggests that these installations are typically mounted with a 10° tilt and exhibit a more varied azimuth than residential rooftops. Consequently, this typology assumes that 50% of installations are oriented southward, while 25% face east and 25% west. These installations are assumed to have lower convective cooling and, thus, the same thermal modelling as for residential rooftop installations.
Utility-scale Fixed
Module tilt and orientation data for utility-scale PV were inferred from the metadata of hundreds of installations located in France and Germany, for which we can access quality data. Although the PECD spatial coverage is considerably larger, this kind of information is deemed highly sensitive by the industry, making it uncommon to find as open data.
To work around the limited geographical coverage of the collected data, the tilt was normalised by the theoretical optimum tilt (which maximises the incident irradiation), so that the resulting distribution generalises better in space. To calculate the optimum tilt for the PECD region, PV generation is estimated for different tilts, always south-oriented, using ERA5 data from 2015 to 2020; the tilt resulting in the highest overall irradiation is selected (Figure 2.26).
...
Figure 2.26: Optimal tilt angle, which maximises annual yield, over the PECD domain calculated considering a 5-year period of ERA5 data.
Based on Figure 2.27, which compares the optimal tilt estimated from ERA5 data with the tilt from actual installations, the utility-scale fixed PV in PECDv4.2 assumes, in each grid cell, a tilt equal to 75% of its theoretical optimum. This discrepancy is most likely due to a common engineering practice: by reducing the shadowing between rows of modules, a higher energy density – i.e., a higher kWh generated per unit of area – and lower land requirements can be achieved. The tilt ratio and orientation are visualised as a 2D histogram, which shows that both parameters can be well described by two normal distributions (Figure 2.28).
...
Figure 2.28: Empirical distribution of the PV installations' tilt and azimuth angles, with the first being relative to the local optimal tilt angle, as well as inferred normal distributions.
Utility-scale Tracking
Among the various tracking configurations, PECDv4.2 includes only horizontal single-axis tracking (HSAT), as it is currently the most widely adopted. In this setup, PV modules are mounted horizontally on a single axis aligned North-South, rotating daily from East to West as the day progresses. This configuration increases capacity factors, particularly in summer and when the Sun is higher in the sky. The tracking modelling, implemented using the pvlib Python library, also accounts for back-tracking, a widely used strategy that adjusts module positioning when the Sun is low to minimise shading between rows, even if it deviates from the optimal angle for energy capture. This is done considering a ratio between the PV array area and the corresponding land use, or equivalently, the inverse of the axis spacing, of 0.35.
While the movement of the modules may impact their thermal performance throughout the day, for PECDv4.2 this typology was assumed to share the same assumptions as the Utility-scale fixed case.
Application of Exclusion Areas and Spatial Aggregation
Once the PV capacity factor product is generated for the PECD-constrained ERA5 grid, regional estimates for bidding and study zones are calculated by means of a spatial average. However, it is important to note that particular (restricted) areas were masked in both the grid-like and regional-based products to produce more accurate results. Specifically, sea and ocean areas (thus, offshore PV), polar and protected areas, as well as locations with high elevation (above 2000 m a.s.l.) or slope (higher than 10%) were excluded from the computation (please refer to Table 4.2 and Table 4.3 for more details). While high elevation may be unsuitable as an exclusion criterion at a global scale (notably for Chile), we found that for the PECD area, this does not pose issues in terms of final PV estimates. Figure 2.29 shows the composite exclusion mask considered for the computation of the solar photovoltaic technologies. The information to identify such regions was obtained from a range of sources: the ERA5 Land-sea mask, the Copernicus Land cover classification gridded maps, the World Database on Protected Areas (WDPA) and Other Effective Area-based Conservation Measures (OECM), and the ETOPO1 bathymetric and topographic digital elevation model.
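The masking and averaging step can be sketched as follows. This is a simplified illustration: it applies only the land, elevation and slope criteria quoted above (protected and polar areas are omitted), uses an unweighted mean over the retained grid cells, and all names are hypothetical.

```python
def regional_mean(capacity_factors, elevation_m, slope_pct, is_land,
                  max_elevation_m=2000.0, max_slope_pct=10.0):
    """Average capacity factors over grid cells that pass the exclusion tests."""
    kept = [
        cf
        for cf, z, s, land in zip(capacity_factors, elevation_m, slope_pct, is_land)
        if land and z <= max_elevation_m and s <= max_slope_pct
    ]
    # A region with no eligible cell has no defined value
    return sum(kept) / len(kept) if kept else float("nan")

cf = [0.10, 0.20, 0.30, 0.40, 0.30]
elevation = [100.0, 2500.0, 300.0, 50.0, 500.0]  # second cell is too high
slope = [2.0, 3.0, 15.0, 1.0, 5.0]               # third cell is too steep
land = [True, True, True, False, True]           # fourth cell is sea
region_cf = regional_mean(cf, elevation, slope, land)
```

Only the first and last cells survive the three tests in this example, so the regional value is their mean.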
...
Figure 2.29: Composite exclusion mask considered for the solar photovoltaic technologies.
Making use of typology-level data
ENTSO-E’s adequacy studies are based on the integration of the capacity factor data from the PECD with the structural data of the European power system – such as installed capacities by technology – submitted by TSOs to the Pan-European Market Modelling Database (PEMMDB). Thus, to align with the increased granularity of PECD version 4.2, TSOs began, in 2024, reporting installed PV capacity per typology. While the first data collection indicated a generally positive adoption of the new framework, it may still present challenges for TSOs, as it requires more detailed technological roadmaps. At the same time, it offers an opportunity to not only simulate pre-defined energy systems, but also to test and compare alternative technological scenarios. From a broader perspective on the potential end users of this data, complementing these typology-level timeseries with cost assumptions could support more detailed and realistic energy optimization studies.
...
This challenge has been clearly identified with the release of PECDv4.2 and has been under investigation. Future efforts will focus on a more extensive data collection and validation, both at the typology- and aggregated-level.
Improvements over Previous Methodology
PECDv4.2 introduces a significant shift in the modelling of regional-level PV timeseries. While previous versions relied on a single model, the current version acknowledges the diversity of PV implementations. It establishes a base modelling framework that incorporates specific parameterisations tailored to different installation types, ensuring a more accurate representation of PV technology variations.
In particular, it accounts for variations in tilt and azimuth angles, which affect both the daily and seasonal generation profile, as well as optical losses from reflection. Additionally, it considers ventilation conditions, which influence module temperature and, consequently, thermal losses.
Concentrated Solar Power Conversion Model
As specified in the work plan, the concentrated solar power (CSP) model developed by DTU and used in the previous version of PECD is also employed in this release. A brief description of the model is provided below.
...
If the solar field generates more energy than required to operate at rated power, the surplus is stored.
If the solar field generates less, the storage discharges energy to maintain rated power (see Figure 2.30).
This strategy does not require knowledge of market prices. The relationship between the solar multiple and the thermal energy storage size remains consistent with the previous PECD version (see Table 2.7). The model has been recalibrated using updated climate data.
...
TES (hours) | SM |
---|---|
0 | 1.5 |
3 | 1.75 |
6 | 2.0 |
9 | 2.5 |
12 | 2.9 |
18 | 3.0 |
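The dispatch rule described above (store any surplus beyond rated power, discharge storage to hold rated power) can be sketched with quantities normalised to the rated power. This is a simplified illustration that ignores conversion efficiencies and thermal losses; it is not the DTU model, and the function and variable names are hypothetical.

```python
def csp_dispatch(resource, solar_multiple, tes_hours):
    """Hourly output (fraction of rated power) for a simple storage rule.

    resource: hourly solar field output as a fraction of its design output.
    solar_multiple: field design output relative to the power block rating.
    tes_hours: storage size in hours of operation at rated power.
    """
    stored = 0.0
    output = []
    for r in resource:
        field = solar_multiple * r
        if field >= 1.0:
            # Surplus is stored; anything beyond a full store is lost
            stored = min(tes_hours, stored + field - 1.0)
            output.append(1.0)
        else:
            # Storage covers the shortfall as far as it can
            discharge = min(1.0 - field, stored)
            stored -= discharge
            output.append(field + discharge)
    return output

# TES = 6 h with SM = 2.0, one of the pairs listed in the table above
out = csp_dispatch([1.0, 1.0, 0.25, 0.0, 0.0], solar_multiple=2.0, tes_hours=6)
```

In this example two hours of full sun charge the store enough to hold rated power through the next two hours, after which the store runs empty and output drops.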
Hydro Power Conversion Model
For the historical stream, the goal for the Hydropower (HP) model is to reproduce the hydropower energy indicators starting from climate data, reconstructing their time series for the historical period (1979-2022).
...
The starting point of the work is the publicly available generation data (in MW) from the ENTSO-E Transparency Platform (TP), with which the model has been trained and validated to produce results up to December 2021. The data include hydropower generation timeseries (at a resolution of 15 min, 30 min, or 1 hour, depending on the country), Installed Capacity time series (annual), and Stored Energy (SE) time series for reservoirs (also referred to as ‘Filling Rates’) and pumped storage (at weekly resolution). Since these data are not sufficient to yield a complete dataset for simulations, two additional sources have been employed: (1) data provided directly by TSOs and (2) inflow data from the previous PECDv3.1 (see Table 2.11 for more details). The three sources were ranked by data reliability in agreement with ENTSO-E: TSO data are considered the most reliable and are given the highest priority. These data include generation and pumping timeseries at hourly resolution and at NUTS0 or PECD granularity. Some TSOs provided weekly timeseries of stored energy for reservoir and open-loop pumped storage technologies in their countries, which were used to estimate inflows for these technologies (see countries citing ‘TSO’ as a source under inflow columns HRI and HOL, Table 2.11). Additionally, some countries provided monthly timeseries of Installed Capacity (IC), which were useful to account for significant changes in generation due to new installations throughout the historical time series (this information was used for countries citing ‘rescaled using monthly IC’, Table 2.11).
Where TSO timeseries are not complete, TP data are used, with some exceptions (see section Estimating Inflows). Finally, PECDv3.1 data have been employed where TSO and TP data are not sufficient. In particular, they help to complete the open-loop pumped storage inflow data, since only a few TSOs are able to share stored energy timeseries for this technology.
...
The following sections describe the statistical model, the pre-processing of input data, the validation procedure, and the use of the model to reconstruct historical data and estimate future projections. Finally, the last section describes the adopted methodology to estimate the inflows starting from the available data.
The Statistical Model
The statistical model adopted here is the Random Forest Regression model (Pedregosa et al., 2011; hereafter, the RF model), a machine learning model based on ensemble learning, which has already been shown to work well at such a resolution and over such a broad domain in a previous study by Ho et al. (2020). In a preliminary comparison at the early stages of the project, the model also showed comparable performance over France for both HRE (Reservoirs) and HRO (Run-of-river) technologies with respect to a Neural Network fed by discharge data (a model employed in the current PECD).
The Random Forest takes as input the generation (or inflow) data, namely the target variable, and some climate datasets covering the same time period, the predictors, and trains a large number of decision trees to predict the target variable starting from the predictors. In the end, it averages the answers from all the trees to obtain the model prediction. The number of trees in the ‘forest’, and their characteristics, can be adjusted by tuning several parameters.
Energy data pre-processing
Regarding the pre-processing of energy data, the hydropower generation, Installed Capacity and Stored Energy time series are extracted from a larger database for each PECD country and re-organized in multiple CSV files. Similarly, TSO and PECDv3.1 data are organized into analogous CSV files. Where needed, the generation data is resampled to 1h. A weekly aggregation follows, summing the hourly values for those weeks where at least 80% of the data are available; in these weeks, gaps in the hourly values are filled by a simple interpolation. If more than 20% of the values in a week are missing, the whole week is set to NaN. Specific checks are also made for the first values of the timeseries, as they are often unphysical, in which case they are adjusted based on adjacent values or set to NaN.
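The weekly aggregation rule can be sketched as below. The exact interpolation used in the processing chain is not specified here; this sketch assumes linear interpolation between the nearest valid neighbours, with gaps at the start or end of the week taking the nearest valid value. The function name is illustrative.

```python
import math

def weekly_sum(hourly, min_coverage=0.8):
    """Sum one week of hourly values, or return NaN if coverage is too low.

    hourly: list of floats with None marking missing hours.
    """
    valid = [i for i, v in enumerate(hourly) if v is not None]
    if len(valid) < min_coverage * len(hourly):
        return math.nan  # more than 20 % missing: discard the week
    filled = list(hourly)
    for i, v in enumerate(hourly):
        if v is None:
            left = max((j for j in valid if j < i), default=None)
            right = min((j for j in valid if j > i), default=None)
            if left is None:
                filled[i] = hourly[right]   # gap at the start of the week
            elif right is None:
                filled[i] = hourly[left]    # gap at the end of the week
            else:
                frac = (i - left) / (right - left)
                filled[i] = hourly[left] + frac * (hourly[right] - hourly[left])
    return sum(filled)

week_ok = [1.0] * 168
week_ok[10] = None                   # a single missing hour
week_bad = [None] * 40 + [1.0] * 128  # about 24 % missing
total_ok = weekly_sum(week_ok)
total_bad = weekly_sum(week_bad)
```

The first week is kept and its gap filled before summing, while the second falls below the 80% coverage threshold and is set to NaN.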
...
Finally, while the generation can be directly employed as the target variable of the RF model, the inflows must first be estimated starting from the available data (see section Estimating Inflows) and then modelled.
Climate data preprocessing
For the purposes of this application, the most informative variables that can be found in all climate datasets are 2-m temperature (TA [K]) and total precipitation (TP [m])
...
, which are commonly fed to hydrological models to compute river discharge. In particular, the two variables are useful if averaged (for TA) and cumulated (for TP) over multiple weeks preceding the time of the estimation of the generation or inflow. It is important, for instance, to consider the time lag between a precipitation event over a given area, and the corresponding discharge water reaching the hydropower plants downstream. Therefore, precipitation is cumulated over up to 30 weeks, while temperature is averaged over up to 15 weeks. According to the example of Table 2.8, if the model is used to estimate the HP generation produced for the week of 2015-01-05, it will take as predictors the TA and TP for that same week, as well as the average TA of the previous 2, 3, 4, …, 15 weeks, and the TP cumulated over the previous 2, 3, 4, …, 30 weeks.
...
The datasets are aggregated at weekly resolution (summing precipitation and averaging temperature) and then the lags up to 30 weeks are calculated, meaning that values are cumulated (summed/averaged) over multiple weeks to yield several more datasets, which will be used as predictors for the RF model. At the end of this pre-processing step, one CSV file per country and climate dataset is produced.
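The construction of lagged predictors can be sketched for a single target week as follows. The sketch assumes that each window ends at, and includes, the target week; whether the actual implementation includes or excludes the target week is not stated here, and the feature names are hypothetical.

```python
def lagged_predictors(ta_weekly, tp_weekly, week_index,
                      max_ta_lag=15, max_tp_lag=30):
    """Build predictors for one target week from weekly TA and TP series.

    Returns the current-week values plus TA averaged and TP cumulated over
    windows of 2..max lag weeks ending at the target week.
    Assumes week_index >= max_tp_lag - 1 so that every window is complete.
    """
    feats = {"TA_1": ta_weekly[week_index], "TP_1": tp_weekly[week_index]}
    for n in range(2, max_ta_lag + 1):
        window = ta_weekly[week_index - n + 1: week_index + 1]
        feats[f"TA_{n}"] = sum(window) / n   # temperature is averaged
    for n in range(2, max_tp_lag + 1):
        window = tp_weekly[week_index - n + 1: week_index + 1]
        feats[f"TP_{n}"] = sum(window)       # precipitation is cumulated
    return feats

ta = [float(i) for i in range(40)]  # synthetic weekly temperatures
tp = [1.0] * 40                     # synthetic weekly precipitation
feats = lagged_predictors(ta, tp, week_index=39)
```

With the limits quoted in the text, each target week yields 45 predictors: 15 temperature features and 30 precipitation features.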
Model validation: Leave-One-Year-Out Validation
The model is validated separately for each SZON region and indicator, over the period of energy data availability (within 2015-2022 in case of TP data, 2010-2022 in case of TSO data, 2010-2017 in case of PECDv3.1 data). The validation procedure followed is the Leave-One-Year-Out (LOYO), which trains the model over all N available years except one (the test year), and evaluates the model performance over this test year. This is repeated N times, each time with a different test year, until the complete estimated time series can be assembled (see Figure 2.31).
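The LOYO splitting logic can be sketched with a small generator. Only the partitioning of years is shown; model training and prediction are left out, and the function name is illustrative.

```python
def leave_one_year_out(years):
    """Yield (train_years, test_year) pairs, one per available year."""
    unique_years = sorted(set(years))
    for test_year in unique_years:
        train_years = [y for y in unique_years if y != test_year]
        yield train_years, test_year

splits = list(leave_one_year_out([2015, 2016, 2017, 2018]))
```

Each year serves as the test year exactly once, so concatenating the test-year predictions from all splits gives a complete estimated time series.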
...
For instance, in the case of the modelled time series in Figure 2.31, the NSE value is 0.59 (as also reported in the upper left corner of the figure). The metric is calculated as one minus the ratio between the sum of the squared differences between the modelled and observed timeseries and the sum of the squared deviations of the observed timeseries from its mean. If there is no difference between the modelled (m) values and the observed (o) ones at each timestep (i), then the NSE will be 1 (perfect fit), which is the maximum value that can be reached. On the other hand, if there are significant differences between the two timeseries, the NSE can reach negative values (down to -Inf). An NSE = 0 would indicate that the model has the same predictive skill as the mean of the observed timeseries in terms of the sum of the squared errors.
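As a compact illustration, the standard Nash-Sutcliffe Efficiency can be computed as below. The function name is chosen here for illustration and is not taken from the PECD code base.

```python
import numpy as np


def nash_sutcliffe_efficiency(modelled, observed):
    """Return 1 - sum((m - o)^2) / sum((o - mean(o))^2)."""
    m = np.asarray(modelled, dtype=float)
    o = np.asarray(observed, dtype=float)
    squared_error = np.sum((m - o) ** 2)
    observed_spread = np.sum((o - o.mean()) ** 2)
    return 1.0 - squared_error / observed_spread
```

A perfect model returns 1, a model that always predicts the observed mean returns 0, and anything worse than the mean returns a negative value.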
RF Model Parameters
As mentioned, the Random Forest can be built by specifying several parameters. The main parameters indicated in Table 2.9 have been tuned country by country and indicator by indicator. This has been done by sampling a hyperparameter space with the Latin Hypercube Sampling algorithm to find the set that optimizes a selected metric. The hyperparameter space has been defined by assigning a range of values to each of the main RF parameters. To efficiently sample this multidimensional domain, a Latin Hypercube Sampling of 1000 samples has been performed, and each sampled set of parameters has been tested via the LOYO procedure to yield the score of the chosen metric. Finally, the set of parameters yielding the best score was retained and used for that specific country and indicator.
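The search can be sketched with a hand-written Latin Hypercube sampler. This is a minimal illustration, not the tuning code used for PECD: the parameter names passed in `ranges` are hypothetical, all parameters are treated as continuous, and `score` stands for any function that runs the LOYO validation and returns the metric to maximise.

```python
import numpy as np


def latin_hypercube(ranges, n_samples, seed=0):
    """Draw `n_samples` points with one point per stratum in every dimension.

    `ranges` maps a parameter name to its (low, high) interval.
    """
    rng = np.random.default_rng(seed)
    names = list(ranges)
    samples = np.empty((n_samples, len(names)))
    for j, name in enumerate(names):
        low, high = ranges[name]
        # One point in each of n equal-width strata of [0, 1), in shuffled order.
        unit = (rng.permutation(n_samples) + rng.random(n_samples)) / n_samples
        samples[:, j] = low + unit * (high - low)
    return names, samples


def best_parameter_set(ranges, score, n_samples=1000, seed=0):
    """Score every sampled parameter set and keep the best one."""
    names, samples = latin_hypercube(ranges, n_samples, seed)
    scores = [score(dict(zip(names, row))) for row in samples]
    best = int(np.argmax(scores))
    return dict(zip(names, samples[best])), scores[best]
```

Integer-valued parameters would need rounding before being passed to the model, which is left out of this sketch.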
...
. However, this metric requires longer computational times and, in a few cases, brings unphysical results. Therefore, the proposed results are obtained with RF parameters optimized using NSE.
Model Validation Results
To summarize the validation results, Figure 2.32 maps the NSE scores obtained for each country for the generation and inflow to reservoirs, the inflows to run-of-river, the inflows to pondage, and the inflows to open loops. Generally, over the PECD domain, the results are satisfactory, with fairly high NSE values for most countries. This is especially seen for the inflow to reservoirs indicator (panel b), which assimilates information on the reservoir filling rates (for the countries that provide it) and is hence able to reduce the human influence on the generation signal. Without this information, the generation signal can be harder to reproduce with a model based on temperature and precipitation alone (see panel a). High scores are also obtained for inflows to run-of-river and pondage (panels c and d), where the signal has a more distinct seasonality and is less influenced by human intervention. The scores are generally lower for inflows to open-loop (panel e), which are largely based on PECDv3.1 data.
...
Figure 2.32: Maps of the LOYO validation results obtained in terms of NSE over the period of available data, which depends on the source (TSO: 2010-2022, TP: 2015-2022, PECDv3.1: 2010-2017). Each panel refers to a different inflow (or generation) indicator, as reported in the panels’ titles.
Modelling Historical stream
Once the model is validated, it is trained (again for each country and indicator) on all available years of generation data using the tailored sets of parameters found during the optimization procedure. The same parameters are then used to extend the HP indicator back to 1950, to have long reconstructed time series, using the ERA5 temperature and precipitation data. Figure 2.33 shows an example of a historical time series of inflow to reservoirs as estimated by the RF model for France (in blue). It also shows the ‘observed’ inflow series in grey, estimated with TP data (see section Estimating Inflows).
...
Figure 2.33: RF-reconstructed time series of inflow to reservoirs (HRI) for France (FR). The estimated series is shown in blue, while the observations (2015-2022) are in grey, starting from the dashed line.
Estimating Inflows
The RF model produces generation timeseries. However, artificial regulation can significantly alter these timeseries and their seasonality, jeopardizing the capability of temperature and precipitation to reliably reproduce the signal. This issue concerns technologies involving a reservoir, especially reservoir and open-loop pumped storage systems, while the effect can in general be neglected for run-of-river plants and pondage plants, which are run-of-river plants making use of a limited storage capacity amounting to no more than 24 hours.
...
This roundtrip efficiency usually depends on the design of the plant. For older designs it may be lower than 60%, while for recent ones it can be up to 90%. The efficiency suggested by ENTSO-E is 0.75, which is therefore assumed as the reference value over Europe. As seen in Figure 2.35 for a French Closed-Loop unit, the balance holds as the production and pumping terms are cumulated over time and the natural inflow remains null.
Inflow to Open-loop Pumping
The situation for open-loop facilities, which is sketched in Figure 2.36, is different since the natural inflow component is not null and therefore constitutes a third unknown, together with the two efficiencies. The assumption that one can make is to consider the pumping and production efficiencies as equal (
...
Figure 2.36: An approximated sketch of an Open-loop system.
Inflow to Reservoirs
As for reservoirs, the pumping component is null, so the equation reduces to:
...
It must be noted that TP data has been cautiously used to compute inflow to reservoirs, since the stored energy data on the platform refers both to reservoirs and pumped storage technologies. Hence, the inflow results from the TP have been retained only in a few cases, generally where the reported installed capacity for reservoirs is much greater than the one for pumped storage.
Inflow to Run-of-rivers and Pondage
For Run-of-river systems, the storage term is considered null, and considering that the storage capacity of a pondage is less than 24 hours, the same is assumed for run-of-river with pondage at weekly resolution, hence reducing the equation to:
...
When possible, the two technologies are kept separate. For instance, this is possible for the bidding zones whose TSO provided distinct generation time series. Data from the TP, on the other hand, are used to model run-of-river technology only if no pondage was declared for that bidding zone by the TSO and no pondage was available in the PECDv3.1 dataset. This ensures that only run-of-river is addressed, given that the TP generation data include both technologies (addressed as ‘Run-of-river and pondage’). If only run-of-river data were provided by the TSO for a given bidding zone, the run-of-river inflow was calculated starting from these data, while the pondage inflow was calculated starting from the PECDv3.1 data. Comments on these particular cases are left in the Summary Table (Table 2.11).
Finally, the same production efficiency is assumed for all technologies (
...
). However, to align with the models used by ENTSO-E to ingest the energy data, the final inflow model outputs are multiplied back by the same efficiency coefficient to obtain an inflow at the electrical grid level. Although the balance equations should yield close-to-reality estimates, the above methodology cannot be fully validated without access to actual inflow observations.
Use of PECDv3.1 inflow estimates
If TSO and TP data were not sufficient to complete the inflow for a specific bidding zone and a specific technology, the PECDv3.1 inflow data were used directly as the target variable for the training of the RF model, as indicated in Figure 2.37. This approach was especially used to model inflows to open-loop pumped storage, as only a few stored energy time series were provided by the TSOs. Therefore, there are cases in which the generation is modelled starting from available TSO data, while the corresponding inflow (owing to the unavailability of stored energy data) is modelled starting from PECDv3.1 data, which sometimes leads to inconsistencies between the two datasets. The main ones are reported in the Summary Table (Table 2.11).
Figure 2.37: Sketch of the two different approaches to model inflows: approach 1 makes use of TSO and TP data, approach 2 makes use of PECDv3.1 data.
Post-hoc corrections following TSOs’ feedback
For the produced inflow datasets of some specific technologies and regions, a multiplicative correction factor was applied to the model outputs in agreement with the TSO of interest, after validation against a reference dataset. These correction factors were hence required due to the poor quality of the public data initially used for the model training and are to be regarded as temporary adjustments ahead of a more stable solution. See Table 2.10 for an overview of the explicit multiplicative values, and the regions to which these were applied for the PECDv4.2 delivery of data.
...
Region | Technology | Correction Factor | Source |
---|---|---|---|
AT00 | HRI – inflows to reservoirs | 2404/5507 | Comparison of mean maximum generation with an internal APG data source with strict sharing limitations. |
AT00 | HRR – inflows to run of river | 23082/17760 | Comparison of mean maximum generation with an internal APG data source with strict sharing limitations. |
AT00 | HPI – inflows to pondage | 5607/4506 | Comparison of mean maximum generation with an internal APG data source with strict sharing limitations. |
CH00 | HOL – inflows to open-loop pumped storage | 0.825 | Comparison of mean annual cumulated inflows with a reference monthly dataset derived from Swiss Federal Office of Energy (SFOE) data. |
CH00 | HRR – inflows to run of river | 1.39 | Comparison of mean annual cumulated inflows with a reference monthly dataset (SFOE). Note: this factor was applied directly to the model input TSO data in accordance with the Swiss TSO. |
TR00 | HRR – inflows to run of river | 2.502 | Comparison of mean annual cumulated inflow with an internal series of annual cumulated generation for period 2019-2023 including all country plants. |
TR00 | HRI – inflows to reservoirs | 1.850 | Comparison of mean annual cumulated inflow with an internal series of annual cumulated generation for period 2019-2023 including all country plants. |
Summary Table
Table 2.11 includes all addressed bidding zones and technologies (except for generation from run-of-river and pondage, which would be a repetition of the respective reported inflow columns) and can be used to check the availability of data, source of data used for the modelling, and comments on the results, mainly addressing inconsistencies found or considerations made for the source/modelling choices. As mentioned, the TSO generation data have always been given priority when available, followed by TP data and PECDv3.1 estimates. Given the different data sources and methodology used, the results can significantly differ from the ones of the previous PECD, therefore we strongly recommend checking with TSOs about the reliability of mean generation/inflow historical values.
...
Bidding zone / Tech. | Reservoirs Generation (HRG) | Inflow to Reservoirs (HRI) | Run-of-river Inflow (HRO) | Inflow to Open Loop PS (HOL) | Pondage Inflow (HPO) |
---|---|---|---|---|---|
AL00 | TSO rescaled using monthly IC | TSO rescaled using monthly IC | TSO rescaled using monthly IC | ||
AT00 | TSO | TP – the mean using PECDv3.1 data is too low with respect to TSO data, hence using TP data although SE is surely affected by HPS (Hydro Pumped Storage) | TSO | PECDv3.1 | TSO |
BA00 | TSO | PECDv3.1 | PECDv3.1- TSO run-of-river data not provided – might be already accounted for in TSO pondage data | PECDv3.1 | TSO |
BE00 | TSO | ||||
BG00 | TSO | TSO | TSO | TSO | |
CH00 | TSO | TSO – rescaled using monthly IC | TSO - rescaled using monthly IC – multiplication factor of 1.39 applied to generation input data in accordance with CH00 TSO | TSO - rescaled using monthly IC | |
CZ00 | TSO | PECDv3.1 | TP (since there’s no pondage) – can reproduce mean signal, can’t well reproduce the peaks – suspected anthropic factors influencing the production after 2019 | PECDv3.1 | |
DE00 | TSO | PECDv3.1 – mean too low with respect to TSO generation, should be ca three times higher | TSO | PECDv3.1 | |
ES00 | TSO | TSO | TSO | TSO | |
FI00 | TSO | TSO | TP (no TSO pondage data, no PECDv3.1 pondage data) | ||
FR00 | TP | TP – HPS (pumped storage) IC about 60% of HRE (reservoirs) IC in past 8 years (from TP data) + time series very close to PECDv3.1 inflow | TP (no TSO data for FR, no pondage in PECDv3.1 data) | GPU (Generation Per Unit) - (no PECDv3.1 data for FR) - low reliability: no HOL storage energy available (approximated inflow assuming negligible storage from one week to the other) + few production and pumping data (3 years) | |
GR00 | TSO | TSO | TSO – model training on last 4 years (missing monthly IC data to rescale) – significant difference with PECDv3.1 inflow | TSO | PECDv3.1 – even though no pondage data from TSO nor TP |
HR00 | TSO – very close to TP generation | TP – HPS IC about 20% of HRE IC in the past 9 years (TP data) | TSO – could contain pondage | PECDv3.1 | PECDv3.1 – even though no pondage data from TSO. |
HU00 | TSO rescaled using monthly IC | ||||
IE00 | TSO | ||||
ITCA | TSO | PECDv3.1 – reasonable values with respect to TSO generation | TSO | ||
ITCN | TSO | PECDv3.1 – inflow sometimes lower than TSO generation | TSO | ||
ITCS | TSO | PECDv3.1 – inflow very close to TSO generation | TSO | PECDv3.1 | |
ITN1 | TSO | PECDv3.1 – inflow very close to TSO generation | TSO | PECDv3.1 | |
ITS1 | TSO | PECDv3.1 – inflow close to generation (would expect it a bit higher) | |||
ITSA | TSO | PECDv3.1 – high with respect to TSO generation | TSO | ||
ITSI | TSO | PECDv3.1 – low peaks with respect to TSO generation | TSO | PECDv3.1 | |
LT00 | TSO – generation values exceptionally high for the year 2015 (something wrong in the data) -> left out of training | ||||
LV00 | TSO | ||||
LU00 | TSO | ||||
ME00 | TSO – close to TP generation data, higher peaks | TP – no HPS IC | PECDv3.1 | ||
MK00 | TSO | TSO | |||
NL00 | PECDv3.1 | ||||
NOM1 | TSO | TP – small HPS production compared to HRE | TSO | PECDv3.1 | |
NON1 | TSO | TP - no HPS | TSO | ||
NOS1 | TSO | TP – no HPS | TSO | - | |
NOS2 | TSO | TP – trying splitting PECDv3.1 NOS0 data obtained similar result + small HPS production | TSO | PECDv3.1 (splitting PECDv3.1 NOS0 data according to mean TSO generation data for NOS2) | |
NOS3 | TSO | TP - trying splitting PECDv3.1 NOS0 data obtained similar result + small HPS production | TSO | PECDv3.1 (splitting PECDv3.1 NOS0 data according to mean TSO generation data for NOS3) | |
PL00 | TSO | PECDv3.1 – mean inflow value is 3-4 times higher than TSO generation (also TP-calculated mean is 3-4 times higher) | TSO - rescaled using monthly IC | PECDv3.1 – inflow seems to be too low considering TSO generation and pumping series: ca 200 MWh of inflow against 1200 MWh of generation (mean weekly values) | |
PT00 | TSO | TSO | TSO – values seem low, TP and PECDv3.1 data ca 10 times higher than TSO data of run-of-river and HPO together | TSO - rescaled using monthly IC | TSO |
RO00 | TSO | PECDv3.1 | TSO | PECDv3.1 | |
RS00 | TSO | PECDv3.1 – TP data significantly impacted by HPS | TSO | ||
SE01 | TSO | PECDv3.1 | |||
SE02 | TSO | PECDv3.1 | |||
SE03 | TSO | PECDv3.1 | |||
SE04 | TSO | PECDv3.1 | |||
SI00 | TSO | - | TSO – could contain pondage | PECDv3.1 – no pondage generation data from TSO: keeping PECDv3.1 trained estimates. Pondage could be included in run-of-river TSO data? In this case PECDv3.1 estimates are off. | |
SK00 | TSO | PECDv3.1 – although mean is considerably higher than TSO generation | TSO | PECDv3.1 | TSO |
TR00 | |||||
UK00 | TP – (no TSO data for GB, no pondage in PECDv3.1 data) |
Energy indicators
Energy indicators included in the PECDv4.2 dataset for the historical stream are described in Table 2.12. This table provides information for each variable, including the typology, the time period covered, the source of the input data, the domain and spatial resolution, the temporal resolution, the spatial aggregation (as specified in Table 2.1), and, where applicable, the different technologies used to compute the final time series.
...
***Inflow data from ENTSO-E PECDv3.1
Known issues
There are no known issues.
Projection stream
Projection models
Choice of models
The projection dataset in PECDv4.2 has been designed to provide robust climate and energy indicators for the entire PECD domain, extending up to the year 2100. As a first step in building this dataset, a careful selection of climate projections was carried out to identify the most appropriate subset for energy-sector applications.
...
Rather than relying on complex performance-based metrics to evaluate how well each model reproduces historical climate conditions, the selection was primarily guided by Equilibrium Climate Sensitivity (ECS) values. This criterion, also used in IPCC AR6, allows for selecting a representative ensemble that spans the range of projected climate sensitivity, including models with higher sensitivity to capture "low-likelihood, high-impact" futures. The selection also aimed to reduce redundancy by minimising model overlap (i.e., models developed with similar components or structures). The results are presented in Table 3.1.
Table 3.1: Models are colour-coded based on exclusion criteria: dark red indicates models that do not provide all required scenarios; orange highlights models with Equilibrium Climate Sensitivity (ECS) values outside the range assessed in the IPCC AR6; yellow marks models that share components with others in the ensemble. The models that were retained for PECDv4.2 are highlighted in bold.
...
The final selection of models and their characteristics is reported in Table 3.2.
Table 3.2: CMIP6 models used in the projections stream and their corresponding characteristics and nodes for downloading.
The models and scenarios indicated in bold are the ones that have been introduced in PECDv4.2, while the other ones were already present in PECDv4.1.
...
Note that the historical simulation period is chosen to ensure overlap between ERA5 and the CMIP6 models, enabling the computation of bias adjustment.
Data retrieval
CMIP6 variables (for each model) are downloaded from the ESGF node using a Python script that utilises a specific Python API. The script accepts only a configuration file as an argument, which contains the desired tags for the download. This script is used for downloading both historical and projection data. Table 3.2 lists the nodes from which each model has been downloaded. The selected CMIP6 climate models are also available in the C3S catalogue; however, the high temporal resolution (namely, 3-hourly) needed to produce the PECD database was not available at C3S. For this reason, the CMIP6 model outputs have been collected via the ESGF nodes.
Spatial interpolation
Starting from a common 100 km nominal spatial resolution and global domain, each model has its own grid, necessitating spatial interpolation to the PECD domain at 0.25° x 0.25°. This interpolation uses the bilinear method as implemented in the CDO
...
A Python script iterates over the files and, using the os library, calls the CDO command line for each file. Another Python script in the pre-processing pipeline checks the output files for missing (Not A Number, NaN) and anomalous values, and reformats them according to ERA5 conventions.
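A loop of this kind might look like the sketch below. It is an illustration under stated assumptions, not the PECD script: the function names, the directory layout and the target grid description file are hypothetical, and the only CDO feature relied upon is the `remapbil` operator for bilinear remapping.

```python
import os
import shlex


def build_remap_command(src, dst, grid_file):
    """Return the CDO call that bilinearly remaps `src` onto `grid_file`."""
    return "cdo remapbil,{} {} {}".format(
        shlex.quote(grid_file), shlex.quote(src), shlex.quote(dst))


def remap_directory(in_dir, out_dir, grid_file, run=os.system):
    """Remap every NetCDF file in `in_dir`; return the names that failed."""
    os.makedirs(out_dir, exist_ok=True)
    failed = []
    for name in sorted(os.listdir(in_dir)):
        if not name.endswith(".nc"):
            continue
        command = build_remap_command(
            os.path.join(in_dir, name), os.path.join(out_dir, name), grid_file)
        if run(command) != 0:  # a non-zero exit status signals an error
            failed.append(name)
    return failed
```

Passing the runner as an argument keeps the loop testable without CDO installed; in production the default `os.system` executes each command in a shell.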
Temporal aggregation and interpolation
As stated in Section 3.1, one of the selection criteria for projection models is the finest available temporal resolution (3 hours). However, it is necessary to apply temporal interpolation to achieve the required hourly resolution for the PECDv4.2 database. Table 3.3 shows the method used to temporally interpolate each variable.
...
It is important to note that to obtain files according to the ERA5 conventions and to have the first hour as 00:00 for the projections, it is necessary to use the last day of the historical scenario, considering that the different SSP scenarios start from 03:00. Figure 3.1 contains a validation of this method considering the TA variable at a generic point of the PECD domain.
...
The SG2 library can be installed via "pip" in any Python environment. The detrended Kt time series is then downscaled to an hourly resolution using linear interpolation. The data is subsequently reconverted to GHI by multiplying it with an hourly-averaged TOA value. Figure 3.2 shows a validation plot for this procedure, computed at a generic point within the PECD domain.
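The interpolation and reconversion steps can be sketched with NumPy alone. This is a simplified illustration, not the PECD procedure: the TOA irradiance arrays are taken as given inputs (no SG2 call is shown), any detrending of Kt is left out, and setting Kt to zero where TOA is zero is an assumption made here to avoid division by zero at night.

```python
import numpy as np


def downscale_ghi(hours_3h, ghi_3h, toa_3h, hours_1h, toa_1h):
    """Interpolate the clearness index Kt to hourly steps and rebuild GHI."""
    ghi_3h = np.asarray(ghi_3h, dtype=float)
    toa_3h = np.asarray(toa_3h, dtype=float)
    toa_1h = np.asarray(toa_1h, dtype=float)

    # Clearness index Kt = GHI / TOA; assumed to be 0 where TOA is 0 (night).
    kt_3h = np.divide(ghi_3h, toa_3h,
                      out=np.zeros_like(ghi_3h), where=toa_3h > 0)

    # Linear interpolation of Kt from the 3-hourly to the hourly time axis.
    kt_1h = np.interp(hours_1h, hours_3h, kt_3h)

    # Reconversion to GHI with the hourly-averaged TOA irradiance.
    return kt_1h * toa_1h
```

Interpolating Kt rather than GHI itself lets the hourly TOA term carry the diurnal cycle, so the rebuilt hourly GHI follows the solar geometry between the 3-hourly samples.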
...
The required variable for precipitation is total precipitation (TP), which has been derived from the precipitation flux (in kg m⁻² s⁻¹), the original data format for CMIP6 projection models. Since energy models require daily cumulative data, the downloaded precipitation flux data was first resampled to daily averages using the xarray.resample().mean() method. This daily average was then multiplied by 86.4 (86400 seconds per day, divided by 1000 to convert millimetres of water to meters) to convert the data into daily precipitation in meters.
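The unit conversion can be checked with a short sketch. It uses pandas instead of xarray so that it runs on a plain time series; the resample-then-mean logic is the same, and the function and constant names are chosen here for illustration.

```python
import pandas as pd

SECONDS_PER_DAY = 86_400
KG_PER_M2_TO_M = 1e-3  # 1 kg m-2 of water corresponds to a 1 mm layer


def flux_to_daily_total(flux: pd.Series) -> pd.Series:
    """Convert precipitation flux (kg m-2 s-1) to daily totals in meters."""
    daily_mean_flux = flux.resample("1D").mean()
    # Mean flux times seconds per day gives mm per day; 1e-3 converts to m.
    return daily_mean_flux * SECONDS_PER_DAY * KG_PER_M2_TO_M
```

The product of the two constants is the factor 86.4 quoted in the text.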
Bias-adjustment procedure
For the projection stream, two bias adjustment methodologies have been implemented for the CMIP6 projection datasets. These methodologies are:
...
CDFt Method: This method is used for variables with a strong climate-change-related trend, such as temperature. To correctly account for the trend, a 20-year time series is considered for the calculation of the CDFs, with only the central 10-year window taken as the adjusted data. The 20-year timeframe is then moved forward, yielding a new 10-year central window that partially overlaps the window of the previous step. Although wind speed and precipitation (WS10 and TP) do not exhibit a strong climate change trend, their correction is also based on the CDFt method. This is because the mean factors in the Delta method could potentially lead to negative (and therefore unphysical) values. For these variables, given the lack of a strong climatic trend, the CDFt considers a ‘static’ 20-year time series.
Figures 3.3, 3.4 and 3.5 illustrate the logic blocks of the bias-adjustment procedure applied to the 2 m temperature (TA), to the total daily precipitation (TP) and 10 m wind speed (WS10), and to the surface solar radiation (GHI), respectively.
...
Figure 3.5: Details of the bias-adjustment logic block for the projection global horizontal irradiance (GHI) using the Delta Adjustment method.
Climate indicators
Table 3.4 lists the climate indicators for the projection stream. The final domain and spatial resolution, as well as the final temporal resolution, are obtained through preprocessing as described in Section 3.3 and Section 3.4, respectively. The bias adjustment has been applied using the procedures detailed in Section 3.5. Since wind speed at 100 m above the ground is not available for the CMIP6 projection models, and to maintain consistency between the wind speed at 100 m in the historical (ERA5) and the projection datasets, the wind speed at 100 m is calculated using the near-surface (10 m) wind speed of the CMIP6 projection models together with the Alpha Coefficient (or power law) derived from the ERA5 reanalysis (see Section 2.2 for more details). For the projection stream, the computation of TAW and the spatial aggregation follow the same methodologies described for the historical stream (see Sections 2.4 and 2.5, respectively). It is important to note that all variables are bias-adjusted directly except for TAW and WS100, which are instead derived from bias-adjusted variables (TA and WS10, respectively).
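The extrapolation from 10 m to 100 m can be illustrated with the standard power law, WS100 = WS10 · (100/10)^alpha. The sketch below assumes this textbook form; the function names are hypothetical, and the (12, 24) table of alpha values for a single grid cell (one value per month and hour of the day) is an illustrative layout rather than the actual structure of the ancillary file.

```python
import numpy as np


def extrapolate_wind_speed(ws10, alpha, z_ref=10.0, z_hub=100.0):
    """Scale wind speed from `z_ref` to `z_hub` with the power law."""
    ws10 = np.asarray(ws10, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    return ws10 * (z_hub / z_ref) ** alpha


def alpha_for_times(alpha_table, months, hours):
    """Pick alpha from a (12, 24) table; months are 1..12, hours 0..23."""
    table = np.asarray(alpha_table, dtype=float)
    return table[np.asarray(months) - 1, np.asarray(hours)]
```

With alpha = 0 the profile is uniform and WS100 equals WS10; larger alpha values correspond to stronger vertical shear.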
Table 3.4: Climate indicators provided in the PECDv4.2 for the projection stream. Files provided at the BIAS spatial aggregation level (specifically, bias-adjusted data; see Table 2.1 for further info) are gridded (NetCDF format), while all the other levels of aggregation are provided in a CSV format. Changes that were implemented in PECDv4.2 are highlighted in bold (extended time period, additional climate projection models and climate scenarios, new spatial aggregation over selected cities).
Variable | Period | Source | Models | Scenario | Domain/ spatial resolution | Temporal resolution | Spatial aggregation | Units |
---|---|---|---|---|---|---|---|---|
2m temperature (TA) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD/0.25° x 0.25° | hourly | BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF, CITY | K (gridded) °C (aggregated) |
Population-weighted temperature (TAW) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD/0.25° x 0.25° | hourly | SZON | °C |
Total precipitation (TP) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD/0.25° x 0.25° | daily | BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m |
Surface solar radiation downwards (GHI) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD/0.25° x 0.25° | hourly | BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF | W m-2 |
10m wind speed (WS10) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD/0.25° x 0.25° | hourly | BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m s-1 |
100m wind speed (WS100) | 2015-2100 | CMIP6 projections | AWCM, BCCS, CMR5, ECE3, MEHR, MRM2 | SSP126, SSP245, SSP370, SSP585 | PECD/0.25° x 0.25° | hourly | BIAS, NUT0, NUT2, SZON, SZOF, PEON, PEOF | m s-1 |
Energy data
The same data illustrated in Section 2.7 are also used for the projection stream.
Energy Conversion models
Wind Power Conversion Model
The climate data used as input are listed in Table 3.4, and the procedure is the same as described in Section 2.9.1.
The simulated locations and wind technologies depend on the type of run. An overview of the runs is given in Table 3.5.
Table 3.5: Wind run types for the projection stream. Changes that were implemented in PECDv4.2 are highlighted in bold (specifically, the extended time period).
Run type | Climate projection simulated years | WPP locations | WPP technology | Losses |
---|---|---|---|---|
Existing | 2015-2100 | All years with 2020 WPP locations (based on WindPowerNet data) | Existing WPP parameters based on WindPowerNet data (always 2020 fleet), applied in the generic power curve model | Wakes are part of the generic power curve. A further 10 % for other losses (incl. unavailability) is applied as a simple multiplication by 0.9 |
Future wind technologies | 2015-2100 | The best 10-50 % locations of the unmasked points within each PECD region (in terms of mean wind speed in the bias-adjusted ERA5 data, based on ERA5 grid). | Onshore wind: 3 hub heights and 3 turbine types, so in total 9 wind technologies. A plant of 50 MW with ten 5 MW turbines is modelled for each technology. Offshore wind: 1 hub height and 2 turbine types, so in total 2 wind technologies. A plant of 500 MW with 28 turbines of 18 MW each is modelled for each technology. | Wakes are part of the power curves. A further 5 % for other losses (incl. unavailability) is applied as a simple multiplication by 0.95 |
Photovoltaic Solar Power Conversion Model
The climate data used as input are listed in Table 3.4, and the procedure is the same as described in Section 2.9.2.
Concentrated Solar Power Conversion Model
The climate data used as input are listed in Table 3.4, and the procedure is the same as described in Section 2.9.3.
Hydro Power Conversion Model
The climate data used as input are listed in Table 3.4, and the procedure is the same as described in Section 2.9.4.
Energy indicators
For the projection stream, the same energy indicators described for the historical stream (see Section 2.10) were computed starting from the climate indicators listed in Table 3.4.
Table 3.6 summarizes the energy indicators and provides detailed information for each variable, including the type, the covered time period, the source of the input data, the domain and spatial resolution, the temporal resolution, the spatial aggregation (according to Table 2.1), and, where applicable, the different technologies used to compute the final time series.
...
**Inflow data from ENTSO-E PECDv3.1
Appendix
Filenames convention and characteristics
This paragraph aims to explain the filename convention of the PECD datasets. Table 4.1 details the structure and possible fields of the filenames. Specifically, the last column indicates the corresponding section of the CDS catalogue where users can personalize their choice. If "Not applicable" is indicated, it means that the user cannot modify this field, and the data are downloaded with fixed characteristics that are not customizable. Table 4.2 details the structure and filenames of the ancillary NetCDF files that have been used for PECDv4.2 and that are available in the CDS under the widget "Weights and masks".
...
Filename | Variable | Grid | Description | Corresponding name in the widget "Weights and masks" |
---|---|---|---|---|
ANCI_CITY-coords_PECD4.2_fv1.csv | - | - | List of cities and their corresponding coordinates. See Section 2.5.1 for more details. | City coordinates |
ANCI_LAT-mask_PECD4.2_fv1.nc | lat_weights(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the cosine of the latitude for the corresponding grid cell. See Section 2.5.3 for more details. | Latitude weights |
ANCI_SZON-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude) level (region) | For each level (region in SZON), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell’s area that lies within the region. See Section 2.5.2 for more details. | SZON regions mask |
ANCI_SZOF-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude) level (region) | For each level (region in SZOF), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell’s area that lies within the region. See Section 2.5.2 for more details. | SZOF regions mask |
ANCI_PEON-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude) level (region) | For each level (region in PEON), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell’s area that lies within the region. See Section 2.5.2 for more details. | PEON regions mask |
ANCI_PEOF-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude) level (region) | For each level (region in PEOF), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell’s area that lies within the region. See Section 2.5.2 for more details. | PEOF regions mask |
ANCI_NUT0-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude) level (region) | For each level (region in NUT0), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell’s area that lies within the region. See Section 2.5.2 for more details. | NUTS 0 regions mask |
ANCI_NUT2-mask_PECD4.2_fv1.nc | mask(region, latitude, longitude) | PECD domain (latitude, longitude) level (region) | For each level (region in NUT2), every grid cell contains a floating point value between 0 and 1. A value of 0 indicates that the grid cell is outside the region, while a value of 1 means the cell is fully within the region. In other cases, the value represents the fraction of the grid cell’s area that lies within the region. See Section 2.5.2 for more details. | NUTS 2 regions mask |
ANCI_WPM-mask_PECD4.2_fv1.nc | m_rest(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains a boolean value: 1 indicates that the cell is unsuitable for potential future wind power installations, while 0 indicates that the cell could potentially be used as a site for such installations. See Section 2.8 for more details. | Wind power regions mask |
ANCI_PVM-mask_PECD4.2_fv1.nc | PVmask(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains a boolean value: 1 indicates that the cell is unsuitable for potential future solar photovoltaic power installations, while 0 indicates that the cell could potentially be used as a site for such installations. See Section 2.8 for more details. | Solar PV mask |
ANCI_ALP-coef_PECD4.2_fv1.nc | alpha(time, latitude, longitude) | PECD domain (latitude, longitude) levels (time) | For each level (time), every grid cell contains the power law's alpha coefficient. Each grid cell holds 12 × 24 = 288 alpha coefficients in total, one for each combination of month of the year and hour of the day. See Section 2.2.1 for more details. | Power law coefficients |
ANCI_POP-mask_PECD4.2_fv1.nc | population_mask(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the number of people living in that area. See Section 2.4.1 for more details. | Population density mask |
ANCI_WS10G2-mean_PECD4.2_fv1.nc | ws10(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the mean value of the 10 m wind speed from GWA2 computed over the period 2006-2018. This file is used in the bias adjustment procedure described in Section 2.3.3. | Climatology of GWA2 10 m wind speed |
ANCI_WS10E5-mean_PECD4.2_fv1.nc | ws10(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the mean value of the 10 m wind speed from ERA5 computed over the period 2006-2018. This file is used in the bias adjustment procedure described in Section 2.3.3. | Climatology of ERA5 10 m wind speed |
ANCI_WS100G2-mean_PECD4.2_fv1.nc | ws100(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the mean value of the 100 m wind speed from GWA2 computed over the period 2006-2018. This file is used in the bias adjustment procedure described in Section 2.3.3. | Climatology of GWA2 100 m wind speed |
ANCI_WS100E5-mean_PECD4.2_fv1.nc | ws100(latitude, longitude) | PECD domain (latitude, longitude) | Each grid cell contains the mean value of the 100 m wind speed from ERA5 computed over the period 2006-2018. This file is used in the bias adjustment procedure described in Section 2.3.3. | Climatology of ERA5 100 m wind speed |
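
As an illustration of how the fractional region masks listed above can be applied, the following minimal sketch computes a mask-weighted mean of a gridded field over a single region. It is not the aggregation procedure of Section 2.5.2: the function name `regional_mean` and the small 2 × 2 arrays are hypothetical, and the sketch assumes all grid cells have equal area, ignoring the dependence of cell area on latitude.

```python
def regional_mean(field, mask):
    """Mean of a 2-D field weighted by a fractional region mask.

    field and mask are nested lists indexed as [latitude][longitude].
    Mask values are expected to lie between 0 (cell outside the region)
    and 1 (cell fully inside the region).
    Assumption: all grid cells have the same area.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for field_row, mask_row in zip(field, mask):
        for value, weight in zip(field_row, mask_row):
            weighted_sum += weight * value
            weight_total += weight
    if weight_total == 0.0:
        raise ValueError("the region does not overlap the grid")
    return weighted_sum / weight_total


# Hypothetical 2 x 2 example: one cell fully inside the region,
# one cell half inside, and two cells outside.
field = [[10.0, 12.0], [14.0, 16.0]]
mask = [[1.0, 0.5], [0.0, 0.0]]
mean = regional_mean(field, mask)
print(mean)
```

With real data, the same weighting would be applied to one `region` level of the `mask` variable at a time, ideally combined with a latitude-dependent cell-area weight.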
Metadata
The header of each CSV file contains the following metadata descriptors. An example for the 2 m air temperature variable is presented below:
...
### The original data sources are ECMWF ERA5 Reanalysis (available at: https://cds.climate.copernicus.eu)
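
When reading these files programmatically, the metadata header has to be separated from the data records. The sketch below shows one possible approach using only the Python standard library. It assumes that every metadata line starts with `#`, as in the line shown above; the function name `split_header`, the column names and the data values in the sample are hypothetical and serve only to make the example self-contained.

```python
import csv
import io


def split_header(text, prefix="#"):
    """Split CSV text into metadata lines and parsed data rows.

    Assumption: metadata lines start with the given prefix and data
    rows are comma-separated.
    """
    metadata = []
    data_lines = []
    for line in text.splitlines():
        if line.startswith(prefix):
            # Drop the leading marker characters and surrounding spaces.
            metadata.append(line.lstrip(prefix).strip())
        elif line.strip():
            data_lines.append(line)
    rows = list(csv.reader(io.StringIO("\n".join(data_lines))))
    return metadata, rows


# Hypothetical file content: one metadata line taken from the example
# above, followed by an invented column header and data record.
sample = (
    "### The original data sources are ECMWF ERA5 Reanalysis "
    "(available at: https://cds.climate.copernicus.eu)\n"
    "Date,AA,BB\n"
    "2000-01-01 00:00:00,1.5,2.5\n"
)
metadata, rows = split_header(sample)
print(metadata[0])
print(rows[0])
```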
How to cite the data
Please refer to the "References" section on the catalogue entry page of this dataset in the Climate Data Store (CDS) as it provides the DOI number as well as details on dataset citation and attribution.
References
Beyer, H. G., Heilscher, G., and Bofinger, S.: "A robust model for the MPP performance of different types of PV-modules applied for the performance check of grid connected systems", EuroSun 2004 conference, pp. 3064-3071, Germany, June 2004.
...
This document has been produced in the context of the Copernicus Climate Change Service (C3S). The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose. The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which merely represents the author's view.