Last modified on Feb 18, 2026 16:52

Hydropower energy conversion modelling in this framework is implemented through two complementary approaches: a statistical machine learning model for Europe and a global proxy indicator model. The European model uses detailed, high-frequency historical generation data to train a Random Forest (RF) regression model at country level, while the global approach applies a precipitation-based proxy known as Installed capacity Weighted Precipitation (IWP), using plant location and installed capacity to derive an indicator proportional to hydropower generation. Both approaches provide country-level (ADM0) outputs, delivered as CSV datasets.

Random Forest Regression Model (European Domain)

The statistical model for the European domain is a Random Forest (RF) regression model (Pedregosa et al., 2011), an ensemble-learning method that was already shown to work well at this resolution and over a similarly broad domain in a previous study by Ho et al. (2020).

It is trained on hourly or sub-hourly generation data from the ENTSO-E Transparency Platform. This model targets three hydropower indicators:

Inflow to reservoirs (HRI)
Generation from reservoirs (HRG)
Generation from run-of-river and pondage (HRO)

These indicators are all measured in MWh.

Input Data

Observed hydropower data from the ENTSO-E Transparency Platform: generation (15-, 30- or 60-minute resolution), annual installed capacity (IC), and weekly reservoir filling rates (FR).
Climate variables: 2 m temperature (TA) and total precipitation (TP), both aggregated weekly and spatially at country level (ADM0).

Pre-processing

Generation data is aggregated to weekly values if at least 80% of hourly values are present; otherwise, the week is discarded.
Temperature is averaged and precipitation summed over a set of time lags (up to 15 and 30 weeks, respectively), to reflect the delayed influence on hydropower systems.
Initial data checks and corrections are applied to eliminate unphysical spikes, e.g., early artefacts in FR series.
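
The two pre-processing rules above can be illustrated with a minimal sketch. It is not the framework's implementation: the function names are hypothetical, the input is assumed to be hourly, the available values are summed without gap filling or rescaling, and each lag window is assumed to include the current week. None of these choices is specified in the text.

```python
from statistics import fmean

HOURS_PER_WEEK = 168
MIN_COVERAGE = 0.8  # completeness threshold stated in the text


def weekly_generation(hourly_values):
    """Sum one week of hourly generation (MWh); None marks a missing hour.

    Returns None (week discarded) when fewer than 80% of the hourly
    values are present. Summing without gap filling is an assumption.
    """
    present = [v for v in hourly_values if v is not None]
    if len(present) < MIN_COVERAGE * HOURS_PER_WEEK:
        return None
    return sum(present)


def lagged_features(weekly_tp, weekly_ta, week_index, tp_lag, ta_lag):
    """Return (precipitation sum, temperature mean) over the last
    tp_lag / ta_lag weeks, ending at and including week_index."""
    if min(tp_lag, ta_lag) < 1 or week_index + 1 < max(tp_lag, ta_lag):
        raise ValueError("not enough preceding weeks for the requested lag")
    tp_window = weekly_tp[week_index - tp_lag + 1: week_index + 1]
    ta_window = weekly_ta[week_index - ta_lag + 1: week_index + 1]
    return sum(tp_window), fmean(ta_window)
```

In practice one such feature would be built for each lag in the set considered (up to 15 weeks for temperature and 30 weeks for precipitation).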

Inflow Computation

Inflow to reservoirs is estimated using the weekly generation and change in reservoir filling rate:

\[ \text{Inflow}(w) = \frac{\text{GEN_OUT}(w)}{\eta_p} + \left[\text{FR}(w) - \text{FR}(w-1)\right] \]

Where:

\( \text{Inflow}(w) \) : estimated energy inflow to the reservoir during week w
\( \text{GEN_OUT}(w) \) : generation output during week w
\( \eta_p \) : production efficiency (dimensionless)
\( \text{FR}(w) \) : filling rate of the reservoir at week w (in MWh)
\( \text{FR}(w-1) \) : filling rate of the reservoir at week w-1 (in MWh)

Negative inflow values (e.g., during dry periods) are set to zero, following ENTSO-E recommendations.
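
As an illustration, the inflow formula and the zero-clipping rule can be sketched as follows. The function name and the example values are hypothetical; all series are assumed to be weekly and in MWh.

```python
def weekly_inflow(gen_out, fr, efficiency):
    """Estimate weekly inflow (MWh) from generation and filling rate.

    gen_out[w] and fr[w] are weekly values in MWh; efficiency is the
    dimensionless production efficiency. The first week is skipped
    because it has no preceding filling rate. Negative results are
    set to zero.
    """
    if len(gen_out) != len(fr):
        raise ValueError("gen_out and fr must have the same length")
    if efficiency <= 0:
        raise ValueError("efficiency must be positive")
    inflow = []
    for w in range(1, len(fr)):
        value = gen_out[w] / efficiency + (fr[w] - fr[w - 1])
        inflow.append(max(value, 0.0))
    return inflow


# Hypothetical example: the second week would be negative and is clipped.
print(weekly_inflow([0.0, 50.0, 25.0], [1000.0, 1050.0, 900.0], 0.5))
```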

Model Training, Performance and Tuning

Training

The RF model is trained separately for each country and indicator. Parameters such as the number of trees (n_estimators), maximum depth (max_depth), and minimum samples per leaf (min_samples_leaf) are optimised using Latin Hypercube Sampling across 1000 combinations. Validation is performed using a Leave-One-Year-Out (LOYO) strategy (see Figure 1.1).
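
The sampling and validation scheme can be sketched with the standard library alone. This is a simplified illustration, not the code used to produce the dataset: the function names and parameter bounds are hypothetical, and the fitting of the RF model itself is left out.

```python
import random


def latin_hypercube(n_samples, bounds, seed=0):
    """Draw n_samples parameter sets by Latin Hypercube Sampling.

    Each parameter range is split into n_samples equal strata and every
    stratum is used exactly once per parameter. bounds maps a parameter
    name to a (low, high) tuple. Values are continuous; integer
    hyperparameters would still need rounding.
    """
    rng = random.Random(seed)
    columns = {}
    for name, (low, high) in bounds.items():
        strata = list(range(n_samples))
        rng.shuffle(strata)
        columns[name] = [
            low + (s + rng.random()) / n_samples * (high - low) for s in strata
        ]
    return [{name: columns[name][i] for name in bounds} for i in range(n_samples)]


def leave_one_year_out(years):
    """Yield (held_out_year, train_indices, test_indices) per distinct year."""
    for held_out in sorted(set(years)):
        train = [i for i, y in enumerate(years) if y != held_out]
        test = [i for i, y in enumerate(years) if y == held_out]
        yield held_out, train, test
```

Each sampled parameter set would be scored by training on the `train` indices and predicting the `test` indices of every fold, so that each year is predicted by a model that never saw it.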

Figure 1.1: Example of inflow to reservoirs timeseries estimated (or predicted) using the LOYO procedure with a random forest regression model (red line), against the observations (grey line) - for France.

Performance

Performance is primarily evaluated using the Nash-Sutcliffe Efficiency (NSE), a widely used metric in hydrology to assess the skill of time series predictions:

\[ \text{NSE} = 1 - \frac{\sum_{i=1}^{n} (x_m^i - x_o^i)^2}{\sum_{i=1}^{n} (x_o^i - \bar{x}_o)^2} \]

Where:

\( x^i_m \) : modelled weekly value
\( x^i_o \) : observed weekly value
\( \bar{x}_o \) : mean of observations.

An NSE value of 1 corresponds to a perfect match with observations, while 0 indicates that the model performs no better than the mean of the observed values.
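
A direct transcription of the NSE formula is sketched below (the function name is hypothetical). It reproduces the two reference cases just described: a perfect prediction gives 1, and predicting the observed mean everywhere gives 0.

```python
def nash_sutcliffe_efficiency(modelled, observed):
    """NSE = 1 - sum((x_m - x_o)^2) / sum((x_o - mean(x_o))^2)."""
    if len(modelled) != len(observed) or not observed:
        raise ValueError("series must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    squared_error = sum((m - o) ** 2 for m, o in zip(modelled, observed))
    variance_term = sum((o - mean_obs) ** 2 for o in observed)
    if variance_term == 0:
        raise ValueError("NSE is undefined for constant observations")
    return 1 - squared_error / variance_term


print(nash_sutcliffe_efficiency([1, 2, 3], [1, 2, 3]))  # perfect match
print(nash_sutcliffe_efficiency([2, 2, 2], [1, 2, 3]))  # mean of observations
```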

See Figure 1.2 for some results of the NSE metric for each of the modelled indicators.

Overall, the results are promising, especially when considering the limited input information (only temperature and precipitation) and the coarse country-level resolution:

Inflow to reservoirs (HRI): This indicator shows strong performance, with NSE values typically between 0.5 and 0.8 where reservoir filling data are available. The inclusion of FR data helps reduce the effect of human intervention, making the signal more predictable from climate inputs.
Run-of-river generation (HRO): The model performs consistently well across most countries, benefiting from the seasonal and climate-driven nature of this indicator. High NSE values are observed in countries with dense and stable run-of-river infrastructure.
Reservoir generation (HRG): This indicator shows more variable results, especially in countries with abrupt changes in installed capacity or inconsistent reporting. In such cases, the model struggles to capture non-climatic influences.

As guidance:

NSE > 0.5: good performance and suitable for climate impact assessment.
0.2 < NSE < 0.5: moderate reliability; usable with reservations.
NSE < 0.2 or unavailable: low confidence; use only for exploratory purposes.

The model is not designed to extrapolate beyond the range of the training data, limiting its ability to capture extreme values.

Figure 1.2: Maps of validation results obtained in terms of NSE over the period 2015-2022.
The three panels each refer to a different indicator: inflow to reservoirs (HRI), generation from reservoirs (HRG) and generation from run-of-river and pondage (HRO). The countries with no data are hatched.

Tuning

After validation, the RF model is retrained using all available years of hydropower generation data from the ENTSO-E Transparency Platform, applying the optimal parameter sets identified during the hyperparameter tuning phase. These trained models are then used to reconstruct historical hydropower indicators back to 1950 using ERA5 climate data and to project them forward to 2100 using climate input from CMIP6 models.

Output Data

Table 1.1 lists the European countries for which hydropower indicators were successfully produced. Countries with fewer than two years of data for a given indicator could not be properly validated and were therefore excluded.

The final output consists of CSV files containing time series of the modelled hydropower indicators at country level (ADM0), available at multiple temporal resolutions—weekly, monthly, seasonal, and annual—following the Temporal Aggregation Procedure.

Please note that the European hydropower indicators (HRG, HRI, HRO) are generated at weekly resolution. To derive consistent monthly, seasonal, and annual statistics, the weekly values are first converted to daily values by distributing them evenly across the seven days of each week (assuming constant production within the week) and then aggregated using standard temporal operations. The intermediate daily values are not delivered, as they add no information beyond the original weekly dataset.
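
The weekly-to-monthly conversion can be sketched as follows. The function name is hypothetical, and two details are assumptions not stated in the text: each week is identified by its first day, and the monthly aggregate is taken as a sum of daily values.

```python
from collections import defaultdict
from datetime import date, timedelta


def weekly_to_monthly(weekly):
    """Aggregate weekly totals (MWh) to monthly totals.

    weekly maps the first day of each week to that week's total. Every
    week is spread evenly over its seven days, and the daily values are
    then summed per (year, month).
    """
    monthly = defaultdict(float)
    for week_start, total in weekly.items():
        daily_value = total / 7
        for offset in range(7):
            day = week_start + timedelta(days=offset)
            monthly[(day.year, day.month)] += daily_value
    return dict(monthly)


# A week starting on 27 January 2020 has 5 days in January, 2 in February.
print(weekly_to_monthly({date(2020, 1, 27): 70.0}))
```

The daily step matters only for weeks that straddle a month boundary, where it splits the weekly total in proportion to the number of days falling in each month.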

Table 1.1: List of European countries for which the hydropower indicators have been produced using the RF model. Please refer to Table 3.1 for the correspondences between ISO codes and full country names.

HP Indicator	Countries available
Generation from reservoirs (HRG)	AT, BA (low NSE), BG, CH, CZ (low NSE), DE, ES, FR, HR, HU (low NSE), IT, ME (low NSE), NO, PL, PT, RO, RS (low NSE), SE, SK
Inflow to reservoirs (HRI)	AT, BG, CH, ES, FR, HR, IT, ME, NO, PT, RO, RS, SE
Generation from run-of-river and pondage (HRO)	AT, BE, BG, CH, CZ (low NSE), DE, ES, FI, FR, HR (low NSE), HU (low NSE), IE, IT, LT, LV, MK (low NSE - data full of gaps), NO, PL, PT, RO, RS, SI, SK, GB

Installed Capacity Weighted Precipitation (IWP) Proxy (Global Domain)

In regions where detailed hydropower generation data are lacking, a proxy-based method, Installed capacity Weighted Precipitation (IWP), is used to mimic hydropower generation variability. This approach provides informative long-term time series for all countries, including those in Europe.

Input Data

Monthly precipitation data, aggregated at sub-country level (NUTS2 for European countries, ADM1 for the rest).
Installed Capacity (IC) data from the Global Energy Monitor (GEM).
Monthly hydropower generation data from EMBER for validation.

Methodology

Regional Capacity Aggregation:
Hydropower plants (HPPs) are assigned to their respective NUTS2/ADM1 regions using GEM data. For each region, the total installed capacity is computed. These regional capacities are then used to calculate each region’s weight in the country’s overall IC.
See Figure 2.1 for an example of regional capacity distribution.
Precipitation Processing:
Precipitation data is:
- Aggregated at NUTS2/ADM1 level
- Summed monthly
- Cumulated over a country-specific number of preceding months (n = 1–12), depending on the country's hydropower response characteristics
Calculation of the Installed Capacity Weighted Precipitation:
The cumulated precipitation for each region is multiplied by the regional IC weight. The country-level (ADM0) IWP value is then obtained as the sum of these contributions, normalised by the country’s total installed capacity.
This yields a time series of monthly proxy estimates for hydropower potential in each country.
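
The steps above can be sketched as a capacity-weighted mean of regional cumulated precipitation. The function names are hypothetical. The sketch assumes that the n-month window includes the current month, and it reads "regional IC weight" as the regional capacity divided by the country total, so the normalisation is applied once.

```python
def cumulate(monthly_precip, n):
    """Rolling sum over the current and the n-1 preceding months.

    Returns None for the first n-1 months, where the window is incomplete.
    """
    return [
        sum(monthly_precip[i - n + 1: i + 1]) if i >= n - 1 else None
        for i in range(len(monthly_precip))
    ]


def iwp(cumulated_precip, installed_capacity):
    """Country-level IWP for one month.

    cumulated_precip maps region ID -> cumulated precipitation (mm);
    installed_capacity maps region ID -> installed capacity (MW).
    """
    total_capacity = sum(installed_capacity.values())
    if total_capacity <= 0:
        raise ValueError("country has no installed capacity")
    weighted = sum(
        capacity * cumulated_precip[region]
        for region, capacity in installed_capacity.items()
    )
    return weighted / total_capacity


# Hypothetical country with two regions.
print(iwp({"A": 100.0, "B": 200.0}, {"A": 300.0, "B": 100.0}))
```

Regions without installed capacity receive zero weight, so their precipitation does not influence the country value.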

Tuning and Validation

For countries with available observed generation data:

The IWP time series is compared against actual or modelled generation (e.g., Random Forest model output).
The IWP lag (n) is optimised by testing values from 1 to 12 months and selecting the one that maximises correlation or NSE.

Figure 2.2 shows the comparison between IWP and the RF model for Austria.
Figure 2.3 displays the optimal lag (number of months of cumulated precipitation) used in each country.

For countries without generation data, a default lag of 3 months is applied—this being the most common optimal value in evaluated countries.
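
The lag selection can be sketched as a simple search over candidate values of n. The example below uses the Pearson correlation as the score (the text allows either correlation or NSE) and falls back to the 3-month default when no generation series is available. The function names and the minimum of three paired values are illustrative assumptions.

```python
from math import sqrt

DEFAULT_LAG = 3  # default stated in the text for countries without data


def pearson(x, y):
    """Pearson correlation; None if either series has zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return None
    return cov / sqrt(var_x * var_y)


def best_lag(iwp_by_lag, generation):
    """Pick the lag n whose IWP series correlates best with generation.

    iwp_by_lag maps n -> monthly IWP values aligned with generation,
    with None where the n-month window is incomplete.
    """
    if generation is None:
        return DEFAULT_LAG
    best_n, best_r = DEFAULT_LAG, float("-inf")
    for n in sorted(iwp_by_lag):
        pairs = [
            (p, g) for p, g in zip(iwp_by_lag[n], generation)
            if p is not None and g is not None
        ]
        if len(pairs) < 3:
            continue  # too few points for a meaningful correlation
        r = pearson([p for p, _ in pairs], [g for _, g in pairs])
        if r is not None and r > best_r:
            best_n, best_r = n, r
    return best_n
```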

IWP values are also compared with observed hydropower capacity factors from the EMBER dataset for selected countries (run-of-river and reservoir technologies combined).
See Figure 2.4 for example comparisons with EMBER data for China, Chile, and Australia.

Figure 2.1: Aggregated hydropower Installed Capacity for the ADM1 regions of the globe. The darker the region, the higher the influence of its precipitation on the country's IWP. Grey regions are regions with no hydropower installed capacity according to GEM.

Figure 2.2: Comparison between RF-estimated historical series of hydropower generation (sum of reservoirs and run-of-river and pondage contributions, in red) and the IWP series (in blue) for Austria (AT). The values are normalised, so they range from 0 to 1. The first panel shows the monthly time series for the two different approaches (taking a reduced time window for visibility purposes), while the second panel shows the annual mean values (mean over all months) for the entire simulated period (1950-2024).

Figure 2.3: Number of months over which precipitation is cumulated for the calculation of the IWP hydropower proxy. Note: only countries with available generation time series are shown; for the others, a default value of 3 months is applied.

Figure 2.4: Comparison between EMBER observed hydropower capacity factors (HP CF) data (run-of-river and reservoir technologies together, in red) and the IWP proxy (in blue) for China (CN, first panel), Chile (CL, second panel), and Australia (AU, third panel). Data is normalised with min-max scaling.

Results and Output Data

IWP shows good alignment with RF results in Europe and captures interannual variability.
Performance is generally satisfactory across most regions, though lower for countries with sparse or highly localised HPP distributions (e.g., Australia).
Countries with no GEM-recorded hydropower plants are excluded from the IWP analysis.
Figure 2.5 shows the global map of mean annual IWP over 1991-2020, with excluded countries marked in white and diagonal hatching.
For completeness, the countries for which the IWP proxy is provided are listed here:
AE, AF, AL, AM, AO, AR, AT, AU, AZ, BA, BD, BE, BG, BO, BR, BT, CA, CD, CG, CH, CI, CL, CM, CN, CO, CR, CZ, DE, DO, EC, EG, ES, ET, FI, FJ, FR, GA, GB, GE, GH, GN, GQ, GR, GT, HN, HR, ID, IE, IL, IN, IQ, IR, IS, IT, JP, KE, KG, KH, KP, KR, KZ, LA, LB, LK, LR, LT, LU, LV, MA, ME, MG, MK, ML, MM, MW, MX, MY, MZ, NA, NE, NG, NO, NP, NZ, PA, PE, PG, PH, PK, PL, PT, PY, RO, RS, RU, RW, SD, SE, SI, SK, SN, SR, SV, SY, TH, TJ, TR, TW, TZ, UA, UG, US, UY, UZ, VE, VN, ZA, ZM, ZW. Please refer to Table 3.1 for the correspondences between ISO codes and full country names.

The final output consists of CSV files containing time series of the IWP indicator at country level (ADM0), available at multiple temporal resolutions—monthly, seasonal, and annual—following the Temporal Aggregation Procedure.

Please note that although IWP is computed on a monthly basis, the number of months over which precipitation is accumulated varies by country, based on assumptions about local storage capacity. As a result, monthly values are expressed in mm per n months, and seasonal and annual aggregations are calculated as arithmetic means (not sums) to preserve unit consistency. Daily values are not provided, since disaggregating n-month accumulations to a daily scale would be artificial. Please refer to Table 2.1 for the value of n (number of months) used for each country.

Figure 2.5: Mean annual IWP map (1991–2020). Countries with no installed capacity data in the Global Energy Monitor, and hence no IWP, are shown in white with diagonal hatching.

Table 2.1: Correspondences between ISO codes and number of months (n) used as lag for each country in the IWP model. Countries for which n couldn't be optimised (not displayed in the Table) were assigned a standard n of 3.

ISO code	n
AR	10
AT	2
AU	9
BA	3
BD	4
BE	4
BG	7
BO	4
BR	3
CA	9
CH	3
CL	8
CN	3
CO	4
CR	3
CZ	12
DE	2
EC	2
EG	9
ES	5
FI	12
FR	5
GB	3
GR	2
HR	5
IE	5
IN	2
IR	9
IT	3
JP	2
KE	12
KR	2
LT	10
LU	3
LV	9
ME	2
MK	1
MX	1
NG	8
NO	4
NZ	3
PE	3
PH	6
PK	5
PL	11
PT	3
RO	2
RS	2
RU	12
SE	9
SI	1
SK	12
SV	1
TH	12
TR	7
TW	1
UA	2
US	7
UY	4
VN	4
ZA	1
IS	5

Appendix

Table 3.1: Correspondences between ISO codes and full country names.

ISO code (alpha-2)	Full country name
AE	United Arab Emirates
AF	Afghanistan
AL	Albania
AM	Armenia
AO	Angola
AR	Argentina
AT	Austria
AU	Australia
AZ	Azerbaijan
BA	Bosnia and Herzegovina
BD	Bangladesh
BE	Belgium
BG	Bulgaria
BO	Bolivia
BR	Brazil
BT	Bhutan
CA	Canada
CD	Democratic Republic of the Congo
CG	Congo
CH	Switzerland
CI	Côte d'Ivoire
CL	Chile
CM	Cameroon
CN	China
CO	Colombia
CR	Costa Rica
CZ	Czech Republic
DE	Germany
DO	Dominican Republic
EC	Ecuador
EG	Egypt
ES	Spain
ET	Ethiopia
FI	Finland
FJ	Fiji
FR	France
GA	Gabon
GB	United Kingdom
GE	Georgia
GH	Ghana
GN	Guinea
GQ	Equatorial Guinea
GR	Greece
GT	Guatemala
HN	Honduras
HR	Croatia
HU	Hungary
ID	Indonesia
IE	Ireland
IL	Israel
IN	India
IQ	Iraq
IR	Iran
IS	Iceland
IT	Italy
JP	Japan
KE	Kenya
KG	Kyrgyzstan
KH	Cambodia
KP	North Korea
KR	South Korea
KZ	Kazakhstan
LA	Laos
LB	Lebanon
LK	Sri Lanka
LR	Liberia
LT	Lithuania
LU	Luxembourg
LV	Latvia
MA	Morocco
ME	Montenegro
MG	Madagascar
MK	North Macedonia
ML	Mali
MM	Myanmar
MW	Malawi
MX	Mexico
MY	Malaysia
MZ	Mozambique
NA	Namibia
NE	Niger
NG	Nigeria
NO	Norway
NP	Nepal
NZ	New Zealand
PA	Panama
PE	Peru
PG	Papua New Guinea
PH	Philippines
PK	Pakistan
PL	Poland
PT	Portugal
PY	Paraguay
RO	Romania
RS	Serbia
RU	Russia
RW	Rwanda
SD	Sudan
SE	Sweden
SI	Slovenia
SK	Slovakia
SN	Senegal
SR	Suriname
SV	El Salvador
SY	Syrian Arab Republic
TH	Thailand
TJ	Tajikistan
TR	Turkey
TW	Taiwan
TZ	Tanzania
UA	Ukraine
UG	Uganda
US	United States of America
UY	Uruguay
UZ	Uzbekistan
VE	Venezuela
VN	Vietnam
ZA	South Africa
ZM	Zambia
ZW	Zimbabwe

References

For the references, please refer to the References section in the Product User Guide.

_{This document has been produced in the context of the Copernicus Climate Change Service (C3S).}

_{The activities leading to these results have been contracted by the European Centre for Medium-Range Weather Forecasts, operator of C3S on behalf of the European Union (Delegation Agreement signed on 11/11/2014 and Contribution Agreement signed on 22/07/2021). All information in this document is provided "as is" and no guarantee or warranty is given that the information is fit for any particular purpose.}

_{The users thereof use the information at their sole risk and liability. For the avoidance of all doubt, the European Commission and the European Centre for Medium-Range Weather Forecasts have no liability in respect of this document, which is merely representing the author's view.}
