How to cite the GloFAS-ERA5 reanalysis dataset and evaluation?
Full details on the GloFAS-ERA5 river discharge reanalysis dataset and its evaluation can be found in:
Harrigan, S., Zsoter, E., Alfieri, L., Prudhomme, C., Salamon, P., Wetterhall, F., Barnard, C., Cloke, H., and Pappenberger, F. (2019) GloFAS-ERA5 operational global river discharge reanalysis 1979-present, in prep.
An evaluation of the GloFAS discharge reanalysis forced by ERA5 is presented here. The GloFAS v2.1 discharge reanalysis was produced from 1979 and is updated within 5 days of real time at a daily time-step for the 0.1° (~11 km at the equator) gridded GloFAS river network, forced with the official ERA5 atmospheric reanalysis (Figure 1). The performance of the discharge reanalysis product was compared against a global network of discharge observations. River catchments with at least four years of available data between 1979 and 2018 and an upstream area of at least 500 km2 were selected for the analysis. This resulted in 1801 catchments draining areas ranging from 575 km2 to 4,664,200 km2 (data extracted from the GloFAS observation database managed by the EU Joint Research Centre (JRC) as of 25 January 2019). Care must be taken in interpreting the results as the observation network is sparse in many parts of the world, particularly in Africa and Asia.
Figure 1: Mean GloFAS v2.1 daily river discharge over 1979 to 2018 for each GloFAS river grid cell with an upstream area greater than 1000 km2. Darker blue river sections have larger river discharge.
The full details of the analysis are given in the accompanying publication, Harrigan et al. (2019), but a summary is provided here to aid interpretation of the figures below. Performance of the discharge reanalysis was assessed against observations (at the 1801 stations selected) using the modified Kling-Gupta Efficiency metric (KGEmod: Gupta et al., 2009; Kling et al., 2012). The KGEmod is a widely used performance metric in hydrology, and is analogous to a normalisation of the Mean Squared Error (MSE). It can be decomposed into three components that are important for assessing hydrological dynamics: temporal errors through correlation, bias errors, and variability errors. The hydrological simulation skill of the GloFAS-ERA5 reanalysis was computed by comparing the KGEmod for the reanalysis (with respect to observations) against the KGEmod from a simple mean flow benchmark (after Knoben et al., 2019); we refer to the result as the KGE Skill Score (KGESS). A value of KGESS = 0 means the GloFAS reanalysis is no better than the mean flow benchmark and so has no skill; KGESS > 0 means the reanalysis is considered skillful; and KGESS < 0 means performance is worse than the benchmark, i.e. the reanalysis has negative skill.
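The metric and skill score described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used for the evaluation. It assumes the KGEmod formulation of Kling et al. (2012), with the bias ratio taken as mean(sim)/mean(obs) and the variability ratio as the ratio of the coefficients of variation, and it assumes the mean flow benchmark scores 1 - sqrt(2) (about -0.41), the value reported by Knoben et al. (2019). All function names are hypothetical.

```python
import math


def kge_components(sim, obs):
    """Return (r, beta, gamma) for paired simulated and observed discharge.

    r     : Pearson correlation (temporal errors)
    beta  : bias ratio, mean(sim) / mean(obs)
    gamma : variability ratio, CV(sim) / CV(obs)
    """
    n = len(obs)
    mean_s = sum(sim) / n
    mean_o = sum(obs) / n
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)
    beta = mean_s / mean_o
    gamma = (sd_s / mean_s) / (sd_o / mean_o)
    return r, beta, gamma


def kge_mod(sim, obs):
    """Modified Kling-Gupta Efficiency; the optimum value is 1."""
    r, beta, gamma = kge_components(sim, obs)
    return 1.0 - math.sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2 + (gamma - 1.0) ** 2)


# Assumed score of the mean flow benchmark (after Knoben et al., 2019).
KGE_BENCHMARK = 1.0 - math.sqrt(2.0)


def kgess(kge_value, kge_benchmark=KGE_BENCHMARK):
    """Skill score: 1 is perfect, 0 matches the benchmark, < 0 is worse."""
    return (kge_value - kge_benchmark) / (1.0 - kge_benchmark)
```

For example, a simulation that doubles every observed value has r = 1, a bias ratio of 2, and a variability ratio of 1, giving KGEmod = 0 and a KGESS of about 0.29: still better than the mean flow benchmark despite the 100% bias.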
Results for overall performance show that the GloFAS discharge reanalysis is skillful in 86% of catchments (Figure 2). The global median KGESS is 0.51, with an interquartile range (IQR) of 0.2 to 0.66. Performance is best in Brazil (particularly the Amazon basin), central Europe, and eastern and western regions of the US. GloFAS reanalysis performance is poor (i.e. KGESS < 0) in many catchments in Africa and in the North American Great Plains extending into Mexico, with notable patches in north eastern Brazil, Thailand, and southern Spain.
Figure 2: Modified Kling-Gupta Efficiency Skill Score (KGESS) for GloFAS river discharge reanalysis against 1801 observation stations. Optimum value of KGESS is 1. Blue (red) dots show catchments with positive (negative) skill.
Decomposition into correlation, bias, and variability
An advantage of the KGE is that it can be decomposed into three constituent components, giving greater insight into which aspects of the GloFAS reanalysis are driving poor or good skill. The vast majority of catchments (99%) show positive correlation (Figure 3a) with a global median Pearson correlation coefficient of 0.61 (0.44, 0.74). Figure 3b shows that the discharge reanalysis is negatively biased in 64% of catchments (i.e. bias ratio < 1) with a global median bias (as percentage) of -16% (-38%, 21%). However, it is clear that the worst performing catchments (dark red dots in Figure 2) are predominantly driven by very large positive biases (dark blue dots in Figure 3b); in total 12% of catchments have positive biases of > 100% (i.e. bias ratio > 2). Figure 3c shows the variability of the reanalysis time-series is lower than that of the observation time-series in almost 60% of catchments (i.e. variability ratio < 1), but errors in variability are less severe than bias errors, with global median values (as percentage) of -9% (-31%, 15%).
Figure 3: Decomposition of the Modified Kling-Gupta Efficiency KGE’ into its three components, Pearson correlation (a), bias ratio (b), and variability ratio (c) for GloFAS v2.1 river discharge reanalysis against 1801 observation stations. The optimum value for all three components is 1. Catchments with positive (negative) values are shown by blue (red) dots.
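The bias and variability results above are quoted both as ratios and as percentages. The conversion implied by the text (a bias ratio of 2 corresponds to a bias of +100%) can be written as a one-line helper. This is a small sketch, and the function name is hypothetical.

```python
def ratio_to_percent(ratio):
    """Convert a bias or variability ratio (optimum 1) to a percentage error (optimum 0)."""
    return (ratio - 1.0) * 100.0


# A ratio below 1 gives a negative percentage (underestimation),
# a ratio above 1 gives a positive percentage (overestimation).
median_bias_percent = ratio_to_percent(0.84)  # approximately -16
large_bias_percent = ratio_to_percent(2.0)    # 100.0
```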
It is also important to look at the average magnitude of errors, as a small over- or underestimation in dry rivers can produce large percentage biases (and hence bias ratios). This was done by converting the units of both the reanalysis and observation time-series from m3 s-1 to runoff depth across the catchment area in mm d-1, to allow direct comparison between catchments of different sizes, and then computing the Mean Absolute Error (MAE) metric (Figure 4). Most areas with a percentage bias (PBIAS) > 100% (in Figure 3b), namely much of Africa, central US, and eastern Brazil, in fact have a low absolute magnitude of errors given their dry locations. Other notable areas with low absolute magnitude of errors include large parts of India, South East Asia, and Australia. There are, however, catchments on the western coast of South America, in Sudan and Ethiopia, and in tributaries of the River Ganges with a large MAE.
Figure 4: Mean Absolute Error (MAE) for GloFAS v2.1 reanalysis against 1801 observation stations. Units for both reanalysis and observations have been converted from m3 s-1 to runoff depth across the catchment area (mm d-1) to allow direct comparison of the magnitude of errors. The optimum value of MAE is 0; catchments with a larger magnitude of errors are shown as darker blue dots.
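The unit conversion and error metric described above can be sketched as follows. This is an illustration under stated assumptions rather than the evaluation code: the conversion uses 86,400 seconds per day, 10^6 m2 per km2, and 1000 mm per m, and the function names are hypothetical.

```python
def discharge_to_runoff_depth(q_m3s, area_km2):
    """Convert discharge in m3 s-1 to runoff depth in mm d-1 over the catchment."""
    seconds_per_day = 86400.0
    volume_m3_per_day = q_m3s * seconds_per_day
    area_m2 = area_km2 * 1.0e6
    depth_m_per_day = volume_m3_per_day / area_m2
    return depth_m_per_day * 1000.0  # metres to millimetres


def mean_absolute_error(sim, obs):
    """Mean Absolute Error between paired series; the optimum value is 0."""
    return sum(abs(s - o) for s, o in zip(sim, obs)) / len(obs)


def mae_runoff_depth(sim_m3s, obs_m3s, area_km2):
    """MAE in mm d-1 after converting both series to runoff depth."""
    sim_mm = [discharge_to_runoff_depth(q, area_km2) for q in sim_m3s]
    obs_mm = [discharge_to_runoff_depth(q, area_km2) for q in obs_m3s]
    return mean_absolute_error(sim_mm, obs_mm)
```

For a catchment of 8,640 km2, a discharge of 100 m3 s-1 corresponds to a runoff depth of 1 mm d-1, so a constant error of 50 m3 s-1 there gives an MAE of 0.5 mm d-1.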
Performance by month
Figure 5 shows the performance of the GloFAS reanalysis for each month for all 1801 catchments. Hydrological simulation skill is relatively consistent across months, with median KGESS ranging from 0.32 to 0.41 (Figure 5a). The April to October months have the highest skill, with January, February, March, November, and December having a higher proportion of catchments with negative skill. When the KGEmod is decomposed into correlation, bias, and variability components at the monthly scale, it is shown that the higher incidence of negative KGESS in those months is driven by more catchments with large positive biases (Figure 5c). Monthly correlation (Figure 5b) and variability error metrics (Figure 5d) are much more consistent than bias errors.
Figure 5: Performance metrics for each month. Modified Kling-Gupta Efficiency Skill Score (KGESS) (a) with decomposition of KGE’ into Pearson correlation (b), bias ratio (c), and variability ratio (d). Boxes represent the IQR and the horizontal grey line the median. Whiskers extend to the most extreme data point that is no more than 1.5 times the IQR from the box, and grey diamonds are outliers beyond this range.
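The box-and-whisker convention described in the caption can be reproduced with the Python standard library. This is a sketch only: the quartile interpolation method used for the figures is not stated, so the "inclusive" method of statistics.quantiles is an assumption, and boxplot_stats is a hypothetical helper.

```python
import statistics


def boxplot_stats(values):
    """Summarise values using the 1.5 x IQR whisker convention."""
    data = sorted(values)
    # Quartile method is an assumption; other methods shift q1 and q3 slightly.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme data points still inside the fences.
    inside = [v for v in data if lower_fence <= v <= upper_fence]
    outliers = [v for v in data if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": inside[0],
        "whisker_high": inside[-1],
        "outliers": outliers,
    }
```

Applied to the KGESS values of one month, the returned dictionary gives the box limits, the median line, the whisker ends, and the points that would be drawn as outliers.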
Performance by catchment area
The skill of GloFAS reanalysis by catchment area is shown in Figure 6 grouped into seven categories (i.e. 500-2,500 km2, 2,500-5,000 km2, ... , 500,000 km2 or larger). In general, skill is lowest for catchments in the three categories < 10,000 km2 with median KGESS = 0.21 (n=39), 0.4 (n=41), and 0.42 (n=53) for catchments smaller than 2,500 km2, 5,000 km2 and 10,000 km2 respectively. Performance improves as catchment size increases, with median KGESS = 0.56 for catchments > 50,000 km2. It must be noted that results will be biased by uneven samples of catchment sizes available within our observations database. Catchments between 10,000 and 50,000 km2 are dominant (n=1013) and there is an underrepresentation of smaller catchments.
Figure 6: Modified Kling-Gupta Efficiency Skill Score (KGESS) grouped by seven catchment area categories. Boxes represent the IQR and the horizontal grey line the median. Whiskers extend to the most extreme data point that is no more than 1.5 times the IQR from the box, and grey diamonds are outliers beyond this range.
Gupta, H. V., H. Kling, K. K. Yilmaz, and G. F. Martinez, 2009: Decomposition of the mean squared error and NSE performance criteria: Implications for improving hydrological modelling. Journal of Hydrology, 377, 80–91, doi:10.1016/j.jhydrol.2009.08.003.
Harrigan, S., Zsoter, E., Alfieri, L., Prudhomme, C., Salamon, P., Wetterhall, F., Barnard, C., Cloke, H., and Pappenberger, F. (2019) GloFAS v2.1 operational global river discharge reanalysis 1979-present, in prep.
Kling, H., M. Fuchs, and M. Paulin, 2012: Runoff conditions in the upper Danube basin under an ensemble of climate change scenarios. Journal of Hydrology, 424–425, 264–277, doi:10.1016/j.jhydrol.2012.01.011.
Knoben, W. J. M., Freer, J. E., and Woods, R. A.: Technical note: Inherent benchmark or not? Comparing Nash–Sutcliffe and Kling–Gupta efficiency scores, Hydrol. Earth Syst. Sci., 23, 4323–4331, https://doi.org/10.5194/hess-23-4323-2019, 2019.