GloFAS verification activities are based on GloFAS fixed reporting points with additional quality criteria applied:
For reference, considering only the stations without larger reservoir or lake influence, 1532 stations were available for the v3.1 general hydrological model performance analysis. This number increased to 1987 stations for v4.0, with 1949 stations that could be used (mapped) for both v4.0 and v3.1. The number is lower when both models are considered because some stations could not be mapped in v3.1 due to the lower resolution of the river network (or other issues). The lower number of 1532 stations originally used for the v3.1 analysis was partly due to the longer minimum observation length of 4 years, which was relaxed to only 1 year for v4.0. Scores will not be robust for very short observation periods, and users should be aware of this, but the shorter minimum lets users see the model behaviour relative to the flood thresholds and get a first impression of the model performance. In addition, the numbers increased for v4.0 because extra stations were added to the GloFAS observation network since the v3.1 implementation in May 2021.
The station number of 1987, listed above, is lower than the 1995 stations used in the calibration (GloFAS v4 calibration methodology and parameters). The reason for this is that the calibration also considered good stations that had larger reservoir or lake influence.
The GloFAS v4 general hydrological model performance page shows three station networks. The largest set includes all stations that have at least the minimum of 1 year of sufficiently good quality river discharge observations, regardless of reservoir or lake influence; there are 2293 such stations. This station network was also used in the GloFAS hydrological model performance web product. The second network includes only the stations that were used in both the v3 and v4 calibrations and did not show large reservoir or lake influence; this list includes 996 stations. The third set contains stations that were not used in either the v4 or the v3 calibration, again without larger reservoir or lake influence; it includes 233 stations. Stations with larger reservoir and lake influence were omitted because these catchments are a lot more difficult to model and improvements can sometimes come for the wrong reasons, which could make comparisons between models, such as v4.0 vs v3.1, harder to interpret.
The GloFAS v4.0 hydrological model performance assessments (calibration, general performance and the evaluation web product) are based on the historical river discharge reanalysis, available at https://cds.climate.copernicus.eu/cdsapp#!/dataset/cems-glofas-historical?tab=overview.
The verification focused on the whole reanalysis period with all available river discharge observations over the period of 1979-2021. Both the general performance analysis (GloFAS v4 general hydrological model performance) and the GloFAS hydrological model performance layer in the map viewer (GloFAS hydrological model performance web product) used this period.
GloFAS hydrological performance verification is done against river discharge observations available to the GloFAS team. The hydrological model performance analysis was conducted based on the modified Kling–Gupta efficiency metric (KGE; ideal value is 1):
The three component scores of the KGE' were also used: the correlation, the bias ratio (β) and the variability ratio (γ).
The KGE' and its three components (bias ratio, variability ratio and correlation) were used in all three products: the GloFAS v4 calibration hydrological model performance, the GloFAS v4 general hydrological model performance and the GloFAS hydrological model performance web product. For the calibration model performance (GloFAS v4 calibration hydrological model performance), the original β and γ ratios were considered. However, in the general GloFAS v4.0 model evaluation (GloFAS v4 general hydrological model performance) and the evaluation web product (GloFAS hydrological model performance web product), besides the correlation (pcorr), β-1 (bias) and γ-1 (var) were used to represent the bias and variability errors. These were chosen instead of β and γ because their optimal value is 0 instead of 1 (with a range from -1 to infinity), so the sign shows directly whether the bias or variability error is negative or positive. In addition, the absolute values of bias (absbias) and var (absvar) were also used; these are useful in model comparison because they show the magnitude of these errors regardless of sign.
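The relationship between these scores can be illustrated with a minimal sketch in plain Python. It assumes the commonly used modified KGE form, KGE' = 1 - sqrt((r-1)^2 + (β-1)^2 + (γ-1)^2), where r is the linear correlation, β is the ratio of simulated to observed means and γ is the ratio of simulated to observed coefficients of variation. The function name kge_prime_scores is hypothetical and the dictionary keys simply mirror the score names used above; this is not the GloFAS implementation.

```python
import math

def kge_prime_scores(sim, obs):
    """Return KGE' and derived component scores for paired discharge series.

    Assumes the standard modified KGE definition (an assumption, see lead-in).
    Pairs with a missing (NaN) value are dropped. The observed and simulated
    means and standard deviations must be non-zero.
    """
    pairs = [(s, o) for s, o in zip(sim, obs)
             if not (math.isnan(s) or math.isnan(o))]
    n = len(pairs)
    mean_s = sum(s for s, _ in pairs) / n
    mean_o = sum(o for _, o in pairs) / n
    # Population standard deviations and covariance
    sd_s = math.sqrt(sum((s - mean_s) ** 2 for s, _ in pairs) / n)
    sd_o = math.sqrt(sum((o - mean_o) ** 2 for _, o in pairs) / n)
    cov = sum((s - mean_s) * (o - mean_o) for s, o in pairs) / n

    r = cov / (sd_s * sd_o)                          # correlation
    beta = mean_s / mean_o                           # bias ratio
    gamma = (sd_s / mean_s) / (sd_o / mean_o)        # variability ratio
    kge = 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)

    return {
        "kge": kge,
        "pcorr": r,
        "bias": beta - 1,        # 0 is optimal, sign gives direction
        "var": gamma - 1,        # 0 is optimal, sign gives direction
        "absbias": abs(beta - 1),
        "absvar": abs(gamma - 1),
    }

# Illustrative data: the simulation is exactly twice the observations,
# so pcorr is about 1, var about 0, bias about 1 and KGE' about 0.
obs_example = [1.0, 3.0, 2.0, 5.0, 4.0]
sim_example = [2 * o for o in obs_example]
scores = kge_prime_scores(sim_example, obs_example)
```

In this example the only error is a doubling of the mean flow, which shows up as a positive bias while var stays near 0, illustrating why the signed, zero-centred components are easier to read than the raw ratios.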
Finally, a specific index was used to measure timing errors (timing, in days; ideal value is 0), which shows the time delay between the simulated and observed river discharge time series (its absolute value, abstiming, was also used). Timing is the time lag (or shift) L that maximises the cross-correlation function Rxy(L) between the simulated (x) and observed (y) time series, where the simulated series is shifted by L days. A positive/negative timing error indicates delayed/advanced simulated river discharge. For example, a timing error of +5 means the simulation needs to be shifted 5 days backwards (brought earlier) to reach the highest correlation, i.e. the simulation is generally 5 days late in predicting the ups and downs in the flow time series. Although this is not directly equivalent to measuring the timing error of the flood peaks, it is closely related to it and can be used as a simple estimate.
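The lag search described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the GloFAS implementation: it uses the Pearson correlation over the overlapping part of the two series as a stand-in for the cross-correlation function, it assumes gap-free daily series, and the function names and the max_lag search window of 10 days are hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (NaN if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / math.sqrt(vx * vy)

def timing_error(sim, obs, max_lag=10):
    """Lag in days that maximises the correlation between sim and obs.

    For each lag L, sim[t + L] is paired with obs[t]. A positive result
    means the simulation is late (delayed) relative to the observations.
    """
    n = min(len(sim), len(obs))
    best_lag, best_r = 0, -math.inf
    for lag in range(-max_lag, max_lag + 1):
        if lag >= 0:
            x, y = sim[lag:n], obs[:n - lag]
        else:
            x, y = sim[:n + lag], obs[-lag:n]
        if len(x) < 2:
            continue
        r = pearson(x, y)
        if r > best_r:  # a NaN correlation never replaces the current best
            best_lag, best_r = lag, r
    return best_lag

# Illustrative data: two synthetic flow peaks, with the simulation
# delayed by 5 days relative to the observations.
obs_example = [math.exp(-((t - 30) / 4) ** 2) + 0.5 * math.exp(-((t - 70) / 6) ** 2)
               for t in range(100)]
sim_example = [obs_example[max(t - 5, 0)] for t in range(100)]
timing = timing_error(sim_example, obs_example)   # 5 (simulation is late)
abstiming = abs(timing)
```

Swapping the two arguments gives -5, which corresponds to a simulation that is 5 days early.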