This document describes the metrics required to fulfil C3S_34a Lot 1 Milestone 6.4.2 - "Metrics available", as discussed at the meeting on 12/06/2019 attended by Matt Pryor (CEDA), Andras Horanyi and Angel Lopez Alos (ECMWF).

Key Performance Indicators (KPIs)

It was agreed that as a baseline, the deployed metrics must be able to report on the KPIs for the project. Targets are omitted as they are not relevant for metrics collection.

  • KPI.S1 Availability of project ESGF nodes (Data and Index nodes)
  • KPI.S2 Availability of project portal
  • KPI.S3 Availability of Compute Node
  • KPI.S4 Data Node download/OpenDap usage, number of downloads/file accesses
  • KPI.S5 CP4CDS Search capability, number of datasets available through ESGF Search
  • KPI.S6 Compute capability, number of CPUs and memory available for CP4CDS processing
  • KPI.S7 Processes ported into Compute Node, number of WPS processes available
  • KPI.S8 User uptake, number of users
  • KPI.S9 Evolution of downloads / Volume downloaded
  • KPI.S10 Download performance
  • KPI.S11 Search performance

However, some of these KPIs are point-in-time values taken when project reports are due, rather than metrics that require ongoing collection. Some are also ambiguous and require clarification. We discuss each KPI in turn here.

KPI.S1 Availability of project ESGF nodes

The system needs to be probed for availability at regular intervals. For the index node, "available" is defined as "system is up and able to successfully respond to a representative search query". For the data node, "available" is defined as "system is up and able to successfully serve an authenticated data download". It was agreed that testing a small sample of datasets on each probe would be sufficient for this metric, as testing all datasets on each probe is not feasible. The sample could be random, and so differ between probes to give more coverage over time. Ideally, availability would be reported per site and for the system as a whole.
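As an illustration only, the sampling and reporting described above could be sketched as follows. The function names, the default sample size of 3, and the rule that the system counts as available at a probe only when every site is available are all assumptions made for this sketch; none of them were agreed at the meeting. The `check` callable stands in for whatever performs the real search query or authenticated download.

```python
import random
from typing import Callable, Dict, List, Optional


def probe_site(
    datasets: List[str],
    check: Callable[[str], bool],
    sample_size: int = 3,  # hypothetical default, not an agreed value
    rng: Optional[random.Random] = None,
) -> bool:
    """Return True if every dataset in a random sample passes `check`."""
    if not datasets:
        return False  # nothing to test, so treat the site as unavailable
    rng = rng or random.Random()
    # A fresh random sample on each probe widens coverage over time.
    sample = rng.sample(datasets, min(sample_size, len(datasets)))
    return all(check(dataset) for dataset in sample)


def availability(results: List[bool]) -> float:
    """Fraction of probes that succeeded (0.0 if there were no probes)."""
    return sum(results) / len(results) if results else 0.0


def summarise(history: Dict[str, List[bool]]) -> Dict[str, float]:
    """Per-site availability plus a figure for the system as a whole.

    Assumption: the system is available at a given probe only if
    every site was available at that probe.
    """
    report = {site: availability(results) for site, results in history.items()}
    system = [all(outcomes) for outcomes in zip(*history.values())]
    report["system"] = availability(system)
    return report
```

For example, with four probes per site, `summarise({"site_a": [True, True, False, True], "site_b": [True, False, False, True]})` gives 0.75 for `site_a`, 0.5 for `site_b` and 0.5 for the system under the assumption above.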

KPI.S2 Availability of project portal

It was agreed that this is the same as KPI.S1, so it is formally merged with that KPI.

KPI.S3 Availability of Compute Node

Similar to KPI.S1, the system needs to be probed for availability at regular intervals. For the compute node, "available" is defined as "system is up and available to receive and process a job". As for KPI.S1, ideally availability would be reported per-site and for the system as a whole.
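One possible way to test "able to receive and process a job" is to submit a trivial job and poll until it finishes or a timeout expires. The sketch below assumes hypothetical `submit` and `status` callables and the state names "succeeded" and "failed"; the real compute node interface and its status values may differ.

```python
import time
from typing import Callable


def probe_compute_node(
    submit: Callable[[], str],      # hypothetical: submits a trivial job, returns its id
    status: Callable[[str], str],   # hypothetical: returns the job state for an id
    timeout_s: float = 60.0,        # illustrative value, not an agreed threshold
    poll_interval_s: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Return True if a trivial job is accepted and completes within the timeout."""
    try:
        job_id = submit()
        deadline = clock() + timeout_s
        while clock() < deadline:
            state = status(job_id)
            if state == "succeeded":
                return True
            if state == "failed":
                return False
            sleep(poll_interval_s)
    except Exception:
        # Any error while submitting or polling counts as "not available".
        return False
    return False  # timed out before the job finished
```

The `sleep` and `clock` parameters exist only so the probe can be exercised without waiting in real time; a deployed probe would leave them at their defaults.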

KPI.S4 Data Node usage

This is monitored by the CDS. It was agreed that this does not need to be collected to fulfil the milestone.

KPI.S5 CP4CDS Search capability

It was agreed that a static figure at the time of reporting is sufficient to fulfil the milestone.

KPI.S6 Compute capability

It was agreed that a static figure at the time of reporting is sufficient to fulfil the milestone.

KPI.S7 Processes ported into Compute Node

It was agreed that providing a list of available WPS processes at the time of reporting is sufficient to fulfil the milestone. ECMWF requested to be notified whenever a new process is added.
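As a rough sketch of how the process list and the "new process" notification might be derived, the following parses a WPS capabilities document and compares it with a previously recorded list. It assumes the compute node returns a WPS 1.0.0 GetCapabilities response; the sample document and process names are invented for illustration, and how the notification is actually delivered to ECMWF is left open.

```python
import xml.etree.ElementTree as ET
from typing import Iterable, List

# Namespaces used by WPS 1.0.0 capabilities documents (assumed version).
NS = {
    "wps": "http://www.opengis.net/wps/1.0.0",
    "ows": "http://www.opengis.net/ows/1.1",
}


def list_processes(capabilities_xml: str) -> List[str]:
    """Return the sorted identifiers of all processes in a capabilities document."""
    root = ET.fromstring(capabilities_xml)
    identifiers = root.findall(".//wps:Process/ows:Identifier", NS)
    return sorted(el.text.strip() for el in identifiers if el.text)


def new_processes(previous: Iterable[str], current: Iterable[str]) -> List[str]:
    """Return processes present now that were not in the previous list."""
    return sorted(set(current) - set(previous))


# Invented sample document with two hypothetical processes.
SAMPLE = """<wps:Capabilities
    xmlns:wps="http://www.opengis.net/wps/1.0.0"
    xmlns:ows="http://www.opengis.net/ows/1.1">
  <wps:ProcessOfferings>
    <wps:Process><ows:Identifier>subset</ows:Identifier></wps:Process>
    <wps:Process><ows:Identifier>average</ows:Identifier></wps:Process>
  </wps:ProcessOfferings>
</wps:Capabilities>"""

current = list_processes(SAMPLE)
added = new_processes(["subset"], current)
```

Here `current` is `["average", "subset"]` and `added` is `["average"]`, which is the set of processes ECMWF would need to be told about.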

KPI.S8 User uptake

This is monitored by the CDS. It was agreed that this does not need to be collected to fulfil the milestone.

KPI.S9 Evolution of downloads / Volume downloaded

This is monitored by the CDS. It was agreed that this does not need to be collected to fulfil the milestone. (KPI.S4, KPI.S8 and KPI.S9 are all monitored by the CDS team.)

KPI.S10 Download performance

The CP4CDS team are not aware of any methods to measure the network performance between "end users" (the CDS in our case) and the CP4CDS infrastructure. However, the ECMWF network team measure network performance between the CDS and end users. ECMWF agreed to share information about the tools used to achieve this, which can be integrated into the metrics collection.

KPI.S11 Search performance

It was agreed that this is the same as KPI.S1. (Formally, KPI.S1, KPI.S2 and KPI.S11 are merged into one KPI.)


Additional metrics

A number of additional metrics were also discussed. It was agreed that none of these would be required to fulfil the milestone, but any effort to collect them would be appreciated.

  • Runtime of compute node jobs
  • Utilisation of compute node resources
  • Request duration for data/index nodes


Required metrics for milestone fulfilment

  • Data node availability, as defined above
  • Index node availability, as defined above
  • Compute node availability, as defined above
  • Processes ported into Compute Node
  • Download performance, subject to ECMWF sharing methods

