Overview

The AI Weather Quest is a collaborative forecasting challenge hosted by ECMWF, designed to explore the potential of AI and machine learning in sub-seasonal weather prediction. Participating teams are challenged to provide quintile-based probabilistic forecasts for three key variables: near-surface (2m) temperature (tas), mean sea level pressure (mslp), and accumulated precipitation (pr). Forecasts target lead times of days 19 to 25 (week 3) and days 26 to 32 (week 4).
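
For orientation, here is a minimal sketch of what a quintile-based probabilistic forecast means in practice: for each grid point and forecast window, the forecast is expressed as five probabilities, one per climatological quintile. The function, ensemble values, and quintile boundaries below are purely illustrative, not the official submission format; see the AI Weather Quest website for submission details.

```python
import numpy as np

def quintile_probabilities(ensemble, quintile_edges):
    """Turn an ensemble of forecast values into probabilities for the
    five climatological quintile categories (illustrative only).

    ensemble       : 1-D array of ensemble-member values for one grid
                     point and one forecast window
    quintile_edges : the four climatological quintile boundaries
                     (20th, 40th, 60th and 80th percentiles)
    """
    # Assign each ensemble member to one of the five quintile bins
    categories = np.digitize(ensemble, quintile_edges)
    # The fraction of members per bin is the forecast probability
    return np.bincount(categories, minlength=5) / ensemble.size

# Hypothetical 10-member ensemble of 2m temperature (tas) anomalies
members = np.array([-1.2, -0.4, 0.1, 0.3, 0.5, 0.6, 0.8, 1.1, 1.4, 2.0])
edges = np.array([-0.8, -0.2, 0.2, 0.8])  # invented quintile boundaries
print(quintile_probabilities(members, edges))  # [0.1 0.1 0.1 0.3 0.4]
```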

The AI Weather Quest leaderboards page provides a dynamic view of how teams and models are performing throughout the competition. It displays Ranked Probability Skill Scores (RPSS) for submitted forecasts, together with team and model rankings, through leaderboards and evolution graphs. Evaluation results are published on Fridays at 00:00 UTC, once the evaluation date (day 37 of each competition week) has passed.

This guide explains how to use the leaderboards page effectively and track the evolution of team and model performance over time.

Using the filters

Filters apply to all elements on the page, including RPSS leaderboards and evolution graphs. By default, the page displays the latest evaluated period and week, the first forecast window (Days 19–25), and variable-averaged scores.

To customize your view, use the following options:

1. Competitive period and week

    • Dropdown 1: select a competitive period (e.g. SON 2025)
    • Dropdown 2: select a competition week (e.g. Competition Week 1 (Thu 14-Aug-2025))

2. Forecast window

    • Days 19–25
    • Days 26–32

3. Variable

    • Variable-averaged (tas, mslp, pr)
    • Near-surface (2m) temperature (tas) 
    • Mean sea level pressure (mslp)
    • Precipitation (pr)

Understanding the RPSS leaderboards

All rankings are based on the RPSS of each team's best-performing model. However, individual model ranks and scores are also visualised, allowing users to compare performance across multiple submissions from the same team.

In the example below, the team CMAandFDU is ranked first because its model FengshunAdjust holds the top position. MicroEnsemble is ranked second because its model StillLearning places third overall, the highest position among models not belonging to CMAandFDU, whose models occupy both first and second place.
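
The ranking rule can be illustrated with a short sketch. The scores below are invented, and the second CMAandFDU model name (FengshunRaw) is hypothetical, used only to reproduce the situation described above:

```python
# Invented RPSS scores; teams are ranked by their best model's RPSS
model_scores = {
    ("CMAandFDU", "FengshunAdjust"): 0.21,     # 1st model overall
    ("CMAandFDU", "FengshunRaw"): 0.18,        # 2nd model (hypothetical name)
    ("MicroEnsemble", "StillLearning"): 0.15,  # 3rd model overall
}

# Keep each team's best-performing model
best_per_team = {}
for (team, model), rpss in model_scores.items():
    if team not in best_per_team or rpss > best_per_team[team][1]:
        best_per_team[team] = (model, rpss)

# Order teams by the RPSS of their best model, highest first
for rank, (team, (model, rpss)) in enumerate(
        sorted(best_per_team.items(), key=lambda kv: kv[1][1], reverse=True),
        start=1):
    print(f"{rank}. {team} (best model: {model}, RPSS = {rpss:.2f})")
# 1. CMAandFDU (best model: FengshunAdjust, RPSS = 0.21)
# 2. MicroEnsemble (best model: StillLearning, RPSS = 0.15)
```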

Click any RPSS score in the leaderboard to view the corresponding forecast in the ECMWF-hosted sub-seasonal AI forecasting portal. 

There are two main types of RPSS leaderboard, and buttons before the leaderboards let you switch between the two views (a short sketch contrasting them follows this list):

1. Period-aggregated RPSS leaderboard

  • Shows RPSS scores aggregated over a selected competitive period (e.g. SON 2025)
  • Useful for assessing overall performance across multiple weeks
  • A team’s score appears in the Period-aggregated RPSS leaderboard only if the team has submitted forecasts up to and including the selected competition week

2. Weekly RPSS leaderboard

  • Shows RPSS scores for a specific competition week
  • Useful for tracking short-term performance
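
To make the distinction concrete, here is an illustrative sketch. The actual aggregation method is defined on the evaluation system page; a simple mean of weekly scores is assumed here purely for demonstration, and all scores are invented:

```python
# Invented weekly RPSS scores per team within one competitive period
weekly_rpss = {
    "TeamA": [0.12, 0.18, 0.21],  # submitted every week so far
    "TeamB": [0.20, 0.22],        # missed week 3
}
selected_week = 3

# Weekly leaderboard: score for the selected week only
weekly = {team: scores[selected_week - 1]
          for team, scores in weekly_rpss.items()
          if len(scores) >= selected_week}

# Period-aggregated leaderboard: only teams with forecasts up to and
# including the selected week appear (a mean is assumed here)
period = {team: round(sum(scores) / len(scores), 2)
          for team, scores in weekly_rpss.items()
          if len(scores) >= selected_week}

print(weekly)  # {'TeamA': 0.21}
print(period)  # {'TeamA': 0.17}
```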

For further details on how scores are calculated, see the evaluation system page of the AI Weather Quest website.
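
For orientation, the standard definitions behind these scores are sketched below. The exact evaluation details, including the reference forecast used, are specified on the evaluation system page:

```latex
% Ranked Probability Score (RPS) for one forecast with K = 5 quintile
% categories, forecast probabilities p_k, and observation indicator o_k
% (o_k = 1 for the observed quintile, 0 otherwise):
\[
  \mathrm{RPS} = \sum_{k=1}^{K} \left( \sum_{j=1}^{k} p_j - \sum_{j=1}^{k} o_j \right)^{2}
\]
% The skill score compares the forecast against a reference forecast
% (e.g. climatology); RPSS = 1 is perfect, RPSS <= 0 means no
% improvement over the reference:
\[
  \mathrm{RPSS} = 1 - \frac{\mathrm{RPS}_{\mathrm{forecast}}}{\mathrm{RPS}_{\mathrm{reference}}}
\]
```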

Exploring team profiles

Click on any team name in the leaderboard to view:

  • Team members (if public)
  • Model descriptions 
  • Participation history (forecast submissions by week, window, and variable)

Evolution graphs

Each RPSS table is accompanied by two evolution graphs:

1. Team rankings over time

Shows how team rankings evolve week by week:

  • X-axis: competition week numbers within the competitive period
  • Y-axis: team rankings based on RPSS of their best-performing model
  • Default view: top 5 teams for selected filters
  • Up to 10 teams can be selected at once
  • Hover over the nodes to see the team’s ranking for a specific week

2. Model RPSS scores over time

Tracks performance trends of models across weeks:

  • X-axis: competition week numbers
  • Y-axis: RPSS scores of models (note that scores below -1 are not shown)
  • Default view: the best model from each of the top 5 teams for selected filters
  • Up to 10 models can be selected at once
  • Hover over the nodes to see the model’s score for a specific week

Note: Line colours in the team and model graphs are assigned independently, so a team’s line colour will not necessarily match the colour of its models.
