Due to an update required on the Lustre filesystem serving HPCPERM, SCRATCH and fws4, the ECMWF HPCF was unavailable to run any standard jobs on:

Wednesday 2025-03-05 from 08:30 to 16:30 UTC

During the session, queued jobs were kept on hold with the reason "Licenses", and started once the session was over. At the same time, there was a shorter session on the AC cluster requiring a reboot of login, GPIL and GPU nodes. This session caused those nodes, including hpc-login and hpc-batch, to be unavailable between 08:30 and 10:30 UTC. Note that, while you might have been able to log in after 10:30 UTC, the Lustre filesystems mentioned above remained unavailable and any jobs submitted were on hold until the end of the longer session.

ECMWF Operations, Time-Critical Option 2 and 3 suites, as well as some prepIFS work continued to run normally since they use a separate set of filesystems and HPC complexes.

As usual, we encourage you to follow the progress of this and other system sessions on the ECMWF Service Status.

You may get in touch with us about any concerns or doubts by raising an issue through our ECMWF Support Portal.
