This is the User guide for the Atos Sequana XH2000 HPCF, installed in ECMWF's data centre in Bologna. This platform provides both the HPCF (AA, AB, AC, AD complexes) and ECGATE services (ECS), which in the past had been on separate platforms.

Below you will find some basic information on the different parts of the system. Please click on the headers or links to get all the details for the given topic.

News Feed

2024-06-11 Update of Operating System to RHEL 8.8 on AC complex

We are in the process of updating the Operating System on all complexes of our Atos HPCF, from Red Hat Enterprise Linux (RHEL) 8.6 to RHEL 8.8.

The default Member-State user complex AC will be updated on:

11 June 2024 from 08:00 UTC

2024-05-15 Change of default versions of ECMWF and third-party software packages

When?

The changes took place on Wednesday 15 May 2024 09:00 UTC

Do I need to do anything?

2024-04-03 Introducing the new ECMWF JupyterHub service

We are thrilled to announce the general availability of the new JupyterHub service at ECMWF, a new way of accessing computing resources at ECMWF in an interactive and modern way. With JupyterHub you can now spin up your JupyterLab sessions on multiple backends including HPCF and ECS, leveraging the computational resources available at ECMWF without leaving your browser.

2024-03-13 System session on Wednesday 13 March affecting work on the ECMWF Atos HPC

ECMWF scheduled a network system session to be held on Wednesday 13 March 2024 which impacted work on the Atos HPC. The session lasted for 4.5 hours from 12:00 UTC to 16:30 UTC.

During the session, there was NO login access to hpc-login and NO user batch jobs submitted from hpc-login or via hpc-batch could run. Batch jobs submitted before the session which were not expected to complete before the session started were queued until after the session finished. Time-Critical Option 1 and 2 workloads were not affected and continued to run during the session.

2023-11-22 Change of default versions of ECMWF software packages

When?

The changes will take place on Wednesday 22 November 2023 09:00 UTC

Do I need to do anything?

2023-05-31 Change of default versions of ECMWF and third-party software packages

When?

The changes will take place on Wednesday 31 May 2023 09:00 UTC

Do I need to do anything?

2023-03-27 Scratch automatic purge enabled

The automatic purge of unused files in SCRATCH is now enforced. Any files that have not been accessed at any time in the previous 30 days will be automatically deleted. This purge will be conducted regularly, in order to keep the usage of this filesystem within optimal parameters.

SCRATCH is designed to hold large temporary files and to act as the main storage and working filesystem for the input and output files of your jobs and experiments, but not to keep data for the long term.
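To get a rough idea of which files might fall under the 30-day rule described above, something like the following sketch could be used. It is an approximation only: the function name `list_purge_candidates` is hypothetical, it assumes GNU `find` and that the `SCRATCH` environment variable points to your scratch directory, and the access times it sees may not match exactly what the purge itself evaluates.

```shell
#!/bin/bash
# Hypothetical helper: list regular files that have not been accessed
# for at least a given number of days (default 30).
# Assumption: SCRATCH is set in the environment; otherwise the current
# directory is searched.
list_purge_candidates() {
    local dir="${1:-${SCRATCH:-.}}"
    local days="${2:-30}"
    # find ignores fractional days, so "-atime +29" matches files whose
    # last access was at least 30 full days ago.
    find "$dir" -type f -atime +"$((days - 1))" -print
}
```

For example, `list_purge_candidates "$SCRATCH" 30` prints one path per line, which you can review before moving anything worth keeping to longer-term storage.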

2023-01-18 Improving the time and memory limit management for batch jobs

Explicit time limit honoured

ECMWF will kill jobs that have reached their wall time limit if #SBATCH --time or the command line option --time was provided with the job.

Alternatively, ECMWF accepts jobs without #SBATCH --time or the command line option --time. In that case ECMWF will instead use the average runtime of previous "similar" jobs, which are identified by a job tag generated from the user, job name, job geometry and job output path.
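As an illustration of an explicit time limit, a minimal job script might look like the sketch below. The job name, output file name and the 10-minute limit are placeholder values chosen for this example, not recommendations.

```shell
#!/bin/bash
# Placeholder job name and output file; %j expands to the Slurm job ID.
#SBATCH --job-name=example
#SBATCH --output=example.%j.out
# Explicit wall time limit of 10 minutes (HH:MM:SS).
#SBATCH --time=00:10:00

# SLURM_JOB_ID is set by Slurm inside a job; the fallback makes the
# script harmless to run outside the batch system.
job_id="${SLURM_JOB_ID:-not-under-slurm}"
echo "Job ${job_id} running on $(hostname)"
```

The same limit can be given on the command line instead, for example `sbatch --time=00:10:00 job.sh`, where `job.sh` is a hypothetical script name.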

2022-12-07 Important change in the new Slurm on the Atos HPC

Slurm on the Atos AD complex was updated to version 22.05. AD has since been the default cluster, with hpc-login and hpc-batch being aliases for nodes on the AD complex.

The same version of Slurm, 22.05, has also been installed on the AA and AB complexes and will be installed on the AC complex.

2022-11-30 Unavailability of AA Atos cluster due to system update

AA, the default Atos cluster, will become unavailable from 08 UTC for essential Slurm and security updates. In preparation for this session:

  • The default Atos login/batch cluster has been changed to AD at 09 UTC.
  • Batch work on AA will be drained, and jobs scheduled to finish after 06 UTC will be automatically redirected to other complexes.
