...

Tip

If you find a problem or a missing feature that you think should be present, and it is not listed here, please let us know by reporting it as a "Problem on computing" through the ECMWF Support Portal, mentioning "Atos HPCF" in the summary.

The Atos HPCF is not an operational platform yet, and many features or elements may be added gradually as the complete setup is finalised. Here is a list of the known limitations, missing features and issues.


File Systems

...

  • HOME is in a temporary location (Lustre). It does not come with snapshots or backup.
  • PERM is temporarily supported by an area in Lustre (no backups, no snapshots), but in the future it will be provided by external NFS services (see above). In the meantime, please use HPCPERM, which offers better parallel performance but also has no snapshots or backup.

...

Missing Features

Comprehensive software stack

...

End of job information

A basic report is provided at the end of the job with information about its execution.

[ECMWF-INFO -ecepilog] ---------------------------------------------------------------------------------------------
[ECMWF-INFO -ecepilog] This is the ECMWF job Epilogue.
[ECMWF-INFO -ecepilog] +++ Please report issues using the Support portal +++
[ECMWF-INFO -ecepilog] +++ https://support.ecmwf.int                     +++
[ECMWF-INFO -ecepilog] ---------------------------------------------------------------------------------------------
[ECMWF-INFO -ecepilog]
[ECMWF-INFO -ecepilog] Run at 2021-09-28T06:21:25 on aa
[ECMWF-INFO -ecepilog] Job Name                  : eci
[ECMWF-INFO -ecepilog] Job ID                    : 1009559
[ECMWF-INFO -ecepilog] Submitted                 : 2021-09-28T06:05:23
[ECMWF-INFO -ecepilog] Dispatched                : 2021-09-28T06:05:23
[ECMWF-INFO -ecepilog] Completed                 : 2021-09-28T06:21:25
[ECMWF-INFO -ecepilog] Waiting in the queue      : 0.0
[ECMWF-INFO -ecepilog] Runtime                   : 962
[ECMWF-INFO -ecepilog] Exit Code                 : 0:0
[ECMWF-INFO -ecepilog] State                     : COMPLETED
[ECMWF-INFO -ecepilog] Account                   : myaccount
[ECMWF-INFO -ecepilog] Queue                     : nf
[ECMWF-INFO -ecepilog] Owner                     : user
[ECMWF-INFO -ecepilog] STDOUT                    : slurm-1009559.out
[ECMWF-INFO -ecepilog] STDERR                    : slurm-1009559.out
[ECMWF-INFO -ecepilog] Nodes                     : 1
[ECMWF-INFO -ecepilog] Logical CPUs              : 8
[ECMWF-INFO -ecepilog] SBU                       : 20.460 units
[ECMWF-INFO -ecepilog]
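
If you want to pick values out of the report above in a script, the following is a minimal Python sketch. It assumes the "Key : value" layout shown in the sample, which may change while the system is being set up. The names `parse_epilogue` and `elapsed_seconds` are hypothetical helpers for illustration, not part of any ECMWF tool.

```python
import re
from datetime import datetime

# Assumption: every field line looks like "[ECMWF-INFO -ecepilog] Key : value",
# with whitespace on both sides of the separating colon.
_FIELD = re.compile(
    r"\[ECMWF-INFO -ecepilog\]\s+(?P<key>[^:]+?)\s+:\s+(?P<value>.*\S)"
)


def parse_epilogue(text):
    """Return a dict of the 'Key : value' fields found in the epilogue text."""
    fields = {}
    for line in text.splitlines():
        match = _FIELD.search(line)
        if match:
            fields[match.group("key")] = match.group("value")
    return fields


def elapsed_seconds(fields):
    """Seconds between the Dispatched and Completed timestamps."""
    start = datetime.fromisoformat(fields["Dispatched"])
    end = datetime.fromisoformat(fields["Completed"])
    return int((end - start).total_seconds())


# A few lines copied from the sample report above.
sample = """\
[ECMWF-INFO -ecepilog] Run at 2021-09-28T06:21:25 on aa
[ECMWF-INFO -ecepilog] Job Name                  : eci
[ECMWF-INFO -ecepilog] Dispatched                : 2021-09-28T06:05:23
[ECMWF-INFO -ecepilog] Completed                 : 2021-09-28T06:21:25
[ECMWF-INFO -ecepilog] Runtime                   : 962
[ECMWF-INFO -ecepilog] Exit Code                 : 0:0
"""

report = parse_epilogue(sample)
print(report["Job Name"], report["Exit Code"], elapsed_seconds(report))
```

In the sample, the difference between the Completed and Dispatched timestamps is 16 minutes and 2 seconds, which matches the reported Runtime of 962, so Runtime appears to be expressed in seconds.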


SBUs used on the Atos HPCF system will not be charged to project accounts until the system becomes operational.
Warning
  • We are unable to provide a figure for the memory used at this time.

Alternatively, you may use sacct to get some of the statistics from SLURM once the job has finished.
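
As an illustration, the sketch below wraps sacct from Python. It assumes the standard Slurm options `-j`, `--parsable2`, `--noheader` and `--format`, and the standard accounting fields listed in `FIELDS`. Which fields are actually populated on the Atos HPCF is not confirmed here. The function names are hypothetical.

```python
import subprocess

# Standard sacct format fields; MaxRSS may be empty if it is not recorded.
FIELDS = ["JobID", "JobName", "State", "ExitCode", "Elapsed", "MaxRSS"]


def sacct_command(job_id):
    """Build the sacct command line for one job (nothing is executed here)."""
    return [
        "sacct",
        "-j", str(job_id),
        "--parsable2",   # '|'-separated output without a trailing delimiter
        "--noheader",    # omit the header row
        "--format=" + ",".join(FIELDS),
    ]


def parse_sacct(output):
    """Turn '|'-separated sacct output into one dict per job or job step."""
    return [
        dict(zip(FIELDS, line.split("|")))
        for line in output.splitlines()
        if line.strip()
    ]


def job_statistics(job_id):
    """Run sacct for a finished job; only works where Slurm is available."""
    result = subprocess.run(
        sacct_command(job_id), capture_output=True, text=True, check=True
    )
    return parse_sacct(result.stdout)


# Illustrative line only, shaped like --parsable2 output for the job above;
# it is not real accounting data.
example = "1009559|eci|COMPLETED|0:0|00:16:02|\n"
rows = parse_sacct(example)
```

On a login node, `job_statistics(1009559)` would run the command and return the parsed rows. `parse_sacct` can be tried anywhere with captured output.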

Connectivity

  • Direct access to the Atos HPCF through ECACCESS or Teleport is not yet available. See HPC2020: How to connect for more information.
  • SSH connections to/from VMs in Reading running ecFlow servers are not available. For more details on ecFlow usage, see HPC2020: Using ecFlow.
  • Load balancing between the Atos HPCF interactive login nodes is not ready yet. Once it is implemented, an SSH connection to the main alias for the HPCF may open a session on an arbitrary login node.

Filesystems

PERM is temporarily supported by Lustre (no backups, no snapshots), but in the future it will be provided by external NFS services. Once these are ready, the contents will be migrated without any user intervention.

The select/delete policy in SCRATCH has not been enforced yet.

See HPC2020: Filesystems for all the details.

Known issues

Intel MKL

...

greater than 19.0.5: performance issues on AMD chips

Recent versions of MKL do not use the AVX2 kernels for certain operations on non-Intel chips, such as the AMD Rome on our HPCF. The consequence is a significant drop in performance.
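
A quick way to check whether a given MKL version falls in the affected range is sketched below in Python. It assumes dotted numeric version strings such as "19.0.5" and compares only the first three components. The function names are hypothetical, and the threshold is simply the one quoted in the heading above.

```python
def parse_version(version):
    """Convert a dotted version string into a 3-tuple of integers."""
    parts = [int(part) for part in version.split(".")[:3]]
    # Pad short versions such as "2021.1" with zeros.
    return tuple(parts + [0] * (3 - len(parts)))


def mkl_may_be_slow_on_amd(version, threshold="19.0.5"):
    """True if the version is greater than the last unaffected release."""
    return parse_version(version) > parse_version(threshold)


for candidate in ("19.0.4", "19.0.5", "19.1.0", "2021.1"):
    print(candidate, mkl_may_be_slow_on_amd(candidate))
```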