I'm handling large data files and/or performing computations that require more memory or disk space than I have available on Atos - what can I do?


Debug info

First, make sure you can see Metview's error messages. If you are running a macro or using the graphical user interface, start Metview with the -slog option, which sends all output to standard output. If you are running a Python script, set the environment variable METVIEW_PYTHON_DEBUG=1 before starting Python, or add these lines before importing the 'metview' module:

import os
os.environ["METVIEW_PYTHON_DEBUG"] = "1"

Temporary disk space

If you see an error message similar to the following, please see the FAQ entry A Metview session on Atos has run out of tmp space - Metview FAQ:

/etc/ecmwf/ssd/ssd1/tmpdirs/xxx.34391055/ No space left on device
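Before following that page, it can help to confirm how much temporary space your session actually has left. A minimal check, assuming the TMPDIR environment variable points at the session's temporary directory (falling back to /tmp if it is unset):

```shell
# Show free space in the session's temporary directory;
# assumes TMPDIR points at it, falling back to /tmp otherwise
df -h "${TMPDIR:-/tmp}"
```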

GRIB field larger than default buffer size

If you see an error message similar to the following:

wmo_read_any_from_file: error -3 (Passed buffer is too small) l=140726296292856, len=3732480187

then at least one of your GRIB fields is very large. Increase the read buffer by setting the MARS_READANY_BUFFER_SIZE environment variable to the size indicated by the second number in the message (the field length in bytes), or something a little larger, e.g.

export MARS_READANY_BUFFER_SIZE=3732480200

Do this before starting Metview, or before importing the 'metview' module (see the example for setting METVIEW_PYTHON_DEBUG, above).
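In a Python script, the same setting can be made from within the script itself, as long as it happens before the 'metview' import. A minimal sketch, using the value from the error message above:

```python
import os

# Must be set before 'import metview'; the value (in bytes) should be
# at least the field length reported in the error message
os.environ["MARS_READANY_BUFFER_SIZE"] = "3732480200"

# import metview as mv   # now safe to import
```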

Process running out of memory

If your process is killed for exceeding its memory limit (a "memkill"), you have several options for giving it more memory - consider one of the following:

  1. ensure that your ecinteractive shell has enough memory (the current limit is 32GB), e.g. 

    ecinteractive -c8 -m32GB -s8GB
  2. run your job in batch, which allows you to request more memory (see HPC2020: Batch system) - this involves writing a short shell script in which you specify the amount of memory your job needs, e.g.
    #!/bin/bash
    #SBATCH --mem-per-cpu=64G
    module load ecmwf-toolbox/new
    metview -slog -b

    Then use 'sbatch' to run this script:

    sbatch ./
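Putting the batch steps together, a complete script might look like the following sketch. The filename run_metview.sh, the job name, and the macro name my_macro.mv are illustrative placeholders, not names from this page:

```shell
#!/bin/bash
#SBATCH --mem-per-cpu=64G          # memory per CPU for this job
#SBATCH --job-name=metview-batch   # illustrative job name

module load ecmwf-toolbox/new      # make metview available
metview -slog -b my_macro.mv       # run the macro in batch mode with full logging
```

Save it (for example as run_metview.sh), submit it with 'sbatch run_metview.sh', and check its status with 'squeue -u $USER'.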