Atos HPCF and ECS computing platforms offer a wide range of software, libraries and tools. Let's go through some exercises to learn how to manage your software stack.
You want to use CDO, a popular tool to manipulate climate and NWP model data. What do you need to do to get the following result?
$ cdo --version
Climate Data Operators version X.Y.Z (https://mpimet.mpg.de/cdo)
System: x86_64-pc-linux-gnu
...
If you run the command without any prior action, you may get:
Many software packages and tools are not part of your default environment, and need to be explicitly loaded via modules. So the following commands would be sufficient to get to the desired result:
Note that we did not ask for any specific version. In such cases, you get the version defined as the default.
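As a minimal sketch of the idea above, the helper below loads a module only when the tool it provides is not yet on the PATH. The function name ensure_tool is hypothetical, and it assumes the module is named after the tool, as is the case for cdo in this exercise.

```shell
# Load the module providing a tool only if the tool is not already available.
# Assumption: the module has the same name as the tool (true for cdo here).
ensure_tool() {
    tool=$1
    if command -v "$tool" >/dev/null 2>&1; then
        return 0                      # already on PATH, nothing to do
    fi
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not available in this shell" >&2
        return 1
    fi
    module load "$tool"               # no version given: the default is loaded
}

# Typical use on the Atos HPCF or ECS:
#   ensure_tool cdo && cdo --version
```

Because no version is passed to module load, this picks up whatever version is defined as the default.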
How many versions of CDO can be used at ECMWF? Can you pick the newest?
There are hundreds of different packages, each with its own set of versions, installed at ECMWF. You can use module avail to see which modules can be loaded at any given time. However, not all modules are loadable at all times: some only become available once a certain combination of modules is loaded. You can also use the following command for an overview of all the packages that are installed, including those that may not be visible in module avail:
In this case we are only interested in CDO so we can do either:
or
To load the newest, you can pick the latest version explicitly. Assuming that it was "X.Y.Z":
But you can also use the module tag "new":
or also ask for the latest with:
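The version selection described above can be sketched with a small wrapper that builds the module name from a package and an optional version or tag. The name load_version is hypothetical; "X.Y.Z" stands for a real version string and "new" is the tag mentioned above.

```shell
# Load a package at a given version or tag, or its default when none is given.
load_version() {
    pkg=$1
    version=${2:-}
    if [ -n "$version" ]; then
        module load "$pkg/$version"   # e.g. cdo/X.Y.Z or cdo/new
    else
        module load "$pkg"            # default version
    fi
}

# Typical use:
#   module avail cdo        # list the versions that can be loaded now
#   load_version cdo new    # load the version tagged "new"
```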
Load the netcdf4 module. Can you see what modules you have loaded in your environment now?
To load the netcdf4 module just do:
Then, you can see what your software environment looks like with:
or with just the shortcut:
You should see both the cdo and netcdf4 modules, besides the default modules loaded in your environment.
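If you need the same check inside a script, a possible sketch is to inspect the LOADEDMODULES environment variable, which module systems such as Lmod and Environment Modules maintain as a colon-separated list of loaded modules. The helper name is_loaded is hypothetical.

```shell
# Succeed if a module with the given name (any version) is currently loaded.
# Relies on LOADEDMODULES, a colon-separated list such as "cdo/X.Y.Z:netcdf4/A.B.C".
is_loaded() {
    case ":${LOADEDMODULES:-}:" in
        *":$1/"*|*":$1:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Typical use:
#   is_loaded netcdf4 && echo "netcdf4 is loaded"
```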
Remove the netcdf4 module from your environment and check it is gone.
To unload the netcdf4 module, use:
or with just the shortcut:
Then, you can see what your software environment looks like with:
Can you check what is the installation directory of the default netCDF4 library?
All modules at ECMWF will define a <PACKAGE_NAME>_DIR environment variable that can be useful to pass to configuration files or scripts. Packages providing libraries such as netCDF4 will also typically define further variables. You can check the values of all the variables that a module would define, without loading it, by running:
or with just the shortcut:
You can then spot the value of the <PACKAGE_NAME>_DIR variable in the output.
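Once a module is loaded, the same value can be read back from the environment. The sketch below assumes that <PACKAGE_NAME> is the module name in upper case with hyphens turned into underscores; that mapping and the helper name package_dir are assumptions, not something the platform documentation is quoted on here.

```shell
# Print the value of <PACKAGE_NAME>_DIR for a loaded module.
# Assumption: PACKAGE_NAME is the upper-cased module name, with "-" mapped to "_".
package_dir() {
    var=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | sed 's/-/_/g')_DIR
    printenv "$var"                   # non-zero exit status if the variable is unset
}

# Typical use, after loading the module:
#   package_dir netcdf4
```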
Can you restore the default environment you had when you logged in? Check that the environment is back to the desired state.
If you log out of your session, next time you log in you will start with a fresh default environment. Modules are only loaded for that specific session. However, if you don't want to log out, you can also reset your module environment with:
You can then check the effects with
You want the git module to be loaded by default on every session and job on the Atos HPCF or ECS. How would you do that? Check that it works by opening a new session.
You can use the
You can now open a new tab in your terminal and start a new session on Atos HPCF or ECS. You should see the git module loaded when doing:
You may now remove the snippet you just added to the shell initialisation file.
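For illustration only, adding such a snippet can be done idempotently, so that repeating the step never duplicates the line. The helper name add_default_module is hypothetical, and which initialisation file is read by sessions and jobs on the platform is not stated here, so the file is passed in as a parameter.

```shell
# Append "module load <name>" to an initialisation file, unless already present.
# $1: module name; $2: path to the shell initialisation file (your choice).
add_default_module() {
    name=$1
    rcfile=$2
    line="module load $name"
    # -x matches whole lines only, -F treats the line as a fixed string.
    if ! grep -qxF "$line" "$rcfile" 2>/dev/null; then
        printf '%s\n' "$line" >> "$rcfile"
    fi
}

# Typical use (the file name below is an assumption):
#   add_default_module git "$HOME/.profile"
```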
Can you run the codes_info tool, which is part of ecCodes?
If you run the command without any prior action, you may get:
ecCodes, along with other ECMWF tools such as Metview or Magics, is bundled into the ECMWF toolbox. You need to load that module in order to access them:
Can you see what versions of ECMWF software are part of that module?
You can use the help option in modules to get additional information from the module, which in the case of the
or with just the shortcut:
Can you run the ecflow_client command and get the version?
ecFlow is not part of the
or with just the shortcut:
Once the module is loaded, you can get the version with:
To ensure a default environment for the following exercises, reset your modules with:
module reset
Try to run the command below. Why does it fail? Can you make it work without installing pandas yourself?
$ python3 -c "import pandas as pd; print(pd.__version__)"
Traceback (most recent call last):
  File "<string>", line 1, in <module>
ModuleNotFoundError: No module named 'pandas'
The system Python 3 installation is very limited and does not come with many popular extra packages such as pandas. You may use the Python 3 stack available in modules, which comes with almost 400 of those extra packages:
After that, if you repeat the command, it should complete successfully and print the pandas version.
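A quick way to tell whether the python3 currently on your PATH can see a given package, without triggering a traceback, is sketched below. The helper name has_pymodule is hypothetical; it uses importlib.util.find_spec from the Python standard library and is meant for top-level package names.

```shell
# Succeed if the current python3 can locate the given top-level package.
has_pymodule() {
    python3 -c 'import importlib.util, sys
sys.exit(0 if importlib.util.find_spec(sys.argv[1]) else 1)' "$1"
}

# Typical use:
#   has_pymodule pandas || echo "pandas not found: load the Python 3 module first"
```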
Run the command below. It will try to check if you have a working setup for using Metview within Python:
python3 -m metview selfcheck
Did it work? What do you need to do to get the following output?
$ python3 -m metview selfcheck
Trying to connect to a Metview installation...
Hello world - printed from Metview!
Metview version X.Y.Z found
Your system is ready.
Certain extra Python packages that are bindings to non-Python libraries and tools, such as Metview, benefit from the existing installations on the system. You will need to ensure the appropriate modules are loaded before running your Python code. In this case, since Metview is part of
What do you need to do to make Python use the latest version of Metview available on the system?
Just ensure you have the latest
You need to use the latest version of the geojson Python package to run a given application. At the same time, you want to benefit from the central python3 installation. What can you do?
In that case, you could use pip to install it yourself. However, installing it directly into your user environment is strongly discouraged, since it may interfere with other applications you run, or break after default software updates on the system side. Instead, for small additions to the default environment, it is much more robust to use a Python virtual environment. In this case, you may create a virtual environment based on the installations provided, and just add the latest version of geojson to it:
Then you can activate it only when you need it with:
Note that we used With the environment activated, you can now install the geojson package:
Then you can rerun the version command to check you got the latest
When you have finished with your environment, you can deactivate it with:
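The whole cycle can be sketched as follows, assuming the standard venv module and its --system-site-packages option, which makes the packages of the underlying Python installation visible inside the virtual environment. The helper name make_venv and the directory used in the comments are hypothetical.

```shell
# Create a virtual environment that can still see centrally installed packages.
# $1: target directory; any further arguments are passed to "python3 -m venv".
make_venv() {
    target=$1
    shift
    python3 -m venv --system-site-packages "$@" "$target"
}

# Typical use, once the Python 3 module is loaded:
#   make_venv "$HOME/venvs/myvenv"
#   . "$HOME/venvs/myvenv/bin/activate"
#   pip install -U geojson
#   deactivate
```

Keeping the addition inside a virtual environment means that removing the directory is enough to undo it, and nothing leaks into other applications.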
You may also use Tykky to create your own Python-based or virtual environments. In order to use Tykky, you can load the corresponding module:
module load tykky
What happened?
Loading tykky will unload all the modules already loaded and will prevent you from loading new ones. This is done to avoid potential conflicts between packages installed in your environments and those from the module system (eccodes, ecflow, compilers, etc.). You have seen how the module system may have disabled a number of modules. You can also check it by running:
If you want to go back to the previous environment, the recommended way is to reset the environment and then explicitly load all the necessary modules again:
Create a containerised conda installation using Tykky, based on the following "env.yml" environment file, which includes pandas:
name: mypandas
channels:
  - conda-forge
dependencies:
  - python=3.12
  - pandas
Once installed, load the containerised environment and verify if pandas is available by checking its version.
Hint: since the installation requires resources, it is recommended to do it within an ecinteractive session, e.g. requesting 4 CPUs and 10GB RAM.
Start first an ecinteractive session as recommended:
or equivalently for ECS:
Create the environment yaml file "env.yml" with the provided content:
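One possible way to write the file from the command line is a here-document; this is only a sketch, and any text editor works just as well. The content is the one given in the exercise, written to env.yml in the current directory.

```shell
# Write the conda environment definition to env.yml in the current directory.
# The quoted 'EOF' prevents the shell from expanding anything inside the block.
cat > env.yml <<'EOF'
name: mypandas
channels:
  - conda-forge
dependencies:
  - python=3.12
  - pandas
EOF
```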
then you could load the
continue creating the Tykky containerised environment based on the environment yaml:
after creation you can activate the Tykky environment and check
To ensure a default environment for the following exercise, reset your modules with:
module reset
The default psql command, part of the PostgreSQL package, is not up to date. You need to run the latest version, but you do not want to build it from source. A possible solution is to use a containerised version of this application. Can you run this on Atos HPCF or ECS?
You can use Apptainer to run Docker or any OCI-compatible container images. We can use the official postgres container image from Docker Hub:
You can also download the image and run it directly later with:
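As a hedged sketch of both approaches, the wrapper below runs psql from a container image through apptainer exec. The function name container_psql and the PSQL_IMAGE variable are hypothetical; the default image reference docker://postgres:latest assumes the official image's latest tag, and the .sif file name in the comments is what apptainer pull produces by default for that reference.

```shell
# Run psql from a container image instead of the system installation.
# PSQL_IMAGE may point to a registry URI or to a local .sif file.
container_psql() {
    apptainer exec "${PSQL_IMAGE:-docker://postgres:latest}" psql "$@"
}

# Typical use:
#   container_psql --version
# To download the image once and reuse the local copy afterwards:
#   apptainer pull docker://postgres:latest     # creates postgres_latest.sif
#   PSQL_IMAGE=postgres_latest.sif container_psql --version
```

Pulling the image first avoids converting it again on every run.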