...
Atos HPCF and ECS computing platforms offer a wide range of software, libraries and tools. Let's go through some exercises to learn how to manage your software stack.
Basic software environment management
- You want to use CDO, a popular tool to manipulate climate and NWP model data. What do you need to do to get the following result?

      $ cdo --version
      Climate Data Operators version X.Y.Z (https://mpimet.mpg.de/cdo)
      System: x86_64-pc-linux-gnu
      ...

  Solution: If you run the command without any prior action, you may get:

      $ cdo --version
      -bash: cdo: command not found

  Many software packages and tools are not part of your default environment and need to be explicitly loaded via modules. The following commands are sufficient to get the desired result:

      module load cdo
      cdo --version

  Tip (ml shortcut): You can also use the `ml` shortcut to load the module:

      ml cdo

  Note that we did not ask for any specific version. In those cases, you will get the one defined as default.
- How many versions of CDO can be used at ECMWF? Can you pick the newest?

  Solution: There are hundreds of different packages, each with its own set of versions, installed at ECMWF. To see which modules can be loaded at a given time, you can use:

      module avail

  However, not all modules can be loaded at any time: some only become available once a certain combination of modules is loaded. For an overview of all the installed packages, including those that may not be visible in `module avail`, you can use:

      module spider

  In this case we are only interested in CDO, so we can do either:

      module avail cdo

  or

      module spider cdo

  To load the newest, you can explicitly pick the latest version. Assuming that it was "X.Y.Z":

      module load cdo/X.Y.Z

  You can also use the module tag "new":

      module load cdo/new

  or ask for the latest with:

      module --latest load cdo

  Tip (no swap needed): If you had another version of the module loaded, the system will automatically swap it for the newly requested one.
- Load the `netcdf4` module. Can you see which modules you have loaded in your environment now?

  Solution: To load the `netcdf4` module just do:

      module load netcdf4

  Then, you can see what your software environment looks like with:

      module list

  or with just the shortcut:

      ml

  You should see both the CDO and netcdf4 modules, besides the default modules loaded in your environment.
- Remove the `netcdf4` module from your environment and check it is gone.

  Solution: To unload the `netcdf4` module just do:

      module unload netcdf4

  or with just the shortcut:

      ml -netcdf4

  Then, you can see what your software environment looks like with:

      module list
- Can you check what the installation directory of the default netCDF4 library is?

  Solution: All modules at ECMWF define a `<PACKAGE_NAME>_DIR` environment variable that can be useful to pass to configuration files or scripts. Packages providing libraries, such as netCDF4, will also typically define `<PACKAGE_NAME>_LIB` and `<PACKAGE_NAME>_INCLUDE`.

  You can check the values of all the variables that a module would define, without loading it, by running:

      module show netcdf4

  or with just the shortcut:

      ml show netcdf4

  You can then spot the value of `NETCDF4_DIR` in the output, pointing to `/usr/local/apps/netcdf4/X.Y.Z/COMPILER_FAMILY/COMPILER_VERSION`.
- Can you restore the default environment you had when you logged in? Check that the environment is back to the desired state.

  Solution: If you log out of your session, the next time you log in you will start with a fresh default environment. Modules are only loaded for that specific session.

  However, if you don't want to log out, you can also reset your module environment with:

      module reset

  You can then check the effects with:

      module list

  Tip (reset vs purge): There is a subtle difference between `module reset` and `module purge`. While the former goes back to the default environment, which typically contains some default modules, the latter completely unloads all modules and leaves you with a blank environment.
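The `<PACKAGE_NAME>_DIR` convention described above lends itself to a small helper for scripts. The following is a minimal sketch, not an ECMWF-provided tool: `pkg_dir` is a hypothetical function name, and mapping characters such as `-` to `_` in the variable name is an assumption made here for illustration.

```shell
# pkg_dir NAME: print the value of <NAME>_DIR if it is set in the environment.
# Hypothetical helper; the upper-casing follows the NETCDF4_DIR example above,
# while replacing other characters (e.g. "-") with "_" is an assumption.
pkg_dir() {
    # Build the variable name, e.g. netcdf4 -> NETCDF4_DIR
    pkg_var="$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]' | tr -c 'A-Z0-9_' '_')_DIR"
    # Indirect lookup; pkg_var only contains A-Z, 0-9 and _ at this point
    eval "pkg_val=\${$pkg_var:-}"
    if [ -n "$pkg_val" ]; then
        printf '%s\n' "$pkg_val"
    else
        echo "$pkg_var is not set (is the module loaded?)" >&2
        return 1
    fi
}
```

After `module load netcdf4`, calling `pkg_dir netcdf4` should print the installation directory. Without the module loaded, it reports that the variable is unset.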
ECMWF tools
- Can you run the `codes_info` tool, which is part of ecCodes?

  Solution: If you run the command without any prior action, you may get:

      $ codes_info
      -bash: codes_info: command not found

  ecCodes, along with other ECMWF tools such as Metview or Magics, is bundled into the ECMWF toolbox. You need to load that module in order to access them:

      module load ecmwf-toolbox
- Can you see what versions of ECMWF software are part of that module?

  Solution: You can use the help option in modules to get additional information about a module. In the case of `ecmwf-toolbox`, this includes the versions of all the packages in the bundle:

      module help ecmwf-toolbox

  or with just the shortcut:

      ml help ecmwf-toolbox
- Can you run the `ecflow_client` command and get the version?

  Solution: ecFlow is not part of the `ecmwf-toolbox` module. Since it has its own standalone module, you will need to load that separately:

      module load ecflow

  or with just the shortcut:

      ml ecflow

  Once the module is loaded, you can get the version with:

      ecflow_client --version
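The pattern in these exercises (command not found, load a module, try again) can be wrapped in a small function for use in scripts. This is only a sketch under assumptions: `ensure_tool` is a hypothetical name, and it assumes the `module` command is defined in the shell running the script, which may not be the case in every non-interactive context.

```shell
# ensure_tool COMMAND [MODULE]: make sure COMMAND is on the PATH, loading
# MODULE (defaults to COMMAND) if it is not. Hypothetical helper for
# illustration; it relies on `module` being available in the current shell.
ensure_tool() {
    et_tool="$1"
    et_module="${2:-$1}"
    # Nothing to do if the command is already available
    if command -v "$et_tool" >/dev/null 2>&1; then
        return 0
    fi
    if command -v module >/dev/null 2>&1; then
        module load "$et_module"
    else
        echo "module command not available in this shell" >&2
        return 1
    fi
    # Report success only if the command can now be found
    command -v "$et_tool" >/dev/null 2>&1
}
```

The optional second argument covers cases where the command and the module have different names, for example `ensure_tool codes_info ecmwf-toolbox` or `ensure_tool ecflow_client ecflow`.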
Python and Conda
- Try to run the command below. Why does it fail? Can you make it work without installing pandas yourself?

      $ python3 -c "import pandas as pd; print(pd.__version__)"
      Traceback (most recent call last):
        File "<string>", line 1, in <module>
      ModuleNotFoundError: No module named 'pandas'

  Solution: The system Python 3 installation is very limited and does not come with many popular extra packages such as pandas. You may use the Python 3 stack available in modules, which comes with almost 400 of those extra packages:

      module load python3

  After that, if you repeat the command it should complete successfully and print the pandas version:

      python3 -c "import pandas as pd; print(pd.__version__)"
- You need to use the latest version of pandas to run a given application. What can you do (without using conda)?

  Solution: In that case you could use pip to install it yourself. However, installing it directly into your user environment is highly discouraged, since it may interfere with other applications you run, or break after default software updates on the system side. Instead, for small additions to the default environment it is much more robust to use a Python virtual environment.

  In this case, you may create a virtual environment based on the installations provided, and just add the new version of pandas:

      module load python3
      mkdir -p $PERM/venvs
      cd $PERM/venvs
      python3 -m venv --system-site-packages myvenv

  Then you can activate it only when you need it with:

      source $PERM/venvs/myvenv/bin/activate

  Note that we used `$PERM/venvs` as the location of these virtual environments, but you may decide to put them in another location.

  With the environment activated, you can now install the new version of pandas:

      pip install -U pandas

  Then you can rerun the version command to check you got the latest:

      python3 -c "import pandas as pd; print(pd.__version__)"

  When you have finished with your environment, you can deactivate it with:

      deactivate
- You may also use conda to create your own software stack with Python packages and beyond. In order to use conda, you can load the corresponding module:

      module load conda

  What happened?

  Answer: While conda may be seen as a way to set up custom Python environments, it also manages software beyond that, installing other packages and libraries not necessarily related to Python itself.

  Because those may conflict with the software made available through modules, loading the conda module effectively disables all the other modules that may be loaded in your environment.

  You will have seen how the module system disabled a number of modules. You can also check it by running:

      module list

  You would then need to install everything you need to run your application or workflow in your conda environment.

  If you want to go back to the previous environment without conda but with all the other modules, the recommended way is to reset the environment and then explicitly load all the necessary modules again:

      module reset
      module load python3
- Create your new conda environment with the latest pandas in it, and check the version. Hint: you can also use mamba to speed up the environment creation process.

  Solution: With the conda module loaded, you can create a new environment containing Python and pandas from conda-forge, activate it, and then check the pandas version:

      mamba create -n mypandas -c conda-forge python pandas
      conda activate mypandas
      python3 -c "import pandas as pd; print(pd.__version__)"
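For the virtual environment exercise above, the create-then-activate steps can be combined so that the environment path is written only once. This is a minimal sketch, assuming `python3` comes from the `python3` module as in the exercise; `use_venv` is a hypothetical function name, not an existing tool.

```shell
# use_venv DIR: activate the virtual environment in DIR, creating it first
# (on top of the system site packages) if it does not exist yet.
# Hypothetical helper for illustration only.
use_venv() {
    if [ ! -f "$1/bin/activate" ]; then
        mkdir -p "$(dirname "$1")"
        python3 -m venv --system-site-packages "$1" || return 1
    fi
    # Source the activation script into the current shell
    . "$1/bin/activate"
}
```

A possible use, following the exercise, would be `module load python3` followed by `use_venv "$PERM/venvs/myvenv"` and then `pip install -U pandas`. The function must be defined or sourced in the current shell rather than run as a separate script, since activation only affects the shell it is sourced into.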
Using Containerised applications
...