Reference documentation: HPC2020: Python support

  1. To ensure a default environment for the following exercises, reset your modules with:

    No Format
    module reset


  2. Try to run the command below. Why does it fail? Can you make it work without installing pandas yourself?

    No Format
    $ python3 -c "import pandas as pd; print(pd.__version__)"
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ModuleNotFoundError: No module named 'pandas'


    Solution

    The system Python 3 installation is very limited and does not come with many popular extra packages such as pandas. Instead, you may use the Python 3 stack available through modules, which comes with almost 400 of those extra packages:

    No Format
    module load python3

    After that, repeating the command should complete successfully and print the pandas version:

    No Format
    python3 -c "import pandas as pd; print(pd.__version__)"
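The same availability check can be done without triggering a traceback. The following is a minimal sketch using only the standard library; the helper name `package_status` is hypothetical, and the result depends on which Python stack is active in your session.

```python
import importlib.metadata
import importlib.util
import sys


def package_status(name):
    """Return a short description of whether `name` can be imported here."""
    # find_spec looks the module up on sys.path without importing it.
    if importlib.util.find_spec(name) is None:
        return f"{name}: not available in {sys.executable}"
    try:
        version = importlib.metadata.version(name)
    except importlib.metadata.PackageNotFoundError:
        # Importable, but without distribution metadata
        # (for example a standard-library module).
        version = "unknown version"
    return f"{name}: {version}"


if __name__ == "__main__":
    print(package_status("pandas"))
```

Running it before and after `module load python3` should show the difference between the system interpreter and the module-provided stack.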



  3. Run the command below. It will try to check if you have a working setup for using Metview within Python:

    No Format
    python3 -m metview selfcheck


    1. Did it work? What do you need to do to get the following output?

      No Format
      $ python3 -m metview selfcheck
      Trying to connect to a Metview installation...
      Hello world - printed from Metview!
      Metview version X.Y.Z found
      Your system is ready.


      Solution

      Some Python packages are bindings to non-Python libraries and tools, such as Metview, and rely on the existing installations of those on the system. You need to make sure the appropriate modules are loaded before running your Python code. In this case, Metview is part of the ecmwf-toolbox module:

      No Format
      module load ecmwf-toolbox
      python3 -m metview selfcheck



    2. What do you need to do to make Python use the latest version of Metview available on the system? 

      Solution

      Just ensure you have the latest ecmwf-toolbox loaded:

      No Format
      module load --latest ecmwf-toolbox
      python3 -m metview selfcheck
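The dependency on loaded modules can be illustrated with a small pre-flight check: the Python bindings can only work if the underlying tool is reachable from the current environment. This is a minimal sketch, assuming the Metview installation exposes an executable called `metview` on `PATH` once ecmwf-toolbox is loaded (an assumption for illustration, not something stated above). The helper `tool_available` is hypothetical.

```python
import shutil


def tool_available(executable):
    """Return the full path of `executable` if it is on PATH, else None."""
    return shutil.which(executable)


if __name__ == "__main__":
    # "metview" as the executable name is an assumption for illustration.
    path = tool_available("metview")
    if path is None:
        print("metview not found on PATH; try: module load ecmwf-toolbox")
    else:
        print(f"metview found at {path}")
```

The `python3 -m metview selfcheck` command remains the authoritative test; this sketch only shows why the result changes when modules are loaded or unloaded.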



  4. You need to use the latest version of pandas to run a given application. What can you do (without using conda)?

    Solution

    You could use pip to install it yourself. However, installing it directly into your user environment is strongly discouraged, since it may interfere with other applications you run, or break after default software updates on the system side. For small additions to the default environment, it is much more robust to use a Python virtual environment.

    Here, you can create a virtual environment based on the installations provided, and just add the new version of pandas:

    No Format
    module load python3
    mkdir -p $PERM/venvs
    cd $PERM/venvs
    python3 -m venv --system-site-packages myvenv

    Then you can activate it only when you need it with:

    No Format
    source $PERM/venvs/myvenv/bin/activate

    Note that we used $PERM/venvs as the location of these virtual environments, but you may decide to put them in another location. 

    With the environment activated, you can now install the new version of pandas:

    No Format
    pip install -U pandas

    Then you can rerun the version command to check that you got the latest version:

    No Format
    python3 -c "import pandas as pd; print(pd.__version__)"

    When you have finished with your environment, you can deactivate it with:

    No Format
    deactivate
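To confirm that the virtual environment is really the one in use, and that pandas comes from it rather than from the module-provided stack, a short check along these lines may help. It is a sketch using only the standard library; the helper name `in_virtual_environment` is hypothetical.

```python
import sys


def in_virtual_environment():
    """True when the interpreter runs inside a venv created with `python3 -m venv`."""
    # Inside a venv, sys.prefix points at the venv, while sys.base_prefix
    # still points at the Python installation it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment active:", in_virtual_environment())
    try:
        import pandas as pd
    except ImportError:
        print("pandas is not importable from this interpreter")
    else:
        # pd.__file__ shows which installation pandas was loaded from.
        print("pandas", pd.__version__, "loaded from", pd.__file__)
```

Because the environment was created with `--system-site-packages`, packages not installed in the venv are still picked up from the underlying installation, so the reported path tells you which copy is being used.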



  5. You may also use conda to create your own software stack with python packages and beyond. In order to use conda, you can load the corresponding module:

    No Format
    module load conda

    What happened?

    Answer

    While conda may be seen as a way to set up custom Python environments, it also manages software beyond that, installing other packages and libraries not necessarily related to Python itself.

    Because those may conflict with the software made available through modules, loading the conda module effectively disables all the other modules that may be loaded in your environment.

    You may have seen messages from the module system as it disabled a number of modules. You can also check the result by running:

    No Format
    module list

    You would then need to install everything you need to run your application or workflow in your conda environment.

    If you want to go back to the previous environment, without conda but with all the other modules, the recommended way is to reset the environment and then explicitly load all the necessary modules again:

    No Format
    module reset
    module load python3
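The state reported by `module list` can also be inspected from inside a Python script. This is a minimal sketch, assuming the module system records loaded modules in the colon-separated `LOADEDMODULES` environment variable, as common module implementations do; whether that holds on a given system is an assumption. The helper `loaded_modules` is hypothetical.

```python
import os


def loaded_modules(environ=os.environ):
    """Return the loaded module names recorded in LOADEDMODULES."""
    raw = environ.get("LOADEDMODULES", "")
    # The variable holds a colon-separated list of name/version entries.
    return [entry for entry in raw.split(":") if entry]


if __name__ == "__main__":
    modules = loaded_modules()
    if modules:
        print("loaded modules:", ", ".join(modules))
    else:
        print("no loaded modules recorded in LOADEDMODULES")
```

Comparing the output before and after `module load conda` gives a quick view of what was disabled.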



  6. Create a new conda environment with the latest pandas in it, and check the version. Hint: you can also use mamba to speed up the environment creation process.

    Solution

    With the conda module loaded, you can create a new environment containing Python and pandas from the conda-forge channel, activate it, and check the pandas version. The example below uses mamba to create the environment:

    No Format
    mamba create -n mypandas -c conda-forge python pandas
    conda activate mypandas
    python3 -c "import pandas as pd; print(pd.__version__)"
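To check from Python that the new environment is the active one, you can look at the variables that `conda activate` exports. This is a minimal sketch; the helper `active_conda_env` is hypothetical, and the environment name in the comments is just the one used in this exercise.

```python
import os
import sys


def active_conda_env(environ=os.environ):
    """Return (name, prefix) of the active conda environment, or None."""
    # `conda activate` exports CONDA_DEFAULT_ENV (the environment name)
    # and CONDA_PREFIX (its installation directory).
    name = environ.get("CONDA_DEFAULT_ENV")
    prefix = environ.get("CONDA_PREFIX")
    if not name or not prefix:
        return None
    return name, prefix


if __name__ == "__main__":
    env = active_conda_env()
    if env is None:
        print("no conda environment is active")
    else:
        name, prefix = env
        # The running interpreter should live under the environment prefix.
        inside = sys.executable.startswith(prefix)
        print(f"conda environment {name!r}; interpreter inside it: {inside}")
```

After `conda activate mypandas`, the reported name should be `mypandas` and the interpreter should be located inside that environment.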


