  1. To ensure a default environment for the following exercises, reset your modules with:

    No Format
    module reset


  2. Try to run the command below. Why does it fail? Can you make it work without installing pandas yourself?

    No Format
    $ python3 -c "import pandas as pd; print(pd.__version__)"
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
    ModuleNotFoundError: No module named 'pandas'


    Solution

    The system Python 3 installation is very limited and does not include many popular extra packages such as pandas. Instead, you may use the python3 stack available through modules, which comes with almost 400 of those extra packages:

    No Format
    module load python3

    After that, the same command should complete successfully and print the pandas version:

    No Format
    python3 -c "import pandas as pd; print(pd.__version__)"
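The check above can also be done without triggering a traceback. The following is a minimal sketch, not part of the original exercise: the helper name `describe_package` is hypothetical, and it assumes the distribution name matches the import name, which holds for pandas.

```python
import importlib.util
from importlib import metadata


def describe_package(name: str) -> str:
    """Return a one-line status for a top-level package (hypothetical helper)."""
    # find_spec returns None when a top-level module cannot be located
    if importlib.util.find_spec(name) is None:
        return f"{name}: not found in this Python environment"
    try:
        # Assumption: the distribution name equals the import name
        return f"{name}: version {metadata.version(name)}"
    except metadata.PackageNotFoundError:
        return f"{name}: importable, but no distribution metadata found"


if __name__ == "__main__":
    print(describe_package("pandas"))
```

Running it before and after `module load python3` should show the status change from "not found" to a version string.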



  3. Run the command below. It checks whether you have a working setup for using Metview from Python:

    No Format
    python3 -m metview selfcheck


    1. Did it work? What do you need to do to get the following output?

      No Format
      $ python3 -m metview selfcheck
      Trying to connect to a Metview installation...
      Hello world - printed from Metview!
      Metview version X.Y.Z found
      Your system is ready.


      Solution

      Some Python extra packages, such as Metview, are bindings to non-Python libraries and tools, and rely on the existing installations of those on the system. You need to make sure the appropriate modules are loaded before running your Python code. In this case, Metview is part of the ecmwf-toolbox module:

      No Format
      module load ecmwf-toolbox
      python3 -m metview selfcheck
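Before running the self-check, it can help to confirm that the underlying tool is visible to your session at all. The sketch below is an illustration only: the helper name `find_tool` is hypothetical, and it assumes the Python bindings rely on an executable called `metview` being on your PATH, which the text above does not state explicitly.

```python
import shutil


def find_tool(executable: str):
    """Return the full path of an executable on PATH, or None if it is absent."""
    return shutil.which(executable)


if __name__ == "__main__":
    # Assumption: the executable provided by the module is named "metview"
    path = find_tool("metview")
    if path is None:
        print("metview not on PATH; try 'module load ecmwf-toolbox' first")
    else:
        print(f"metview found at {path}")
```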



    2. What do you need to do to make Python use the latest version of Metview available on the system? 

      Solution

      Just make sure you have the latest ecmwf-toolbox loaded:

      No Format
      module --latest load ecmwf-toolbox
      python3 -m metview selfcheck



  4. You need the latest version of the geojson Python package to run a given application, but you also want to keep benefiting from the central python3 installation. What can you do?

    Solution

    In that case you could use pip to install it yourself. However, installing it directly into your user environment is strongly discouraged, since it may interfere with other applications you run, or break after default software updates on the system side. For small additions to the default environment, it is much more robust to use a Python virtual environment.

    In this case, you may create a virtual environment based on the installations provided, and just add the latest version of geojson:

    No Format
    module load python3
    mkdir -p $HPCPERM/venvs
    cd $HPCPERM/venvs
    python3 -m venv --system-site-packages myvenv

    Then you can activate it only when you need it with:

    No Format
    source $HPCPERM/venvs/myvenv/bin/activate

    Note that we used $HPCPERM/venvs as the location of these virtual environments, but you may decide to put them in another location.

    With the environment activated, you can now install the geojson package:

    No Format
    pip install -U geojson

    Then you can check that you got the latest version:

    No Format
    python3 -c "import geojson; print(geojson.__version__)"

    When you have finished with your environment, you can deactivate it with:

    No Format
    deactivate
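To confirm that the virtual environment is really the one in use, and that geojson comes from it rather than from the central installation, you can inspect the interpreter from Python. This is a minimal sketch with hypothetical helper names (`in_virtualenv`, `module_origin`); it only relies on the standard library.

```python
import importlib.util
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter runs inside a venv."""
    # Inside a venv, sys.prefix points at the venv directory while
    # sys.base_prefix still points at the base Python installation
    return sys.prefix != sys.base_prefix


def module_origin(name: str):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(name)
    return None if spec is None else spec.origin


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("geojson would be loaded from:", module_origin("geojson"))
```

With the environment activated, the geojson path should sit under the venv directory, while packages you did not reinstall, such as pandas, should still resolve to the central python3 stack thanks to `--system-site-packages`.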



  5. You may also use Tykky to create your own Python-based or virtual environments. In order to use Tykky, you can load the corresponding module:

    No Format
    module load tykky

    What happened?

    Answer

    Loading tykky will unload all the modules already loaded and will prevent you from loading new ones.

    This is done to avoid potential conflicts between packages installed in your environments and those from the module system (eccodes, ecflow, compilers, etc.).

    You have seen how the module system may have disabled a number of modules. You can also check it by running:

    No Format
    module list

    If you want to go back to the previous environment, the recommended way is to reset the environment and then explicitly load all the necessary modules again:

    No Format
    module reset
    module load python3
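The same check as `module list` can be scripted. The sketch below assumes the module system exports the colon-separated `LOADEDMODULES` environment variable, as Lmod and Environment Modules normally do; whether it reflects every module disabled by tykky is not confirmed here. The helper name `loaded_modules` is hypothetical.

```python
import os


def loaded_modules(environ=None):
    """Return the module names recorded in LOADEDMODULES (empty list if unset)."""
    env = os.environ if environ is None else environ
    raw = env.get("LOADEDMODULES", "")
    # Entries are colon-separated; drop empty strings from stray separators
    return [name for name in raw.split(":") if name]


if __name__ == "__main__":
    modules = loaded_modules()
    print("\n".join(modules) if modules else "no modules loaded")
```

Passing a dictionary instead of the real environment makes the parsing easy to try out, for example `loaded_modules({"LOADEDMODULES": "a/1.0:b/2.0"})` with made-up module names.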



  6. Create a containerised conda installation with Tykky from the following "env.yml" environment file, which includes pandas:

    No Format
    name: mypandas
    channels:
      - conda-forge
    dependencies:
      - python=3.12
      - pandas

    Once installed, load the containerised environment and verify that pandas is available by checking its version.

    Hint: since the installation requires some resources, it is recommended to do it within an ecinteractive session, e.g. requesting 4 CPUs and 10GB RAM.

    Solution

    First, start an ecinteractive session as recommended:

    No Format
    ecinteractive -c 4 -m 10

    or, equivalently, for ECS:

    No Format
    ecinteractive -p ecs -c 4 -m 10


    Create the environment yaml file "env.yml" with the provided content:

    No Format
    cat <<EOF >env.yml
    name: mypandas
    channels:
      - conda-forge
    dependencies:
      - python=3.12
      - pandas
    EOF


    Then load the tykky module:

    No Format
    module load tykky


    Next, create the Tykky containerised environment from the environment YAML file:

    No Format
    conda-containerize new --mamba --prefix $TYKKY_PATH/mypandas env.yml


    After creation, you can activate the Tykky environment and check the pandas version:

    No Format
    tykky activate mypandas
    
    python3 -c "import pandas as pd; print(pd.__version__)"
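Besides the pandas version, you may want to confirm that the activated environment provides the interpreter pinned in env.yml (python=3.12) rather than the system one. The following is a small sketch with a hypothetical helper name, `python_matches`:

```python
import sys


def python_matches(major: int, minor: int, version_info=None) -> bool:
    """Return True when the interpreter version equals major.minor."""
    info = sys.version_info if version_info is None else version_info
    return (info[0], info[1]) == (major, minor)


if __name__ == "__main__":
    # sys.executable shows which interpreter is actually running
    print("interpreter:", sys.executable)
    print("matches python=3.12 pin:", python_matches(3, 12))
```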

