For those doing AI/ML on the European Weather Cloud (EWC), we have published a number of templates to help you set up the software stacks needed to run your favourite workloads. You may apply them to an existing instance and combine them as needed. These templates are part of the EWC Automation templates.

The stacks currently part of this collection are:

  • ML Basic: provides a conda environment with basic AI/ML tools for Python, such as torch, tensorflow, keras and scikit-learn, among others.
  • AI Models: sets up a conda environment with the AI-models package, which allows you to run popular data-driven weather forecasting models such as panguweather or graphcast.
  • Anemoi: installs a conda environment featuring all the Anemoi components, including the basic packages such as datasets, training, graphs, models and inference.
  • AIFS Single MSE: installs the ECMWF AIFS Single MSE Data-Driven Forecasting system and supporting dependencies.
  • AIFS ENS CRPS: installs the ECMWF AIFS ENS CRPS Data-Driven Forecasting system and supporting dependencies.

These stacks are based on Ansible playbooks and can be found at https://github.com/ewcloud/ewc-ecmwf-ai-stacks.

Disk space requirements

AI/ML software stacks usually require a significant amount of disk space. Make sure your instances have enough free space before applying them. Although some setups and combinations may need less, we recommend at least 15 GB for each stack you want to apply. If needed, you may extend your instance volume or add a new one on which to install the stacks. If a new volume is added, make sure 
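
As a quick check before applying a stack, the free space on the target filesystem can be compared against the 15 GB recommendation. The following is a minimal sketch only: the helper name has_free_gb and the path / are illustrative assumptions and are not part of the EWC templates.

```shell
# Hypothetical helper: succeeds if the filesystem holding PATH has at least MIN_GB free.
has_free_gb() {
    path=$1
    min_gb=$2
    # df -Pk prints POSIX-format output in 1024-byte blocks; column 4 is the available space.
    avail_kb=$(df -Pk "$path" | awk 'NR == 2 { print $4 }')
    [ "$avail_kb" -ge $((min_gb * 1024 * 1024)) ]
}

# Example: check the root filesystem (replace / with the location where the stack will be installed).
if has_free_gb / 15; then
    echo "At least 15 GB free: enough for one stack"
else
    echo "Less than 15 GB free: extend the volume or add a new one"
fi
```

When several stacks are combined, the second argument can be raised accordingly, for example 30 for two stacks.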

How to apply an AI stack to an existing instance via Morpheus

If you are using ECMWF's Morpheus portal, the stacks are offered as Morpheus Workflows by default.

Manually adding automation

If you cannot see them, you may add them to your tenancy by creating a new Git integration for the https://github.com/ewcloud/ewc-ecmwf-ai-stacks repository, as described in the Morpheus documentation, and then creating the required Automation Tasks and Workflows using that Git integration, which is described here.

  1. Go to Provisioning - Instances, and select the instance to which you want to apply the stack. If you don't have one, create one first.
  2. Click Actions - Run Workflow and type the name of the desired workflow. Some stacks allow you to customise certain parameters, such as the location of the conda installation. You may tune them if needed, for example if you are using an additional volume with more capacity for your software stack. You can also configure the Ansible command options (for example, passing -v for more verbose output).
  3. Click Execute.
  4. You may follow the progress of the workflow execution under History.

How to apply an AI stack to an existing instance via Ansible

You can also install these stacks directly with Ansible, for example as part of your existing Infrastructure as Code (IaC) or CI/CD pipelines.

You may run Ansible on the same instance or anywhere else from where you can connect to your instance via SSH, such as another instance in the same private network, or from your own computer if your instance can be reached over SSH. We will refer to this as your seed platform. You will need at least git and python available to follow these steps.
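
Before starting, it may help to confirm that the required tools are present on the seed platform. This is a minimal sketch; the helper name missing_tools is hypothetical, and python3 is checked because the steps below call python3 directly.

```shell
# Hypothetical helper: print each required tool that is not found on the PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

missing=$(missing_tools git python3)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Install these before continuing:" $missing
else
    echo "git and python3 are available"
fi
```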

  1. On your seed platform, clone the repository and cd into the directory:
    git clone <repository-url>
    cd <repository>
  2. If you don't have Ansible installed, you may install it with pip:
    pip install --user -r requirements.txt
    or if you prefer to do it in a virtual environment:
    python3 -m venv ansible-venv
    source ansible-venv/bin/activate
    pip install -r requirements.txt
  3. Install the necessary Ansible roles that are going to be used by the playbooks:
    ansible-galaxy role install -r requirements.yml -p roles/
  4. If you don't have one already, define your Ansible inventory. The simplest approach is to create a file called inventory, in the same directory as the playbooks, containing the fully qualified domain name (FQDN) or IP address used to connect to the instance from your seed platform. If running on the same instance, you may use localhost.
  5. Apply the desired playbook with ansible-playbook:
    ansible-playbook -i inventory playbookname.yml
    You may pass additional options to ansible-playbook, such as:
    • -v for verbose output
    • -K to prompt for the sudo password, if your user does not have password-less sudo privileges on the target instance.
    • -u yourremoteuser if Ansible needs to use a specific user account to connect to the target instance.
    • -e var=value for ad hoc customisation of playbook variables.
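
Steps 4 and 5 can be sketched as a short script. The host name, user name and variable value below are hypothetical placeholders, and conda_prefix is borrowed from the example playbook further down this page; check the documentation of each role for the variables it actually supports.

```shell
# Hypothetical values; replace them with your own.
TARGET_HOST="my-instance.example.org"   # FQDN or IP address of the target instance
REMOTE_USER="myuser"                    # account Ansible uses to connect
PLAYBOOK="playbookname.yml"             # placeholder name, as in step 5

# Step 4: write a minimal inventory containing a single host.
printf '%s\n' "$TARGET_HOST" > inventory

# Step 5: assemble the command with some of the optional flags.
CMD="ansible-playbook -i inventory $PLAYBOOK -v -K -u $REMOTE_USER -e conda_prefix=/opt/conda"

# Print the command for review instead of running it.
echo "$CMD"
```

Printing the command first makes it easy to review the options before running it by hand. If the inventory lists localhost explicitly, adding ansible_connection=local after the host name tells Ansible to skip SSH for that host.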

Further customisation

Check the documentation of each role, at the URLs listed in requirements.yml, to see all the variables you may customise when running the automation. For example, for the Anemoi role: https://github.com/ewcloud/ewc-ansible-role-anemoi
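
To collect those URLs quickly, the src entries can be extracted from requirements.yml. This is a rough sketch that assumes the file follows the same name/src/version layout as the example later on this page; the helper name list_role_urls is hypothetical, and it matches plain text rather than parsing YAML.

```shell
# Hypothetical helper: print the value that follows every "src:" key in a requirements file.
list_role_urls() {
    awk '{ for (i = 1; i < NF; i++) if ($i == "src:") print $(i + 1) }' "$1"
}

# Usage, from the repository directory:
if [ -f requirements.yml ]; then
    list_role_urls requirements.yml
fi
```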

Advanced: How to use your custom Ansible playbook using provided Ansible roles

For advanced users with existing Ansible playbooks and greater customisation needs, it is possible to pick the roles you are interested in and include them in your own playbooks, instead of running separate playbooks. You can find the URLs for each of the dependent roles in the requirements.yml file. 

  1. Open your playbook file and include the roles that you are interested in. For example, here is a playbook that includes 3 roles and customises a variable:
    myplaybook.yml
    ---
    - hosts: all
      become: yes
      vars:
        conda_prefix: /opt/conda
      tasks:
        - name: Mars client
          ansible.builtin.include_role:
            name: ewc-ansible-role-mars-client
     
        - name: ML basic stack
          ansible.builtin.include_role:
            name: ewc-ansible-role-ml-basic
     
        - name: Anemoi
          ansible.builtin.include_role:
            name: ewc-ansible-role-anemoi

    See the official Ansible documentation for more information on roles and how to include them. 
  2. Make sure the necessary roles are available before you run the playbook. You may install them with ansible-galaxy, either individually or by writing your own requirements.yml. For the example above, this could be the requirements.yml file:
    requirements.yml
    ---
    # Ansible Requirements
    roles:
      - name: ewc-ansible-role-mars-client
        src: https://github.com/ewcloud/ewc-ansible-role-mars-client.git
        version: main
    
      - name: ewc-ansible-role-conda
        src: https://github.com/ewcloud/ewc-ansible-role-conda.git
        version: main
    
      - name: ewc-ansible-role-ml-basic
        src: https://github.com/ewcloud/ewc-ansible-role-ml-basic.git
        version: main
    
      - name: ewc-ansible-role-anemoi
        src: https://github.com/ewcloud/ewc-ansible-role-anemoi.git
        version: main

    See the official Ansible documentation for more details on how to create such a file.
  3. Install the roles with ansible-galaxy:
    ansible-galaxy role install -r requirements.yml -p roles/
  4. Run the playbook:
    ansible-playbook -i inventory myplaybook.yml
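
Because include_role looks a role up by name, a playbook only works if each name it includes matches a role that has been installed. The sketch below compares the names used in a playbook with those declared in a requirements file. The helper names are hypothetical, and the matching is plain text that assumes the layout of the two example files above, not a real YAML parser.

```shell
# Print role names used through include_role in a playbook.
playbook_roles() {
    awk '
        /include_role:/ { want = 1; next }
        want && $1 == "name:" { print $2; want = 0 }
    ' "$1"
}

# Print role names declared in a requirements file ("- name: <role>").
declared_roles() {
    awk '$1 == "-" && $2 == "name:" { print $3 }' "$1"
}

# Print every role used by the playbook that the requirements file does not declare.
undeclared_roles() {
    declared=$(declared_roles "$2")
    playbook_roles "$1" | while IFS= read -r role; do
        printf '%s\n' "$declared" | grep -qxF -- "$role" || printf '%s\n' "$role"
    done
}

# Usage: no output means every included role is declared.
if [ -f myplaybook.yml ] && [ -f requirements.yml ]; then
    undeclared_roles myplaybook.yml requirements.yml
fi
```

Ansible itself can also validate the playbook without executing any task, using ansible-playbook -i inventory myplaybook.yml --syntax-check.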

Further customisation

Check the documentation of each role, at the URLs above, to see all the variables you may customise when running the automation. 

