...
To use the GPUs:
- Provision a new CentOS or Ubuntu instance.
- Select a layout ending with eumetsat-gpu and one of the plans listed above. Apart from that, configure your instance as preferred and continue the deployment process.
![](/download/attachments/266586994/Screenshot%202024-01-08%20at%2012.17.43.png?version=1&modificationDate=1704712670457&api=v2)
- Once the VM is deployed, you can verify the GPUs, for example by running the nvidia-smi program from the command line (see below for confirming library installations and drivers).
...
Checking the GPU drivers:

```bash
# Log in to your instance and run the command below
$ nvidia-smi
# Check that the output shows the NVIDIA-SMI, driver and CUDA versions. You can also see the GPU hardware (e.g., RTXA6000-6C) and the GPU memory
Mon Feb  5 13:01:43 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.223.02   Driver Version: 470.223.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTXA6000-6C  On   | 00000000:00:05.0 Off |                    0 |
| N/A   N/A    P8    N/A /  N/A |    512MiB /  5976MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
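For scripted checks, the same information can be requested in a machine-readable form. The sketch below assumes that the installed `nvidia-smi` supports the `--query-gpu` and `--format=csv,noheader` options; the function name `summarize_gpus` is hypothetical and only serves this example.

```bash
#!/usr/bin/env bash
# Print one summary line per GPU from CSV text of the form
# "name, driver_version, memory.total" (as produced by the query below).
# summarize_gpus is a hypothetical helper name used for illustration.
summarize_gpus() {
    local name driver memory
    while IFS=',' read -r name driver memory; do
        if [ -z "$name" ]; then
            continue
        fi
        # Trim the single space that follows each comma in the CSV output
        driver="${driver# }"
        memory="${memory# }"
        echo "GPU: ${name} | driver: ${driver} | memory: ${memory}"
    done
}

# Only query the driver when nvidia-smi is actually installed
if command -v nvidia-smi >/dev/null 2>&1; then
    nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv,noheader | summarize_gpus
else
    echo "nvidia-smi not found: check the instance layout and driver installation" >&2
fi
```

An empty result or the "not found" message suggests that the instance was not created with a GPU layout or that the driver is missing.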
...
Conda installation:

```bash
# Install Miniforge (or any other conda manager)
$ wget https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-Linux-x86_64.sh
# Make the installer executable
$ chmod +x Miniforge3-Linux-x86_64.sh
# Run the installer
$ ./Miniforge3-Linux-x86_64.sh
```
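If the installation needs to be repeated on several instances, it can be scripted. The following is a minimal sketch, assuming the Miniforge installer accepts `-b` (batch mode) and `-p` (install prefix) as described in the Miniforge README. The function `miniforge_url`, the `RUN_INSTALL` switch and the prefix `${HOME}/miniforge3` are hypothetical choices for this example.

```bash
#!/usr/bin/env bash
# Build the download URL of the latest Miniforge installer.
# Arguments: OS name and CPU architecture, as printed by `uname` and `uname -m`.
miniforge_url() {
    local os="$1" arch="$2"
    echo "https://github.com/conda-forge/miniforge/releases/latest/download/Miniforge3-${os}-${arch}.sh"
}

url="$(miniforge_url "$(uname)" "$(uname -m)")"
echo "Installer URL: ${url}"

# RUN_INSTALL is a hypothetical switch for this sketch:
# nothing is downloaded or installed unless it is set to 1.
if [ "${RUN_INSTALL:-0}" = "1" ]; then
    wget -O Miniforge3.sh "${url}"
    # -b: batch mode without prompts, -p: installation prefix (assumed location)
    bash Miniforge3.sh -b -p "${HOME}/miniforge3"
    # Make the conda command available in the current shell
    source "${HOME}/miniforge3/etc/profile.d/conda.sh"
    conda --version
fi
```

Running it as `RUN_INSTALL=1 bash install_miniforge.sh` (the script name is arbitrary) performs the installation; without the variable it only prints the URL it would use.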
...
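Install Docker on CentOS:

```bash
# yum-config-manager is provided by the yum-utils package
$ sudo yum install -y yum-utils
# Add the Docker repository
$ sudo yum-config-manager \
    --add-repo \
    https://download.docker.com/linux/centos/docker-ce.repo
# Install Docker and containerd
$ sudo yum install docker-ce docker-ce-cli containerd.io
# Enable and start the Docker service
$ sudo systemctl --now enable docker
# Allow your user to run docker without sudo (takes effect at the next login)
$ sudo usermod -aG docker $USER
```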
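The group change made by `usermod` only applies to new login sessions, which is a common source of "permission denied" errors right after installation. A small sketch for checking this is shown below; the helper name `has_group` is hypothetical.

```bash
#!/usr/bin/env bash
# Return success if the space-separated group list ($1) contains the group ($2).
# has_group is a hypothetical helper name used for illustration.
has_group() {
    local group_list="$1" wanted="$2" g
    for g in $group_list; do
        if [ "$g" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# `id -nG` prints the group names of the current login session
if has_group "$(id -nG)" docker; then
    echo "docker group is active: docker can be run without sudo"
else
    echo "docker group is not active yet: log out and back in, or run 'newgrp docker'"
fi
```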
To let Docker use the GPU, you need to install the NVIDIA Container Toolkit. You can follow the instructions on NVIDIA's website or use the commands below:
Install the necessary packages for GPU support in Docker and restart Docker on Ubuntu:

```bash
# Build the distribution string from the OS ID and version (e.g. ubuntu20.04)
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID)
# Add the NVIDIA repository key and package list
$ curl -s -L https://nvidia.github.io/nvidia-docker/gpgkey | sudo apt-key add -
$ curl -s -L https://nvidia.github.io/nvidia-docker/$distribution/nvidia-docker.list | sudo tee /etc/apt/sources.list.d/nvidia-docker.list
# Install the toolkit and restart Docker
$ sudo apt-get update && sudo apt-get install -y nvidia-container-toolkit
$ sudo systemctl restart docker
```
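The repository URL in the commands above depends on the `distribution` string, which is simply `ID` followed by `VERSION_ID` from `/etc/os-release`. The sketch below prints the resulting URL before anything is downloaded, which helps to spot a distribution that NVIDIA does not publish a list for. The function name `distribution_string` is hypothetical.

```bash
#!/usr/bin/env bash
# Print the distribution string (ID followed by VERSION_ID) from an os-release file.
# The path is an optional argument so that the function can be tried on a copy.
# distribution_string is a hypothetical helper name used for illustration.
distribution_string() {
    local file="${1:-/etc/os-release}"
    # Source the file in a subshell so its variables do not leak into this shell
    ( . "$file" && echo "${ID}${VERSION_ID}" )
}

if [ -r /etc/os-release ]; then
    distribution="$(distribution_string)"
    echo "Repository list: https://nvidia.github.io/nvidia-docker/${distribution}/nvidia-docker.list"
fi
```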
...
Install the necessary packages and restart Docker on CentOS:

```bash
# Build the distribution string (e.g. centos7) and add the NVIDIA repository
$ distribution=$(. /etc/os-release;echo $ID$VERSION_ID) \
    && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
# Refresh the package metadata, install the packages and restart Docker
$ sudo yum clean expire-cache && sudo yum install -y nvidia-docker2
$ sudo systemctl restart docker
```
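After the restart, one way to confirm that Docker picked up the NVIDIA integration is to look at the runtimes it reports. This is a sketch under the assumption that the `nvidia-docker2` package registered a runtime named `nvidia` in Docker's daemon configuration; the `--gpus` option may work even when no such runtime is listed. The helper name `has_nvidia_runtime` is hypothetical.

```bash
#!/usr/bin/env bash
# Return success if the runtime listing read from stdin contains a runtime
# named "nvidia". has_nvidia_runtime is a hypothetical helper name.
has_nvidia_runtime() {
    grep -q '"nvidia"'
}

# `docker info --format '{{json .Runtimes}}'` prints the configured runtimes as JSON
if command -v docker >/dev/null 2>&1 \
    && docker info --format '{{json .Runtimes}}' 2>/dev/null | has_nvidia_runtime; then
    echo "nvidia runtime is registered with Docker"
else
    echo "nvidia runtime not listed: check the package installation and restart Docker"
fi
```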
Test the install with:
nvidia-smi in Docker test:

```
$ docker run --rm --gpus all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Wed Feb 28 13:20:24 2024
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.223.02   Driver Version: 470.223.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA RTXA6000-6C  On   | 00000000:00:05.0 Off |                    0 |
| N/A   N/A    P8    N/A /  N/A |    512MiB /  5976MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
```
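The "CUDA Version" field in the banner indicates the highest CUDA version the installed driver supports, so it is worth comparing it with the CUDA version in the image tag (11.0.3 in the test above). The sketch below extracts that value from the `nvidia-smi` output; the function name `cuda_version` is hypothetical.

```bash
#!/usr/bin/env bash
# Extract the "CUDA Version" value from nvidia-smi banner text read on stdin.
# cuda_version is a hypothetical helper name used for illustration.
cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Only call nvidia-smi when it is installed on this machine
if command -v nvidia-smi >/dev/null 2>&1; then
    echo "Driver supports CUDA up to: $(nvidia-smi | cuda_version)"
fi
```

The same function can be applied to the output of the container test by piping `docker run ... nvidia-smi` into it.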
Then run something useful, for example a TensorFlow Jupyter notebook server:
Run TensorFlow Jupyter notebooks:

```bash
$ sudo docker run --gpus all --env NVIDIA_DISABLE_REQUIRE=1 -it --rm -v $(realpath ~/notebooks):/tf/notebooks -p 8888:8888 tensorflow/tensorflow:latest-gpu-jupyter
```
...