nvidia-smi
nvidia-smi provides monitoring and management capabilities for NVIDIA GPUs from the command line and gives you instantaneous information about your GPUs.
$ nvidia-smi
Wed Mar  8 14:39:45 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 520.61.05    Driver Version: 520.61.05    CUDA Version: 11.8     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA A100-SXM...  On   | 00000000:03:00.0 Off |                    0 |
| N/A   62C    P0   351W / 400W | 39963MiB / 40960MiB  |     93%      Default |
|                               |                      |             Disabled |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|    0   N/A  N/A    181525      C   python                          39960MiB |
+-----------------------------------------------------------------------------+
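By default nvidia-smi prints this summary once and exits. If you want the view to refresh on its own, you can ask nvidia-smi to loop at a fixed interval (the 5-second interval below is just an example):

$ nvidia-smi -l 5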
This command has a number of advanced options. If you want to log the GPU usage of your processes in a batch job, you could use the following strategy:
nvidia-smi pmon -o DT -d 5 --filename gpu_usage.log &
monitor_pid=$!

# your GPU workload goes here

kill $monitor_pid
In this example, nvidia-smi logs to gpu_usage.log the processes using the GPU and their resource usage, every 5 seconds, adding the date and time to each line for easier tracking.
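If you prefer per-GPU metrics rather than per-process ones, nvidia-smi also has a query mode that can write selected fields as CSV at a fixed interval. The sketch below follows the same start/kill pattern; the field list, interval, and output file name are only an illustration:

# log GPU utilization and memory usage as CSV every 5 seconds (example fields)
nvidia-smi --query-gpu=timestamp,utilization.gpu,memory.used,memory.total --format=csv -l 5 > gpu_usage.csv &
monitor_pid=$!

# your GPU workload goes here

kill $monitor_pid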
See man nvidia-smi for more information.
nvtop
Nvtop stands for Neat Videocard TOP, an (h)top-like task monitor for GPUs. It can handle multiple GPUs and displays information about them in a way familiar to htop users. It is useful if you want to monitor GPU usage interactively and watch it evolve live.
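Since nvtop is interactive, you simply start it in a terminal on the machine whose GPUs you want to watch (quitting is typically done with q or F10, though key bindings may vary by version):

$ nvtop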
See man nvtop for all the options.