...

MPI job example
#!/bin/bash
#SBATCH --job-name=test-mpi
#SBATCH --qos=np
#SBATCH --ntasks=512
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
#SBATCH --output=test-mpi.%j.out
#SBATCH --error=test-mpi.%j.out
#SBATCH --chdir=/scratch...

srun my_mpi_app

The example above runs a 512-task MPI application.
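As a rough cross-check of what the directives in the MPI job example add up to, the following sketch multiplies them out. It assumes Slurm's documented defaults: one CPU per task when `--cpus-per-task` is not given, and megabytes as the unit for `--mem-per-cpu`. The variable names are illustrative only and are not part of Slurm.

```shell
#!/bin/bash
# Values copied from the #SBATCH directives of the MPI job example.
ntasks=512
cpus_per_task=1      # assumed Slurm default when --cpus-per-task is not given
mem_per_cpu_mb=100   # --mem-per-cpu is interpreted in megabytes by default

# Total CPUs and total memory implied by the request.
total_cpus=$(( ntasks * cpus_per_task ))
total_mem_mb=$(( total_cpus * mem_per_cpu_mb ))

echo "CPUs requested: ${total_cpus}"
echo "Memory requested: ${total_mem_mb} MB"
```

The same arithmetic applied to a hybrid job with 128 tasks and 4 CPUs per task gives the same totals, since 128 x 4 is also 512 CPUs.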

...

MPI Hybrid job example
#!/bin/bash
#SBATCH --job-name=test-hybrid
#SBATCH --qos=np
#SBATCH --ntasks=128
#SBATCH --cpus-per-task=4
#SBATCH --time=10:00
#SBATCH --mem-per-cpu=100
#SBATCH --output=test-hybrid.%j.out
#SBATCH --error=test-hybrid.%j.out
#SBATCH --chdir=/scratch...



# Ensure correct OpenMP thread count and pinning
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}
export OMP_PLACES=threads

srun -c $SLURM_CPUS_PER_TASK my_mpi_openmp_app
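A minimal sketch of how a hybrid job script can derive the OpenMP thread count from the Slurm allocation. Slurm only sets `SLURM_CPUS_PER_TASK` when `--cpus-per-task` is requested, so the `${VAR:-1}` expansion falls back to a single thread otherwise. The helper name `threads_for_task` is hypothetical, and the value 4 is assigned by hand here to mimic what Slurm would export inside the hybrid job.

```shell
#!/bin/bash
# Hypothetical helper: pick the OpenMP thread count from the Slurm allocation,
# falling back to 1 when SLURM_CPUS_PER_TASK is unset or empty.
threads_for_task() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Outside a job, or without --cpus-per-task, the variable is unset.
unset SLURM_CPUS_PER_TASK
threads_unset=$(threads_for_task)

# Inside the hybrid job, Slurm would export the value 4 (set manually here).
SLURM_CPUS_PER_TASK=4
threads_hybrid=$(threads_for_task)

echo "unset: ${threads_unset}, hybrid job: ${threads_hybrid}"
```

The fallback keeps the same script usable for pure MPI runs, where no `--cpus-per-task` is given and one thread per rank is the sensible default.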


Tip

See man sbatch or https://slurm.schedmd.com/sbatch.html for the complete set of options that may be used to configure a job.

Heterogeneous job and MPMD

Tip

To spawn an MPI application you must use srun.

This example runs a hybrid Multiple Program Multiple Data (MPMD) application that requires a different geometry for each part of the MPI execution. The job allocates 3 nodes: the first node runs executable1 with 64 tasks and 2 threads per rank, while the remaining two nodes run executable2 with 64 tasks and 4 threads per rank.

MPMD job example with heterogeneous geometry
#!/bin/bash
#SBATCH --job-name=test-het
#SBATCH --qos=np
#SBATCH --nodes=3
#SBATCH --hint=nomultithread
#SBATCH --time=10:00
#SBATCH --output=test-het.%j.out
#SBATCH --error=test-het.%j.out

# Needed to avoid occasional job hang at exit
export SLURM_MPI_TYPE=none

# Ensure correct OpenMP pinning
export OMP_PLACES=threads

srun -N1 -n 64 -c 2 executable1 : -N2 -n 64 -c 4 executable2


Note

The minimum allocation for each part of the heterogeneous execution is one node.
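The geometry of the MPMD example can be sanity-checked with simple arithmetic: both components end up asking for the same number of CPUs per node. The sketch below only multiplies out the values from the srun line. Whether that count fits on a node depends on the node's core count, which is not stated here, so treat the result as a figure to compare against your system. All variable names are illustrative.

```shell
#!/bin/bash
# Geometry copied from the srun line of the MPMD example:
#   component 1: -N1 -n 64 -c 2   (executable1)
#   component 2: -N2 -n 64 -c 4   (executable2)
c1_nodes=1; c1_tasks=64; c1_cpus_per_task=2
c2_nodes=2; c2_tasks=64; c2_cpus_per_task=4

# Total CPUs per component, then CPUs per node within each component.
c1_total=$(( c1_tasks * c1_cpus_per_task ))
c2_total=$(( c2_tasks * c2_cpus_per_task ))
c1_per_node=$(( c1_total / c1_nodes ))
c2_per_node=$(( c2_total / c2_nodes ))

echo "component 1: ${c1_total} CPUs over ${c1_nodes} node(s), ${c1_per_node} per node"
echo "component 2: ${c2_total} CPUs over ${c2_nodes} node(s), ${c2_per_node} per node"
```

Because each part of a heterogeneous execution occupies at least one whole node, checking the per-node figure for every component is a quick way to spot a geometry that would oversubscribe or badly underuse the allocation.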