MPI#
This page gives an overview of installed MPI libraries on our clusters.
Running MPI programs interactively on the frontend nodes is not supported.
For interactive testing, use an interactive batch job. During working hours, a number of nodes is reserved for (interactive) jobs with a duration of less than one hour.
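A sketch of requesting an interactive allocation with salloc; node count and time limit are placeholders, and whether you get a shell directly on the compute node depends on the cluster configuration:

```bash
# request one node interactively for 30 minutes (values are placeholders)
salloc --nodes=1 --time=00:30:00

# inside the allocation, start the MPI program via srun
srun ./my_mpi_app
```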
salloc/sbatch option --cpus-per-task is no longer propagated to srun
In Slurm versions newer than 22.05, the value of --cpus-per-task is no longer automatically propagated to srun, which can lead to errors at application start.
The value has to be set manually via the environment variable SRUN_CPUS_PER_TASK in your batch script:
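A minimal batch-script sketch; task and CPU counts as well as the executable name are placeholders:

```bash
#!/bin/bash -l
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=8
#SBATCH --time=01:00:00

# propagate --cpus-per-task to srun manually (Slurm > 22.05)
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK

srun ./my_hybrid_app
```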
Overview#
We support Open MPI (our default MPI) and Intel MPI on our clusters. The following table gives a brief overview of our Open MPI and Intel MPI installations:
| MPI | Open MPI | Intel MPI | 
|---|---|---|
| environment module | openmpi/<version>-<compiler> | intelmpi/<version> | 
| default | yes | no | 
| launcher | srun, mpirun | srun, mpirun | 
| vendor documentation | Open MPI | Getting Started, Developer Guide, Developer Reference | 
Before usage, the corresponding environment module must be loaded.
The module names follow the pattern <mpi>/<version>[-<compiler...>]:
- <mpi>: the name of the MPI library, i.e. openmpi or intelmpi,
- <version>: the version of the MPI library,
- Optionally <compiler...>: the compiler used to build the library, its version, and possibly some features the library was built with.

The corresponding compiler will automatically be loaded as a dependency.
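For example, listing and loading an Open MPI module could look like this; replace the placeholders with a version and compiler combination actually reported by module avail:

```bash
# list the installed Open MPI modules and load one of them;
# the matching compiler module is pulled in automatically as a dependency
module avail openmpi
module load openmpi/<version>-<compiler>
```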
 
For using LIKWID with your MPI application see likwid-mpirun.
Open MPI#
Open MPI is the default MPI.
The openmpi/<version>[-<compiler...>] modules provide Open MPI.
The modules automatically load the compiler the library was built with.
The compiler wrappers mpicc, mpicxx, and mpif90 use the compiler Open MPI was built with.
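For illustration, compiling C, C++, and Fortran sources with these wrappers might look like this; source and executable names are placeholders:

```bash
# compile C, C++, and Fortran MPI sources with the Open MPI compiler wrappers
mpicc  -O2 -o app_c   app.c
mpicxx -O2 -o app_cxx app.cpp
mpif90 -O2 -o app_f90 app.f90
```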
Usage of srun instead of mpirun is recommended.
When you use mpirun, it will infer all relevant options from Slurm.
Do not add options that change the number of processes or nodes, like
-n <number_of_processes>, as this might disable or distort automatic
process binding.
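A minimal batch-script sketch for launching an Open MPI program with srun; node count, tasks per node, module name, and executable name are placeholders:

```bash
#!/bin/bash -l
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=72   # placeholder: match the core count of the node type
#SBATCH --time=01:00:00

module load openmpi/<version>-<compiler>

# srun takes the number of processes and nodes from the allocation
srun ./app_c
```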
Open MPI is built using Spack; consult the environment module file for the build configuration.
Intel MPI#
The intelmpi/<version> modules provide Intel MPI.
You can influence the compiler to be used by choosing the corresponding MPI compiler wrapper:
GCC:
| compiler | gcc | g++ | gfortran | 
|---|---|---|---|
| wrapper | mpicc | mpicxx | mpif90 | 
Intel Classic:
| compiler | icc | icpc | ifort | 
|---|---|---|---|
| wrapper | mpiicc | mpiicpc | mpiifort | 
Intel oneAPI:
| compiler | icx | icpx | ifx | 
|---|---|---|---|
| wrapper | mpiicc -cc=icx | mpiicpc -cxx=icpx | mpiifort -fc=ifx | 
LLVM:
| compiler | clang | clang++ | flang | 
|---|---|---|---|
| wrapper | mpicc -cc=clang | mpicxx -cxx=clang++ | mpif90 -fc=flang | 
To use the GCC, LLVM, or Intel compilers, the corresponding compiler module has to be loaded additionally; see also Compiler:
- for GCC: module load gcc/<version>
- for LLVM: module load llvm/<version>, contact hpc-support@fau.de if the module is not available
- for Intel oneAPI/Classic: module load intel/<version>
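As an illustration, building with Intel MPI and the Intel oneAPI compilers might look like the sketch below; module versions, source files, and executable names are placeholders. (For GCC, load gcc/<version> instead and use the plain mpicc/mpicxx/mpif90 wrappers.)

```bash
# load the compiler and the Intel MPI module (versions are placeholders)
module load intel/<version>
module load intelmpi/<version>

# compile with the Intel oneAPI compilers via the Intel MPI wrappers
mpiicc   -cc=icx   -O2 -o app_c   app.c
mpiicpc  -cxx=icpx -O2 -o app_cxx app.cpp
mpiifort -fc=ifx   -O2 -o app_f90 app.f90
```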
When you use mpirun, it will infer all relevant options from Slurm.
Do not add options that change the number of processes or nodes, like
-n <number_of_processes>, as this might disable or distort automatic
process binding.
With mpirun one process will be started on each allocated CPU in a
block-wise fashion, i.e. the first node is filled completely, followed by
the second node, and so on. 
If you want to start fewer processes per node, e.g. because of large
memory requirements, you can specify the
--ntasks-per-node=<number> option to sbatch to define the number of
processes per node.
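A batch-script sketch that underpopulates the nodes; node and task counts, module version, and executable name are placeholders:

```bash
#!/bin/bash -l
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=8   # start only 8 processes per node, e.g. for high memory per process
#SBATCH --time=01:00:00

module load intelmpi/<version>

# do not pass -n; mpirun infers the process count from Slurm
mpirun ./app_c
```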
MPI process binding#
It is possible to use process binding to specify the placement of the processes on the architecture. This may increase the speed of your application, but also requires advanced knowledge about the system's architecture. When no options are given, default values are used. This is the recommended setting for most users.
Both srun and mpirun will bind the MPI processes automatically in
most cases. Two cases have to be distinguished regarding the binding of
processes:
- Full nodes: all available cores are used by a job step.
    - mpirun: will bind automatically
    - srun: will bind automatically
- Partially used nodes: some (automatically) allocated cores are not used by a job step.
    - mpirun: will bind automatically
    - srun: will not bind automatically in some cases; add the option --cpu-bind=cores to force binding (see the example after this list)
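For the partially used node case, forcing core binding with srun might look like this; the task count and executable name are placeholders:

```bash
# job step that uses only part of the allocated cores:
# request explicit core binding when launching with srun
srun --ntasks=4 --cpu-bind=cores ./app_c
```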
 
Show process binding#
Automatic binding behavior can differ between Open MPI and Intel MPI and can also depend on the version of the MPI library and of Slurm. The resulting distribution of processes may also differ between srun and mpirun.
We strongly recommend checking the process binding of your application regularly, especially after changing versions of any of the used libraries. Incorrect process binding can negatively impact the performance of your application.
Print the process binding at runtime:
- srun: add the option --cpu-bind=verbose
- Open MPI's mpirun: add the option --report-bindings
- Intel MPI's mpirun: set the environment variable I_MPI_DEBUG=5
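For illustration, the corresponding launch lines could look like this; the executable name is a placeholder:

```bash
# srun: report the binding of every task at startup
srun --cpu-bind=verbose ./app_c

# Open MPI's mpirun: report the binding of every rank
mpirun --report-bindings ./app_c

# Intel MPI's mpirun: print binding (and other) debug information
I_MPI_DEBUG=5 mpirun ./app_c
```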
More information about process binding can be found in the HPC Wiki.