# MPI
This page gives an overview of installed MPI libraries on our clusters.
Running MPI programs interactively on the frontend nodes is not supported.
For interactive testing, use an interactive batch job. During working hours, a number of nodes are reserved for (interactive) jobs with a duration of less than one hour.
**Warning:** the `salloc`/`sbatch` option `--cpus-per-task` is no longer propagated to `srun`.

In Slurm versions newer than 22.05, the value of `--cpus-per-task` is no longer automatically propagated to `srun`, leading to errors at application start. The value has to be set manually via the environment variable `SRUN_CPUS_PER_TASK` in your batch script:
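For example (a minimal sketch; the resource values and the application name are placeholders):

```bash
#!/bin/bash
#SBATCH --ntasks=4
#SBATCH --cpus-per-task=8

# Slurm > 22.05 no longer passes --cpus-per-task on to srun,
# so forward it explicitly:
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK

srun ./your_mpi_application
```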
## Overview
As MPI libraries, we support Open MPI (our default MPI) and Intel MPI on our clusters. The following table gives a brief overview of our Open MPI and Intel MPI installations:
| MPI | Open MPI | Intel MPI |
|---|---|---|
| environment module | `openmpi/<version>-<compiler>` | `intelmpi/<version>` |
| default | yes | no |
| launcher | `srun`, `mpirun` | `srun`, `mpirun` |
| vendor documentation | Open MPI | Getting Started, Developer Guide, Developer Reference |
Before usage, the corresponding environment module must be loaded.
The module names follow the pattern `<mpi>/<version>[-<compiler...>]`:

- `<mpi>`: the name of the MPI library, i.e. Open MPI or Intel MPI,
- `<version>`: the version of the MPI library,
- optionally `<compiler...>`: the compiler the library was built with, its version, and possibly some features the library was built with. The corresponding compiler is automatically loaded as a dependency.
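Loading such a module might look like this (the exact module names depend on what is installed on the respective cluster; the version and compiler suffix below are illustrative):

```bash
module avail openmpi            # list the Open MPI modules installed on the cluster
module load openmpi/4.1.6-gcc   # illustrative name: version and compiler suffix vary
module list                     # the matching compiler module is loaded as a dependency
```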
For using LIKWID with your MPI application, see `likwid-mpirun`.
## Open MPI
Open MPI is the default MPI.
The `openmpi/<version>[-<compiler...>]` modules provide Open MPI and automatically load the compiler the library was built with.
The compiler wrappers `mpicc`, `mpicxx`, and `mpif90` use the compiler Open MPI was built with.
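A minimal sketch of compiling with these wrappers (the module name and source files are placeholders):

```bash
module load openmpi/<version>-<compiler>   # pick an installed module

mpicc  -O2 -o app_c   app.c     # C source
mpicxx -O2 -o app_cxx app.cpp   # C++ source
mpif90 -O2 -o app_f90 app.f90   # Fortran source
```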
Usage of `srun` instead of `mpirun` is recommended.
When you use `mpirun`, it will infer all relevant options from Slurm.
Do not add options that change the number of processes or nodes, like `-n <number_of_processes>`, as this might disable or distort automatic process binding.
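A minimal job script illustrating the recommended launch (resource values and the application name are placeholders):

```bash
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=72   # illustrative; use the core count of the node type
#SBATCH --time=01:00:00

module load openmpi/<version>-<compiler>

# srun takes the process count and placement from the Slurm allocation;
# do not pass -n/-np or host lists here.
srun ./your_mpi_application
```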
Open MPI is built using Spack; consult the environment module file for the build configuration.
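One way to inspect the module file is, for example:

```bash
# Show what the loaded module defines, including paths to the underlying installation
module show openmpi/<version>-<compiler>
```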
## Intel MPI
The `intelmpi/<version>` modules provide Intel MPI.
You can influence the compiler to be used by choosing the corresponding MPI compiler wrapper:
GCC:

| compiler | gcc | g++ | gfortran |
|---|---|---|---|
| wrapper | `mpicc` | `mpicxx` | `mpif90` |
Intel Classic:

| compiler | icc | icpc | ifort |
|---|---|---|---|
| wrapper | `mpiicc` | `mpiicpc` | `mpiifort` |
Intel oneAPI:

| compiler | icx | icpx | ifx |
|---|---|---|---|
| wrapper | `mpiicc -cc=icx` | `mpiicpc -cxx=icpx` | `mpiifort -fc=ifx` |
LLVM:

| compiler | clang | clang++ | flang |
|---|---|---|---|
| wrapper | `mpicc -cc=clang` | `mpicxx -cxx=clang++` | `mpif90 -fc=flang` |
To use the GCC, LLVM, or Intel compilers, the corresponding modules additionally have to be loaded; also see Compiler:

- for GCC: `module load gcc/<version>`
- for LLVM: `module load llvm/<version>`; contact hpc-support@fau.de if the module is not available
- for Intel oneAPI/Classic: `module load intel/<version>`
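A sketch of building with Intel MPI and the oneAPI compilers (module names and source files are placeholders):

```bash
module load intel/<version>      # provides icx/icpx/ifx
module load intelmpi/<version>   # provides the MPI compiler wrappers

# Select the oneAPI compilers behind the classic wrapper names:
mpiicc   -cc=icx -O2 -o app_c   app.c
mpiifort -fc=ifx -O2 -o app_f90 app.f90
```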
When you use `mpirun`, it will infer all relevant options from Slurm.
Do not add options that change the number of processes or nodes, like `-n <number_of_processes>`, as this might disable or distort automatic process binding.
With `mpirun`, one process will be started on each allocated CPU in a block-wise fashion, i.e. the first node is filled completely, followed by the second node, and so on.
If you want to start fewer processes per node, e.g. because of large memory requirements, you can specify the `--ntasks-per-node=<number>` option to `sbatch` to define the number of processes per node, as in the sketch below.
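A minimal sketch (resource values and the application name are placeholders):

```bash
#!/bin/bash
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=4   # start only 4 processes per node, e.g. for memory reasons
#SBATCH --time=01:00:00

module load intelmpi/<version>

# mpirun infers the number of processes and their placement from Slurm.
mpirun ./your_mpi_application
```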
## MPI process binding
It is possible to use process binding to specify the placement of the processes on the architecture. This may increase the speed of your application, but also requires advanced knowledge about the system's architecture. When no options are given, default values are used. This is the recommended setting for most users.
Both `srun` and `mpirun` will bind the MPI processes automatically in most cases. Two cases have to be distinguished regarding the binding of processes:

- Full nodes: all available cores are used by a job step.
  - `mpirun`: will bind automatically
  - `srun`: will bind automatically
- Partially used nodes: some (automatically) allocated cores are not used by a job step.
  - `mpirun`: will bind automatically
  - `srun`: will not bind automatically in some cases; add the option `--cpu-bind=cores` to force binding (see the example below)
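For example, forcing core binding with `srun` on a partially used node (the application name is a placeholder):

```bash
# Only part of the node is used, so request explicit binding to cores:
srun --cpu-bind=cores ./your_mpi_application
```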
### Show process binding
Automatic binding behavior can differ between Open MPI and Intel MPI, the version of the MPI library, and the Slurm version. The resulting distribution of processes may also differ between `srun` and `mpirun`.
We strongly recommend checking the process binding of your application regularly, especially after changing versions of any of the used libraries. Incorrect process binding can negatively impact the performance of your application.
Print the process binding at runtime:

- `srun`: add the option `--cpu-bind=verbose`
- Open MPI's `mpirun`: add the option `--report-bindings`
- Intel MPI's `mpirun`: set the environment variable `I_MPI_DEBUG=5`
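For example (the application name is a placeholder):

```bash
# srun: report the binding of each task
srun --cpu-bind=verbose ./your_mpi_application

# Open MPI's mpirun: print the binding of every rank at startup
mpirun --report-bindings ./your_mpi_application

# Intel MPI's mpirun: debug output including the process pinning
I_MPI_DEBUG=5 mpirun ./your_mpi_application
```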
More information about process binding can be found in the HPC Wiki.