OpenMP
Compiling an OpenMP application requires an OpenMP-capable compiler. For OpenMP support and the required flags, see Compiler.
To run an OpenMP application, the number of threads has to be specified
explicitly by setting the environment variable OMP_NUM_THREADS. This is
not done automatically by the batch system, since Slurm is not
OpenMP-aware. If the variable is not set, the default value will be
used. In most cases, the default is 1, which means that your code is
executed serially. If you want to use, for example, 12 threads in the
parallel regions of your program, you can set the environment variable
with export OMP_NUM_THREADS=12.
For correct resource allocation in Slurm, use --cpus-per-task to define
the number of OpenMP threads. If your application does not use OpenMP
but another shared-memory parallelization, please consult the
application manual on how to specify the number of threads.
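The two settings above can be combined in a single job script. The following is a minimal sketch, not a site-approved template: the node count, task count, and time limit are placeholder values, and ./my_openmp_app is a hypothetical executable name. It assumes that Slurm exports SLURM_CPUS_PER_TASK inside the job when --cpus-per-task is given, so the thread count only has to be written once.

```shell
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=12     # cores reserved for the OpenMP threads
#SBATCH --time=00:10:00        # placeholder time limit

# Slurm sets SLURM_CPUS_PER_TASK when --cpus-per-task is requested.
# Fall back to 1 thread if it is unset (e.g. when run outside a job).
export OMP_NUM_THREADS="${SLURM_CPUS_PER_TASK:-1}"
echo "Running with OMP_NUM_THREADS=${OMP_NUM_THREADS}"

# Hypothetical executable name; uncomment and replace with your program.
# ./my_openmp_app
```

Deriving OMP_NUM_THREADS from SLURM_CPUS_PER_TASK keeps the allocation and the thread count in sync when the request is changed later.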
OpenMP Pinning
To reach optimum performance with OpenMP codes, the correct pinning of the OpenMP threads is essential. As nowadays practically all machines are ccNUMA, where incorrect or no pinning can have devastating effects, this is something that should not be ignored. Slurm will not pin OpenMP threads automatically.
A convenient way to pin your OpenMP threads to processors is
likwid-pin, which is available in the likwid module on all clusters.
You can start your program using the following syntax:
likwid-pin -c <cpulist> <executable>
There are various ways to specify the CPU list, depending on the
hardware setup and the requirements of your application. A short
summary is available by calling likwid-pin -h. More detailed
documentation can be found on the Likwid GitHub page.
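As an illustration of the syntax above, the following sketch pins 12 threads with likwid-pin. The executable name ./my_openmp_app is hypothetical, and the CPU lists are only examples. Which list is appropriate depends on the node topology, so please check likwid-pin -h on the target system before relying on any of them.

```shell
# Hypothetical executable name; replace with your own OpenMP program.
APP=./my_openmp_app
export OMP_NUM_THREADS=12

if command -v likwid-pin >/dev/null 2>&1 && [ -x "$APP" ]; then
    # Pin the 12 threads to the CPUs with physical IDs 0 to 11.
    likwid-pin -c 0-11 "$APP"

    # Alternative CPU lists using likwid's logical numbering (examples):
    #   likwid-pin -c N:0-11  "$APP"   # first 12 cores of the node
    #   likwid-pin -c S0:0-11 "$APP"   # first 12 cores of socket 0
else
    echo "likwid-pin or $APP not found; load the likwid module first."
fi
```

The guard around the call is only there so the snippet fails gracefully when the likwid module has not been loaded.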
An alternative way of pinning is to use OpenMP-specific methods, e.g. by
setting OMP_PLACES=cores and OMP_PROC_BIND=spread. More information
about this is available in the HPC Wiki.
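In a job script, the OpenMP-based pinning could look like the sketch below. The thread count and the executable name ./my_openmp_app are placeholders. OMP_DISPLAY_ENV is not mentioned above; it is a standard OpenMP environment variable (OpenMP 4.0 and later) and is added here only as an optional check, on the assumption that the OpenMP runtime in use supports it.

```shell
export OMP_NUM_THREADS=12      # placeholder thread count
export OMP_PLACES=cores        # one place per physical core
export OMP_PROC_BIND=spread    # spread threads evenly over the places

# Optional: ask the OpenMP runtime to print its settings at program
# start, to confirm that the values above were picked up.
export OMP_DISPLAY_ENV=true

# Hypothetical executable name; uncomment and replace with your program.
# ./my_openmp_app
```

With OMP_PROC_BIND=spread the threads are distributed across the available places, which tends to suit memory-bound codes. OMP_PROC_BIND=close, which places threads next to each other, is the other common choice.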