MPI calculations with Cerberus
Cerberus can be used to run certain solvers MPI-parallelized across multiple computing nodes.
Required installation
Cerberus uses the mpi4py Python package for its MPI functionality. The package can be installed with pip.
Note, however, that mpi4py is compiled against whichever MPI libraries are available during the pip installation, so it is best to run the installation in an environment where the correct MPI libraries have already been loaded.
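As an optional sanity check (generic mpi4py usage, not part of Cerberus), the build configuration of mpi4py and the MPI library picked up at run time can be inspected from Python:

import mpi4py
from mpi4py import MPI

# Compilers and MPI implementation that mpi4py was built against.
print(mpi4py.get_config())
# MPI library that is actually loaded in the current environment.
print(MPI.Get_library_version())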
Running calculations
The MPI libraries used to build both the solver modules and the mpi4py package for Cerberus need to be loaded.
Any Solver that has -mpi or --mpi among its command-line arguments will be spawned with MPI.Comm.Spawn instead of being started as a subprocess.
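The snippet below is only a simplified sketch of this distinction; the solver executable, its arguments, and the task count are hypothetical and do not reflect the actual Cerberus launch code.

import subprocess
from mpi4py import MPI

solver_cmd = "./solver"                # hypothetical solver executable
solver_args = ["input_file", "--mpi"]  # hypothetical argument list

if "-mpi" in solver_args or "--mpi" in solver_args:
    # Spawn the solver as MPI tasks; the returned intercommunicator
    # connects the Cerberus task to the spawned Solver tasks.
    intercomm = MPI.COMM_SELF.Spawn(solver_cmd, args=solver_args, maxprocs=4)
else:
    # Start the solver as a regular subprocess.
    process = subprocess.Popen([solver_cmd] + solver_args)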
Cerberus can then be run with a setup similar to the one described below:

#!/bin/bash
#SBATCH --cpus-per-task=20
#SBATCH --ntasks=4
#SBATCH --partition=core40
#SBATCH -o output.txt

module load mpi/openmpi-x86_64

mpirun --report-bindings -np 4 --bind-to none -oversubscribe python input.py > cerberus_output.txt
The idea here is the following:
- Cerberus itself is essentially idle while a Solver is running, so -oversubscribe can be used to let the MPI-parallelized Solver also use the resources allocated to Cerberus.
- To use -oversubscribe successfully, the binding of tasks to sockets or cores needs to be disabled with --bind-to none.
- Only one Cerberus task communicates with the Solver, and it communicates with only one Solver task.
  - All but one Cerberus task exit upon loading the Cerberus package in cerberus.__init__.py (a sketch of this is given after the list).
  - It can be somewhat tricky to get the remaining Cerberus task and the communicating Solver task (task 0 with Serpent and SuperFINIX) on the same node.
  - --report-bindings with OpenMPI helps the user to see which node each task is spawned on.
  - The rank of the task that continues past cerberus.__init__.py can be adjusted if needed.
  - In the future, a separate --cerberus-host <hostname> argument may be added to Solvers to allow them to connect to Cerberus across nodes.
    - Socket communication across nodes is significantly slower than communication on the same node.
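A rough sketch of the rank filtering mentioned above is shown below; it is not the actual contents of cerberus.__init__.py, and the continuing rank is an assumption:

import sys
from mpi4py import MPI

CONTINUING_RANK = 0  # assumed default; the rank that keeps running Cerberus

if MPI.COMM_WORLD.Get_rank() != CONTINUING_RANK:
    # The extra Cerberus tasks have nothing to do and exit immediately.
    sys.exit(0)

# Only the continuing rank reaches this point and goes on to handle the
# communication with the Solver.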