MPI calculations with Cerberus
Cerberus can run certain solvers with MPI parallelization across multiple computing nodes.
Required installation
Cerberus uses the mpi4py Python package for MPI functionality. It can be installed with pip. Note that installing mpi4py 3.1 or later sometimes fails with the error No module named 'glob'. If that happens, try mpi4py 3.0.3 instead.
Note, however, that mpi4py is compiled against whichever MPI libraries are available during the pip installation. The pip installation must therefore be run in an environment where the correct MPI libraries have been loaded.
If you need to re-install mpi4py for whatever reason, you may want to pass --no-cache-dir to pip so that it does not simply reuse the previously cached build.
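Since the MPI library that mpi4py ends up linked against depends on the environment at install time, it can help to check the result after installing. The following is a minimal sketch, not part of Cerberus; the function name describe_mpi4py is made up for this illustration. It uses only mpi4py's own get_config() and MPI.Get_library_version() calls and returns None if mpi4py cannot be imported.

```python
"""Report which MPI library the installed mpi4py was built against."""


def describe_mpi4py():
    """Return build and runtime details of mpi4py, or None if it is not importable.

    describe_mpi4py is a hypothetical helper name used only in this sketch.
    """
    try:
        import mpi4py
        from mpi4py import MPI  # importing this module initializes MPI
    except ImportError:
        return None
    return {
        "mpi4py_version": mpi4py.__version__,
        # Compiler and MPI settings recorded when mpi4py was built.
        "build_config": mpi4py.get_config(),
        # Version string of the MPI library loaded at run time.
        "library_version": MPI.Get_library_version(),
    }


if __name__ == "__main__":
    info = describe_mpi4py()
    if info is None:
        print("mpi4py is not importable in this environment")
    else:
        for key, value in info.items():
            print(f"{key}: {value}")
```

If the reported library is not the one you intended, load the correct MPI modules and re-install mpi4py.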
Running calculations
The MPI libraries that were used to build both the solver modules and the mpi4py package for Cerberus need to be loaded when the calculation is run.
Solvers that are to use MPI parallelization should be initialized with, for example:
Solver.initialize(use_MPI=True, n_MPI_procs_to_spawn=4)
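When the job runs under SLURM, the number of processes to spawn does not have to be hard-coded. The sketch below assumes that you want n_MPI_procs_to_spawn to match the task count requested from SLURM, which SLURM exposes in the SLURM_NTASKS environment variable. The helper name mpi_procs_from_slurm is hypothetical and not part of Cerberus.

```python
import os


def mpi_procs_from_slurm(environ=None, default=1):
    """Return the SLURM task count as an int, or `default` if it is not set.

    mpi_procs_from_slurm is a hypothetical helper used only in this sketch.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("SLURM_NTASKS")
    if not value:
        # Not running under SLURM, or --ntasks was not given.
        return default
    return int(value)


n_procs = mpi_procs_from_slurm()
print(f"MPI processes to spawn: {n_procs}")

# With a Cerberus Solver object available (not constructed here),
# the value would then be passed on as in the call shown above:
# Solver.initialize(use_MPI=True, n_MPI_procs_to_spawn=n_procs)
```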
Cerberus can then be run (on SLURM) with a setup similar to the one below:
#!/bin/bash
#SBATCH --cpus-per-task=20
#SBATCH --ntasks=4
#SBATCH --partition=core40
#SBATCH -o output.txt
module load mpi/openmpi-x86_64
mpirun --report-bindings -np 1 --bind-to none -oversubscribe python > cerberus_output.txt
The idea here is the following:
- Cerberus itself does not use any resources while a Solver is running, so we can -oversubscribe and let the MPI parallelized solver also use the resources allocated to Cerberus.
- To use -oversubscribe successfully, the binding of tasks to sockets or cores needs to be disabled with --bind-to none.
- Only one Cerberus task is started (-np 1).
- In the future, a separate --host <hostname> argument will be added to Solvers to allow them to connect to Cerberus across nodes.
- Socket communication across nodes is significantly slower than on the same node.
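The launcher options discussed above can also be assembled programmatically, for instance when job scripts are generated from Python. This is only a sketch: build_mpirun_command and the file name my_input.py are hypothetical, and the flags are exactly those used in the job script above.

```python
import shlex


def build_mpirun_command(python_args):
    """Return the mpirun invocation for a single Cerberus task as an argument list.

    `python_args` holds whatever is passed to python; it is a hypothetical
    parameter of this sketch, not something defined by Cerberus.
    """
    return [
        "mpirun",
        "--report-bindings",   # print how tasks were bound, useful for debugging
        "-np", "1",            # start only one Cerberus task
        "--bind-to", "none",   # disable binding so oversubscription works
        "-oversubscribe",      # let the solver reuse the resources held by Cerberus
        "python",
        *python_args,
    ]


if __name__ == "__main__":
    # my_input.py is a placeholder name for your own Cerberus input script.
    print(shlex.join(build_mpirun_command(["my_input.py"])))
```

Keeping the command as a list avoids shell quoting problems if it is later handed to subprocess.run.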