Heisenbug
heisenbug is a computing cluster of the computational seismology group at LMU. It is an AMD EPYC based machine with 128 cores that can run 256 hardware threads concurrently. It also has 2 GPGPUs (NVIDIA GeForce RTX 3090) that can be used to run the GPU version of SeisSol. The RTX 3090 is a consumer-grade graphics card and therefore performs poorly in double precision, so it is preferable to compile SeisSol with single precision.
A module integrating all libraries relevant for SeisSol-GPU is preinstalled at /export/dump/ravil/modulefiles.
It can be discovered at startup after adding the following to ~/.bashrc:
module_hpcsdk=/export/dump/ravil/modulefiles
export MODULEPATH=$MODULEPATH:${module_hpcsdk}
It is then loaded with:
module load seissol-env-gcc-11.1.0
Install YATeTo GPU backends (i.e., GemmForge and ChainForge) as shown here.
Then clone SeisSol with:
git clone https://github.com/SeisSol/SeisSol.git
cd SeisSol
git submodule update --init --recursive
To compile the GPU version of SeisSol on heisenbug, use the following CMake options: -DDEVICE_ARCH=sm_86 -DHOST_ARCH=hsw -DDEVICE_BACKEND=cuda -DPRECISION=single.
Use -DCOMMTHREAD=ON for multiple GPUs and -DCOMMTHREAD=OFF for one GPU.
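The choice of COMMTHREAD depends only on the number of GPUs, so the option string can be assembled by a small helper. The following is a minimal sketch, not part of SeisSol: seissol_cmake_opts is a hypothetical function name, and the trailing `..` assumes the build directory is a subdirectory of the SeisSol checkout. It only prints the configure command (a dry run).

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of SeisSol): print the CMake options
# listed above for a given number of GPUs (default: 1).
seissol_cmake_opts() {
    local ngpus="${1:-1}"
    local commthread="OFF"      # one GPU
    if [ "$ngpus" -gt 1 ]; then
        commthread="ON"         # multiple GPUs
    fi
    echo "-DDEVICE_ARCH=sm_86 -DHOST_ARCH=hsw -DDEVICE_BACKEND=cuda -DPRECISION=single -DCOMMTHREAD=${commthread}"
}

# Dry run: show the configure command for a 2-GPU build instead of executing it.
# The ".." source path is an assumption about the directory layout.
echo "cmake $(seissol_cmake_opts 2) .."
```

To actually configure, the printed command can be run from the build directory once the module has been loaded.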
As there is no queuing system on heisenbug, you need to make sure that nobody is running anything on the GPUs.
You can check that by running nvidia-smi (it should return "No running processes found").
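The same check can be scripted before launching a job. The sketch below assumes the query mode of nvidia-smi (`--query-compute-apps` with `--format=csv,noheader`), which prints one line per compute process and nothing when the GPUs are free. gpus_idle is a hypothetical helper name, and the messages are illustrative only.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read a process list on stdin (one process per line)
# and succeed only if it contains no non-blank line.
gpus_idle() {
    ! grep -q '[^[:space:]]'
}

# Only query the driver if nvidia-smi is available on this machine.
if command -v nvidia-smi >/dev/null 2>&1; then
    if nvidia-smi --query-compute-apps=pid,process_name --format=csv,noheader | gpus_idle; then
        echo "No compute processes found: the GPUs look free."
    else
        echo "The GPUs are in use: coordinate with the other user before starting."
    fi
fi
```

Note that this only reports compute processes visible at the moment of the query. It does not reserve the GPUs.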
To run on one GPU (here with order 4, elastic), use:
export OMP_NUM_THREADS=1
export OMP_PLACES="cores"
export OMP_PROC_BIND=spread
./launch ./SeisSol_RelWithDebInfo_ssm_86_cuda_6_elastic ./parameters.par
(launch is a simple bash helper script generated by CMake in the build directory.)
On 2 ranks, use:
export OMP_NUM_THREADS=1
export OMP_PLACES="cores"
export OMP_PROC_BIND=spread
mpirun -n 2 --map-by ppr:1:numa:pe=2 --report-bindings ./launch ./SeisSol_RelWithDebInfo_ssm_86_cuda_6_elastic ./parameters.par
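The one-GPU and two-rank invocations differ only in the mpirun prefix, so they can be produced by one small wrapper. This is a sketch under that assumption: seissol_run_cmd is a hypothetical helper name, it accepts only 1 or 2 ranks (matching the two GPUs of heisenbug), and it prints the command rather than running it.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the launch command shown above for 1 or 2 ranks.
# Usage: seissol_run_cmd <nranks> <executable> <parameter file>
seissol_run_cmd() {
    local nranks="$1" exe="$2" par="$3"
    case "$nranks" in
        1) echo "./launch ${exe} ${par}" ;;
        2) echo "mpirun -n 2 --map-by ppr:1:numa:pe=2 --report-bindings ./launch ${exe} ${par}" ;;
        *) echo "unsupported number of ranks: ${nranks}" >&2; return 1 ;;
    esac
}

# Same OpenMP settings as in the examples above.
export OMP_NUM_THREADS=1
export OMP_PLACES="cores"
export OMP_PROC_BIND=spread

# Dry run: print the two-rank command.
seissol_run_cmd 2 ./SeisSol_RelWithDebInfo_ssm_86_cuda_6_elastic ./parameters.par
```

Printing the command first makes it easy to inspect the binding options before starting a long run.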