Multi-GPU implementation of a time-explicit finite volume solver using CUDA and a CUDA-aware version of OpenMPI with application to shallow water flows

Publication:6156956

DOI: 10.1016/J.CPC.2021.108190; arXiv: 2010.14416; OpenAlex: W3206983896; MaRDI QID: Q6156956; FDO: Q6156956


Authors: Vincent Delmas, Azzeddine Soulaimani


Publication date: 19 June 2023

Published in: Computer Physics Communications

Abstract: This paper presents the development of a multi-GPU version of a time-explicit finite volume solver for the shallow-water equations (SWE). MPI is combined with CUDA Fortran in order to use as many GPUs as needed. The METIS library is used to perform a domain decomposition of the 2D unstructured triangular meshes of interest. A CUDA-aware version of OpenMPI is adopted to speed up message passing between the MPI processes. A study of both speed-up and efficiency is conducted: first for a classic dam-break flow in a canal, and then for two real domains with complex bathymetries, the Mille Îles River and the Montreal archipelago. In both cases, meshes with up to 13 million cells are used. Using 24 to 28 GPUs on these meshes leads to an efficiency of 80% or more. Finally, the multi-GPU version is compared to the pure MPI multi-CPU version, and it is concluded that, in this particular case, about 100 CPU cores would be needed to achieve the same performance as one GPU.
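
Note on the metrics: the abstract reports speed-up and efficiency without defining them. The sketch below assumes the usual strong-scaling definitions (speed-up as reference run time divided by run time on n GPUs, efficiency as speed-up divided by the ratio of GPU counts); the paper may define them differently. The function names and the timings in the usage lines are hypothetical illustrations, not values from the paper.

```python
def speedup(t_ref: float, t_n: float) -> float:
    """Strong-scaling speed-up: reference run time over run time on n GPUs."""
    if t_ref <= 0 or t_n <= 0:
        raise ValueError("run times must be positive")
    return t_ref / t_n


def efficiency(t_ref: float, t_n: float, n_gpus: int, n_ref: int = 1) -> float:
    """Parallel efficiency: speed-up divided by the increase in GPU count.

    n_ref is the number of GPUs used for the reference run (assumed 1 here).
    """
    if n_gpus <= 0 or n_ref <= 0:
        raise ValueError("GPU counts must be positive")
    return speedup(t_ref, t_n) * n_ref / n_gpus


# Hypothetical timings in seconds, chosen only to illustrate the arithmetic.
t_one_gpu = 1000.0
t_24_gpus = 50.0

s = speedup(t_one_gpu, t_24_gpus)
e = efficiency(t_one_gpu, t_24_gpus, n_gpus=24)
print(f"speed-up: {s:.1f}, efficiency: {e:.1%}")
```

Under these definitions, an efficiency of 80% on 24 GPUs corresponds to a speed-up of about 19 relative to a single GPU.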


Full work available at URL: https://arxiv.org/abs/2010.14416









Cited In (6)





