Hyperbolic diffusion in flux reconstruction: optimisation through kernel fusion within tensor-product elements


DOI: 10.1016/J.CPC.2021.108235; arXiv: 2107.14027; OpenAlex: W3215537021; Wikidata: Q114192735; Scholia: Q114192735; MaRDI QID: Q6159601; FDO: Q6159601

F. D. Witherden, Rob Watson, W. Trojak

Publication date: 20 June 2023

Published in: Computer Physics Communications

Abstract: Novel methods are presented in this initial study for the fusion of GPU kernels in the artificial compressibility method (ACM), using tensor-product elements with constant Jacobians and flux reconstruction. This is made possible through the hyperbolisation of the diffusion terms, which eliminates the expensive algorithmic steps needed to form the viscous stresses. Two fusion approaches are presented, which offer differing levels of parallelism. This is found to be necessary to accommodate the change in workload as the order of accuracy of the elements is increased. Several further optimisations of these approaches are demonstrated, including a generation-time memory manager which maximises resource usage. The fused kernels are able to achieve a speedup of 3-4 times, which compares favourably with a theoretical maximum speedup of 4. In three-dimensional test cases, the generated fused kernels are found to reduce total runtime by approximately 25%, and, when compared to the standard ACM formulation, simulations demonstrate that a speedup of 2.3 times can be achieved.


Full work available at URL: https://arxiv.org/abs/2107.14027











