Portable CPU implementation of Wilson, Brillouin and Susskind fermions in lattice QCD
Publication:6155490
DOI: 10.1016/J.CPC.2022.108555, arXiv: 2112.14640, OpenAlex: W4297142612, MaRDI QID: Q6155490, FDO: Q6155490
Authors: S. Dürr
Publication date: 5 June 2023
Published in: Computer Physics Communications
Abstract: A modern Fortran implementation of three Dirac operators (Wilson, Brillouin, Susskind) in lattice QCD is presented, based on OpenMP shared-memory parallelization and SIMD pragmas. The main idea is to apply a Dirac operator to several vectors simultaneously, to ease the memory bandwidth bottleneck. All index computations are left to the compiler, and maximum weight is given to portability and flexibility. The lattice volume, the number of colors, and the number of right-hand sides are parameters defined at compile time. Several memory layout options are compared. The code performs well on modern many-core architectures (480 Gflop/s, 880 Gflop/s, and 780 Gflop/s with […] for the three operators in single precision on a 72-core KNL processor; a […]-core Skylake node yields similar results). Explicit run-time tests with CG/BiCGstab inverters confirm that the memory layout is relevant for the KNL, but less so for the Skylake architecture. The ancillary code distribution contains all routines, including the single, double, and mixed precision Krylov space solvers, to render it self-contained and ready-to-use.
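Illustration: the multiple right-hand-side idea in the abstract can be sketched on a toy problem. The Python snippet below is not taken from the paper's Fortran distribution; it assumes a one-dimensional periodic lattice with U(1) link phases, and every name in it (`make_links`, `apply_single`, `apply_multi`, `demo`) is hypothetical. It applies a simple hopping operator either to one vector at a time or to all vectors in one sweep, and counts how often a link variable is fetched in each case.

```python
import cmath


def make_links(nx):
    """Hypothetical U(1) link phases on a periodic 1D lattice (toy gauge field)."""
    return [cmath.exp(1j * 0.3 * (x + 1)) for x in range(nx)]


def apply_single(links, vec):
    """Apply the toy hopping operator to one vector.

    Returns the result and the number of link fetches.
    """
    nx = len(vec)
    out = [0j] * nx
    loads = 0
    for x in range(nx):
        xp, xm = (x + 1) % nx, (x - 1) % nx
        u_fwd = links[x]               # link from x to x+1
        u_bwd = links[xm].conjugate()  # link from x to x-1
        loads += 2
        out[x] = 2 * vec[x] - u_fwd * vec[xp] - u_bwd * vec[xm]
    return out, loads


def apply_multi(links, vecs):
    """Apply the same operator to all right-hand sides in one sweep.

    vecs[x][v]: site index outermost, right-hand-side index innermost,
    so each fetched link is reused for every v.
    """
    nx, nv = len(vecs), len(vecs[0])
    out = [[0j] * nv for _ in range(nx)]
    loads = 0
    for x in range(nx):
        xp, xm = (x + 1) % nx, (x - 1) % nx
        u_fwd = links[x]
        u_bwd = links[xm].conjugate()
        loads += 2                     # fetched once, used nv times
        for v in range(nv):
            out[x][v] = (2 * vecs[x][v]
                         - u_fwd * vecs[xp][v]
                         - u_bwd * vecs[xm][v])
    return out, loads


def demo(nx=8, nv=4):
    """Compare both variants; returns (max deviation, loads_single, loads_multi)."""
    links = make_links(nx)
    vecs = [[complex(x + 1, v) for v in range(nv)] for x in range(nx)]
    multi, loads_multi = apply_multi(links, vecs)
    loads_single, max_diff = 0, 0.0
    for v in range(nv):
        column = [vecs[x][v] for x in range(nx)]
        single, loads = apply_single(links, column)
        loads_single += loads
        diff = max(abs(single[x] - multi[x][v]) for x in range(nx))
        max_diff = max(max_diff, diff)
    return max_diff, loads_single, loads_multi


if __name__ == "__main__":
    print(demo())
```

In this sketch both variants give the same vectors, but the one-sweep version fetches each link once per site instead of once per site and per right-hand side. That reuse is the effect the abstract refers to as easing the memory bandwidth bottleneck; in a compiled code the innermost loop over right-hand sides would also be the natural target for SIMD vectorization.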
Full work available at URL: https://arxiv.org/abs/2112.14640
Recommendations
- SU(2) lattice gauge theory simulations on Fermi GPUs
- Towards lattice quantum chromodynamics on FPGA devices
- Lattice QCD with Dynamical Wilson Fermions
- scientific article; zbMATH DE number 4199238
- A computational system for lattice QCD with overlap Dirac quarks
- Performance of lattice QCD programs on CP-PACS
Cites Work
- Accelerating scientific computations with mixed precision algorithms
- Methods of conjugate gradients for solving linear systems
- Parallel iterative methods for sparse linear systems
- Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems
- Lattice QCD based on OpenCL
- Multi-mass solvers for lattice QCD on GPUs
- Numerical techniques for lattice QCD in the \(\epsilon \)-regime
Cited In (2)