Towards dense linear algebra for hybrid GPU accelerated manycore systems
DOI: 10.1016/J.PARCO.2009.12.005 | zbMATH Open: 1204.68268 | OpenAlex: W2162322364 | MaRDI QID: Q991102 | FDO: Q991102
Marc Baboulin, Jack Dongarra, Stanimire Tomov
Publication date: 2 September 2010
Published in: Parallel Computing
Full work available at URL: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.214.5312
parallel algorithms; graphics processing units; multicore processors; dense linear algebra; hybrid computing
Parallel numerical computation (65Y05); Numerical linear algebra (65F99); Parallel algorithms in computer science (68W10); Computer system organization (68M99)
Cites Work
- LAPACK Users' Guide
- GEMM-based level 3 BLAS
- Communication-optimal parallel and sequential QR and LU factorizations
- Accuracy and Stability of Numerical Algorithms
- Minimizing Communication in Numerical Linear Algebra
- Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy
- Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing
- Out-of-core solution of linear systems on graphics processors
Cited In (29)
- Simulating Low Precision Floating-Point Arithmetic
- Adapting Regularized Low-Rank Models for Parallel Architectures
- Mixed-precision iterative refinement using tensor cores on GPUs to accelerate solution of linear systems
- Direct numerical simulations of turbulent reacting flows with shock waves and stiff chemistry using many-core/GPU acceleration
- ELSI -- an open infrastructure for electronic structure solvers
- A linear algebra method to decompose forms whose length is lower than the number of variables into weighted sum of squares
- GPU-acceleration of the ELPA2 distributed eigensolver for dense symmetric and Hermitian eigenproblems
- ARKODE: a flexible IVP solver infrastructure for one-step methods
- A LAPACK implementation of the dynamic mode decomposition
- Productivity, performance, and portability for computational fluid dynamics applications
- A parallel computing method using blocked format with optimal partitioning for SpMV on GPU
- Quantum circuits synthesis using Householder transformations
- GPU acceleration of all-electron electronic structure theory using localized numeric atom-centered basis functions
- Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing
- Exploiting Lower Precision Arithmetic in Solving Symmetric Positive Definite Linear Systems and Least Squares Problems
- A new approach to the lattice Boltzmann method for graphics processing units
- GPU accelerated computation of the isogeometric analysis stiffness matrix
- Randomized GPU Algorithms for the Construction of Hierarchical Matrices from Matrix-Vector Operations
- Using Random Butterfly Transformations to Avoid Pivoting in Sparse Direct Methods
- A new era in scientific computing: domain decomposition methods in hybrid CPU-GPU architectures
- Title not available
- DG-IMEX method for a two-moment model for radiation transport in the \(\mathcal{O}(v/c)\) limit
- Extending the length and time scales of Gram-Schmidt Lyapunov vector computations
- GPU parameter tuning for tall and skinny dense linear least squares problems
- Computing Least Squares Condition Numbers on Hybrid Multicore/GPU Systems
- A heterogeneous parallel LU factorization algorithm based on a basic column block uniform allocation strategy
- GPU accelerated Newton for Taylor series solutions of polynomial homotopies in multiple double precision
- Direct numerical simulations of reacting flows with detailed chemistry using many-core/GPU acceleration
- GPU-acceleration of stiffness matrix calculation and efficient initialization of EFG meshless methods
Recommendations
- Title not available
- Parallel Algorithms for Dense Linear Algebra Computations
- Accelerating iterative linear solvers using multiple graphical processing units
- Accelerating GPU Kernels for Dense Linear Algebra
- Linear Systems Solvers for Distributed-Memory Machines with GPU Accelerators
- Accelerating Numerical Dense Linear Algebra Calculations with GPUs
- Dense linear algebra kernels on heterogeneous platforms: Redistribution issues
- Heterogenous Acceleration for Linear Algebra in Multi-coprocessor Environments