Linear systems solvers for distributed-memory machines with GPU accelerators
Recommendations
- Accelerating preconditioned iterative linear solvers on GPU
- Performance models and workload distribution algorithms for optimizing a hybrid CPU-GPU multifrontal solver
- Effective minimally-invasive GPU acceleration of distributed sparse matrix factorization
- Exposing fine-grained parallelism in algebraic multigrid methods
- A data-parallel ILUPACK for sparse general and symmetric indefinite linear systems
Cites work
- scientific article; zbMATH DE number 2089172
- A high performance QDWH-SVD solver using hardware accelerators
- A recursive formulation of Cholesky factorization of a matrix in packed storage
- Parallel and cache-efficient in-place matrix storage format conversion
- ScaLAPACK Users' Guide
- Scaling LAPACK panel operations using parallel cache assignment
Cited in (6)
- A condensation-based application of Cramer's rule for solving large-scale linear systems
- Considerations on the Implementation and Use of Anderson Acceleration on Distributed Memory and GPU-based Parallel Computers
- Performance models and workload distribution algorithms for optimizing a hybrid CPU-GPU multifrontal solver
- A flexible CUDA LU-based solver for small, batched linear systems
- Towards dense linear algebra for hybrid GPU accelerated manycore systems
- An error correction solver for linear systems: evaluation of mixed precision implementations
This page was built for publication: Linear systems solvers for distributed-memory machines with GPU accelerators
MaRDI item Q3297578