MAGMA
From MaRDI portal
Software:24666
swMATH: 12741; MaRDI QID: Q24666; FDO: Q24666
Author name not available
Cited In (48)
- Programming the finite element method
- Sparse Matrix-Vector Multiplication on GPGPUs
- A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators
- H2Opus: a distributed-memory multi-GPU software package for non-local operators
- KBLAS
- Implementing High-performance Complex Matrix Multiplication via the 3m and 4m Methods
- Multi-GPU implementation of the lattice Boltzmann method
- Interoperable executive library for the simulation of biomedical processes
- Scientific computations on multi-core systems using different programming frameworks
- Evaluation of selected resource allocation and scheduling methods in heterogeneous many-core processors and graphics processing units
- ViennaCL - linear algebra library for multi- and many-core architectures
- BLIS: a framework for rapidly instantiating BLAS functionality
- Efficient determination of the Markovian time-evolution towards a steady-state of a complex open quantum system
- Divide and conquer on hybrid GPU-accelerated multicore systems
- Parallel hierarchical hybrid linear solvers for emerging computing platforms
- A parallel algorithm for calculation of determinants and minors using arbitrary precision arithmetic
- A High Performance QDWH-SVD Solver Using Hardware Accelerators
- Solving a Large-Scale Thermal Radiation Problem Using an Interoperable Executive Library Framework on Petascale Supercomputers
- An inertia-free filter line-search algorithm for large-scale nonlinear programming
- Accelerating the Solution of Linear Systems by Iterative Refinement in Three Precisions
- Hiding Global Communication Latency in the GMRES Algorithm on Massively Parallel Machines
- Algorithm 953: Parallel library software for the multishift QR algorithm with aggressive early deflation
- \(\mathcal H\)-LU factorization on many-core systems
- Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing
- A parallel auxiliary grid algebraic multigrid method for graphic processing units
- Exploiting Symmetry in Tensors for High Performance: Multiplication with Symmetric Tensors
- A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines
- A distributed and incremental SVD algorithm for agglomerative data analysis on large networks
- Hierarchical algorithms on hierarchical architectures
- PLASMA
- Design of a Multicore Sparse Cholesky Factorization Using DAGs
- An efficient approach to solve very large dense linear systems with verified computing on clusters
- Exact likelihood-free Markov chain Monte Carlo for elliptically contoured distributions
- Implementing Multifrontal Sparse Solvers for Multicore Architectures with Sequential Task Flow Runtime Systems
- Extending the length and time scales of Gram-Schmidt Lyapunov vector computations
- Parallel direct methods for solving the system of linear equations with pipelining on a multicore using OpenMP
- Accelerating GPU Kernels for Dense Linear Algebra
- Computing Least Squares Condition Numbers on Hybrid Multicore/GPU Systems
- An efficient multicore implementation of a novel HSS-structured multifrontal solver using randomized sampling
- The parallel tiled WZ factorization algorithm for multicore architectures
- Solving a large scale radiosity problem on GPU-based parallel computers
- High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster
- Redesigning triangular dense matrix computations on GPUs
- LU factorization on heterogeneous systems: an energy-efficient approach towards high performance
- Numerical analysis of parallel implementation of the reorthogonalized ABS methods
- Hybrid algorithms for solving the algebraic eigenvalue problem with sparse matrices
- Experiments with sparse Cholesky using a sequential task-flow implementation
- Particle filtering: the need for speed