MAGMA - MaRDI portal

MaRDI QID: Q24666 · swMATH · FDO

Official website: http://icl.cs.utk.edu/magma/

Cited in

(only showing first 100 items)

Parallel direct methods for solving the system of linear equations with pipelining on a multicore using OpenMP
Sparse matrix-vector multiplication on GPGPUs
H2Opus
SGEMM
Hiding global communication latency in the GMRES algorithm on massively parallel machines
The parallel tiled WZ factorization algorithm for multicore architectures
An efficient multicore implementation of a novel HSS-structured multifrontal solver using randomized sampling
Accelerating the solution of linear systems by iterative refinement in three precisions
Solving a large scale radiosity problem on GPU-based parallel computers
High-order finite-element seismic wave propagation modeling with MPI on a large GPU cluster
Redesigning triangular dense matrix computations on GPUs
LU factorization on heterogeneous systems: an energy-efficient approach towards high performance
Hybrid algorithms for solving the algebraic eigenvalue problem with sparse matrices
Numerical analysis of parallel implementation of the reorthogonalized ABS methods
Design of a multicore sparse Cholesky factorization using DAGs
Experiments with sparse Cholesky using a sequential task-flow implementation
Solving a large-scale thermal radiation problem using an interoperable executive library framework on petascale supercomputers
Particle filtering: the need for speed
PLASMA: Parallel linear algebra software for multicore using OpenMP
Programming the finite element method
A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators
Multi-GPU implementation of the lattice Boltzmann method
H2Opus: a distributed-memory multi-GPU software package for non-local operators
Implementing High-performance Complex Matrix Multiplication via the 3m and 4m Methods
Interoperable executive library for the simulation of biomedical processes
Scientific computations on multi-core systems using different programming frameworks
Evaluation of selected resource allocation and scheduling methods in heterogeneous many-core processors and graphics processing units
ViennaCL-linear algebra library for multi- and many-core architectures
Efficient determination of the Markovian time-evolution towards a steady-state of a complex open quantum system
BLIS: a framework for rapidly instantiating BLAS functionality
Divide and conquer on hybrid GPU-accelerated multicore systems
A parallel algorithm for calculation of determinants and minors using arbitrary precision arithmetic
Parallel hierarchical hybrid linear solvers for emerging computing platforms
Exploiting symmetry in tensors for high performance: multiplication with symmetric tensors
An inertia-free filter line-search algorithm for large-scale nonlinear programming
An efficient approach to solve very large dense linear systems with verified computing on clusters.
Algorithm 953: Parallel library software for the multishift QR algorithm with aggressive early deflation
SLATE
H-LU factorization on many-core systems
Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing
Computing least squares condition numbers on hybrid multicore/GPU systems
Accelerating GPU kernels for dense linear algebra
A parallel auxiliary grid algebraic multigrid method for graphic processing units
KBLAS: an optimized library for dense matrix-vector multiplication on GPU accelerators
FLAME
PaStiX
ScaLAPACK
VOLSCAT
LAWRA
PLAPACK
Algorithm 826
CALU
RScaLAPACK
CUBLAS
MKL
Elemental
POOCLAPACK
OpenCL
SOLAR
Cellss
SBR Toolbox
PLASMA
clSpMV
Algorithm 880
CULA
LogGOPSim
NaSt3DGPF
MR3-SMP
HSL_MA87
IEL
STREAM benchmark
HSL_MA79
QUARK
SpGEMM
StarPU
FastFlow
CUMP
GPUprec
MPIGMP
SWARM
Wool
MINMOD
Algorithm 953
KBLAS
Tcmalloc
yaSpMV
DDSCAT
PBLAS
Algorithm 656
CLBlast
CLTune
DAGuE
SuperMatrix
NFM-DS
AdELL
BiELL
CoAdELL
CSR5
moderngpu

This page was built for software: MAGMA