Parallel matrix multiplication: a systematic journey
From MaRDI portal
Recommendations
- Communication lower bounds for distributed-memory matrix multiplication
- Parallel complexity of matrix multiplication
- Computational Science - ICCS 2004
- Scalable parallel matrix multiplication on distributed memory parallel computers
- A new parallel matrix multiplication algorithm on distributed-memory concurrent computers
Cites work
- scientific article; zbMATH DE number 6118211
- An inequality related to the isoperimetric inequality
- Anatomy of high-performance matrix multiplication
- Communication and matrix computations on large message passing systems
- Communication lower bounds and optimal algorithms for numerical linear algebra
- Communication lower bounds for distributed-memory matrix multiplication
- Elemental, a new framework for distributed memory dense matrix computations
- FLAME
- LAPACK Users' Guide
- Memory-efficient matrix multiplication in the BSP model
- Programming matrix algorithms-by-blocks for thread-level parallelism
- The Torus-Wrap Mapping for Dense Matrix Calculations on Massively Parallel Computers
Cited in (15)
- Parallelization of the exponential time differencing method for solving diffuse-interface models
- Massively parallel sparse matrix function calculations with NTPoly
- Exploiting multiple levels of parallelism in sparse matrix-matrix multiplication
- Parallel complexity of matrix multiplication
- Scalable parallel matrix multiplication on distributed memory parallel computers
- scientific article; zbMATH DE number 1275860
- scientific article; zbMATH DE number 697776
- A flexible multicomputer algorithm for elementary matrix operations
- The problem of small and large matrices in parallel Matrix Multiplication
- Task-based parallel programming for scalable matrix product algorithms
- Computational Science - ICCS 2004
- Elemental, a new framework for distributed memory dense matrix computations
- Communication lower bounds for distributed-memory matrix multiplication
- A test for the absence of aliasing or white noise in two-dimensional locally stationary wavelet processes
- A distributed block Chebyshev-Davidson algorithm for parallel spectral clustering