The Use of BLAS3 in Linear Algebra on a Parallel Processor with a Hierarchical Memory
Publication:3769849
DOI: 10.1137/0908086; zbMath: 0632.65024; MaRDI QID: Q3769849
Ulrike Meier, William Jalby, Kyle A. Gallivan
Publication date: 1987
Published in: SIAM Journal on Scientific and Statistical Computing
Full work available at URL: https://doi.org/10.1137/0908086
parallel computing; block algorithms; Gram-Schmidt algorithm; cache management; LU-decomposition; BLAS3; ALLIANT FX/8; small cache size
65Y05: Parallel numerical computation
68P05: Data structures
65F05: Direct numerical methods for linear systems and matrix inversion
65F25: Orthogonalization in numerical linear algebra
Related Items
Block conjugate gradient algorithms for least squares problems
Adaptive blocking in the QR factorization
Vector processing in simplex and interior methods for linear programming
A pseudospectral matrix element method for solution of three-dimensional incompressible flows and its parallel implementation
An overview of parallel algorithms for the singular value and symmetric eigenvalue problems
Block-Cholesky for parallel processing
A locally optimized reordering algorithm and its application to a parallel sparse linear system solver
Fast parallel solution of the Poisson equation on irregular domains
Designing linear algebra algorithms on the IBM 3090 vector multiprocessor with a hierarchical memory system
Gaussian variant of Freivalds' algorithm for efficient and reliable matrix product verification