A set of level 3 basic linear algebra subprograms

DOI: 10.1145/77626.79170; zbMath: 0900.65115; OpenAlex: W2002257715; Wikidata: Q56455009; Scholia: Q56455009; MaRDI QID: Q4371637

Jeremy J. du Croz, Sven J. Hammarling, Jack J. Dongarra, Iain S. Duff

Publication date: 23 March 1998

Published in: ACM Transactions on Mathematical Software

Full work available at URL: http://www.acm.org/pubs/contents/journals/toms/1990-16/

zbMATH Keywords

verification; robustness; reliability; efficiency; testing; portability; certification; matrix-matrix operations

Mathematics Subject Classification ID

Complexity and performance of numerical algorithms (65Y20)

Related Items (first 100 items shown)

A parallel R-matrix program PRMAT for electron-atom and electron-ion scattering calculations
Towards an efficient use of the BLAS library for multilinear tensor contractions
Object-oriented programming in control system design: A survey
A new parallel sparse direct solver: Presentation and numerical experiments in large-scale structural mechanics parallel computing
Interior-point solver for large-scale quadratic programming problems with bound constraints
PROFIL/BIAS - A fast interval library
Sparse Matrix Methods for Circuit Simulation Problems
Unnamed Item
An implicitly restarted block Lanczos bidiagonalization method using Leja shifts
Stabilizing canonical-ensemble calculations in the auxiliary-field Monte Carlo method
Performance models and workload distribution algorithms for optimizing a hybrid CPU-GPU multifrontal solver
Basis selection in LOBPCG
The design of a parallel dense linear algebra software library: Reduction to Hessenberg, tridiagonal, and bidiagonal form
Nonlinear eigenvalue and frequency response problems in industrial practice
A block representation for products of hyperbolic Householder transforms
Parallel benchmarks of turbulence in complex geometries
Explicit parallel block Cholesky algorithms on the CRAY APP
High performance solution of partial differential equations discretized using a Chebyshev spectral collocation method
Parallel solution of almost block diagonal systems on a hypercube
An efficient approach to solve very large dense linear systems with verified computing on clusters
Sparse matrix factorization in the implicit finite element method on petascale architecture
Efficient algorithm for proper orthogonal decomposition of block-structured adaptively refined numerical simulations
A sparse nonsymmetric eigensolver for distributed memory architectures
Optimal size of the block in block GMRES on GPUs: computational model and experiments
Factorizing the factorization -- a spectral-element solver for elliptic equations with linear operation count
Reorthogonalized block classical Gram-Schmidt
Computer algebra systems - new strategies and techniques
Full multi grid method for electric field computation in point-to-plane streamer discharge in air at atmospheric pressure
A \(\mu\)-mode BLAS approach for multidimensional tensor-structured problems
Enhancing Performance and Robustness of ILU Preconditioners by Blocking and Selective Transposition
A highly efficient implementation of a backpropagation learning algorithm using matrix ISA
Efficient algorithms for the discrete Gabor transform with a long FIR window
Codes for almost block diagonal systems
Rank-profile revealing Gaussian elimination and the CUP matrix decomposition
Upper and lower I/O bounds for pebbling \(r\)-pyramids
A multiscale method for model order reduction in PDE parameter estimation
A domain-decomposing parallel sparse linear system solver
Fast interval matrix multiplication
Sparse direct factorizations through unassembled hyper-matrices
Look-ahead in the two-sided reduction to compact band forms for symmetric eigenvalue problems and the SVD
Unnamed Item
Deriving dense linear algebra libraries
Solving sequences of generalized least-squares problems on multi-threaded architectures
Cholesky and Gram-Schmidt Orthogonalization for Tall-and-Skinny QR Factorizations on Graphics Processors
An efficient out-of-core multifrontal solver for large-scale unsymmetric element problems
Approximate eigenvectors as preconditioner
New parallel sparse direct solvers for multicore architectures
Parallel implementation of a multilevel modelling package
Performance evaluation of supercomputers using HPCC and IMB benchmarks
Block-Cholesky for parallel processing
Solving large dense systems of linear equations on systems with virtual memory and with cache
A sparse proximal implementation of the LP dual active set algorithm
Dual multilevel optimization
Comparisons of Gaussian elimination algorithms on a Cray Y-MP
Augmented block Householder Arnoldi method
Parallel solution of almost block diagonal systems on the CRAY Y-MP using level 3 BLAS
High performance BLAS formulation of the multipole-to-local operator in the fast multipole method
VBARMS: a variable block algebraic recursive multilevel solver for sparse linear systems
Fast inclusion of interval matrix multiplication
From steady solutions to chaotic flows in a Rayleigh-Bénard problem at moderate Rayleigh numbers
Efficient use of sparsity by direct solvers applied to 3D controlled-source EM problems
Lattice quantum hadrodynamics on a CRAY Y-MP
Performance of parallel Cholesky factorization algorithms using BLAS
A massively-parallel electronic-structure calculations based on real-space density functional theory
Efficient iterative algorithms for the stochastic finite element method with application to acoustic scattering
Accelerating scientific computations with mixed precision algorithms
High performance BLAS formulation of the adaptive fast multipole method
Solving stable Sylvester equations via rational iterative schemes
A mathematical model of the static pantograph/catenary interaction
Diffusion forecasting model with basis functions from QR-decomposition
Solving path problems on the GPU
RECSY and SCASY Library Software: Recursive Blocked and Parallel Algorithms for Sylvester-Type Matrix Equations with Some Applications
LAPACK-Based Condition Estimates for the Discrete-Time LQG Design
Reproducibility strategies for parallel preconditioned conjugate gradient
Multifrontal Computations on GPUs and Their Multi-core Hosts
The parallel tiled WZ factorization algorithm for multicore architectures
Using dual techniques to derive componentwise and mixed condition numbers for a linear function of a linear least squares solution
Efficient algorithm for simultaneous reduction to the \(m\)-Hessenberg-triangular-triangular form
BLIS: A Framework for Rapidly Instantiating BLAS Functionality
Reliable Generation of High-Performance Matrix Algebra
Block reduction of matrices to condensed forms for eigenvalue computations
Designing linear algebra algorithms on the IBM 3090 vector multiprocessor with a hierarchical memory system
Self-Stabilizing Prefix Tree Based Overlay Networks
Gmsh: A 3-D finite element mesh generator with built-in pre- and post-processing facilities
A parallel Davidson-type algorithm for several eigenvalues
Multifrontal parallel distributed symmetric and unsymmetric solvers
ScaLAPACK: A portable linear algebra library for distributed memory computers -- design issues and performance
High-performance computing -- an overview
A review of frontal methods for solving linear systems
Mathematical software: Past, present, and future
Numerical algorithm delivery mechanisms
A frontal solver for the 21st century
Evaluating recursive filters on distributed memory parallel computers
\(QR\)-like algorithms for eigenvalue problems
Numerical linear algebra algorithms and software
The impact of high-performance computing in the solution of linear systems: Trends and problems
Nodal high-order methods on unstructured grids. I: Time-domain solution of Maxwell's equations
Unnamed Item
A block variant of the GMRES method for unsymmetric linear systems
STRFLO: A program for time-independent calculations of multiphoton processes in one-electron atomic systems. I: Quasienergy spectra and angular distributions

Uses Software

BLAS
