| Publication | Date of Publication | Type |
|---|---|---|
| Generalizing Random Butterfly Transforms to Arbitrary Matrix Sizes | 2023-12-14 | Paper |
| Accelerating Computation of Eigenvectors in the Dense Nonsymmetric Eigenvalue Problem | 2022-12-09 | Paper |
| Heterogenous Acceleration for Linear Algebra in Multi-coprocessor Environments | 2022-12-09 | Paper |
| Mixed-Precision Orthogonalization Scheme and Adaptive Step Size for Improving the Stability and Performance of CA-GMRES on GPUs | 2022-12-09 | Paper |
| A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines. *ACM Transactions on Mathematical Software* | 2022-02-01 | Paper |
| Mixed-precision iterative refinement using tensor cores on GPUs to accelerate solution of linear systems. *Proceedings of the Royal Society A: Mathematical, Physical and Engineering Sciences* | 2021-10-29 | Paper |
| Numerical algorithms for high-performance computational science. *Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences* | 2021-06-15 | Paper |
| Improving the Performance of the GMRES Method using Mixed-Precision Techniques | 2020-11-03 | Paper |
| Linear systems solvers for distributed-memory machines with GPU accelerators. *Lecture Notes in Computer Science* | 2020-07-20 | Paper |
| The singular value decomposition: anatomy of optimizing an algorithm for extreme scale. *SIAM Review* | 2018-11-12 | Paper |
| Stability and Performance of Various Singular Value QR Implementations on Multicore CPU with a GPU. *ACM Transactions on Mathematical Software* | 2018-07-20 | Paper |
| ParILUT---A New Parallel Threshold ILU Factorization. *SIAM Journal on Scientific Computing* | 2018-07-18 | Paper |
| High-performance matrix-matrix multiplications of very small matrices | 2018-01-11 | Paper |
| Highly Scalable Self-Healing Algorithms for High Performance Scientific Computing. *IEEE Transactions on Computers* | 2017-08-08 | Paper |
| Rectangular full packed format for Cholesky's algorithm: factorization, solution, and inversion. *ACM Transactions on Mathematical Software* | 2017-05-19 | Paper |
| Updating incomplete factorization preconditioners for model order reduction. *Numerical Algorithms* | 2016-11-18 | Paper |
| Linear algebra software for large-scale accelerated multicore computing. *Acta Numerica* | 2016-07-08 | Paper |
| Accelerating numerical dense linear algebra calculations with GPUs. *Numerical Computations with GPUs* | 2015-07-03 | Paper |
| Mixed-Precision Cholesky QR Factorization and Its Case Studies on Multicore CPU with Multiple GPUs. *SIAM Journal on Scientific Computing* | 2015-06-10 | Paper |
| Communication-avoiding symmetric-indefinite factorization. *SIAM Journal on Matrix Analysis and Applications* | 2015-04-21 | Paper |
| Level-3 Cholesky factorization routines improve performance of many Cholesky algorithms. *ACM Transactions on Mathematical Software* | 2014-09-12 | Paper |
| High-performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures. *ACM Transactions on Mathematical Software* | 2014-09-12 | Paper |
| Accelerating Linear System Solutions Using Randomization Techniques. *ACM Transactions on Mathematical Software* | 2014-09-12 | Paper |
| Designing LU-QR hybrid solvers for performance and stability | 2014-01-21 | Paper |
| Changes in dense linear algebra kernels: decades-long perspective. *Solving the Schrödinger Equation* | 2013-09-26 | Paper |
| Toward a high performance tile divide and conquer algorithm for the dense symmetric eigenvalue problem. *SIAM Journal on Scientific Computing* | 2013-03-06 | Paper |
| scientific article; zbMATH DE number 6118200 | 2012-12-23 | Paper |
| High-performance computing systems: status and outlook. *Acta Numerica* | 2012-10-12 | Paper |
| Divide and conquer on hybrid GPU-accelerated multicore systems. *SIAM Journal on Scientific Computing* | 2012-08-23 | Paper |
| High-performance high-resolution semi-Lagrangian tracer transport on a sphere. *Journal of Computational Physics* | 2011-12-28 | Paper |
| Computing the conditioning of the components of a linear least-squares solution. *Numerical Linear Algebra with Applications* | 2011-06-29 | Paper |
| Computing the conditioning of the components of a linear least-squares solution. *Numerical Linear Algebra with Applications* | 2011-06-29 | Paper |
| Towards an efficient tile matrix inversion of symmetric positive definite matrices on multicore architectures. *Lecture Notes in Computer Science* | 2011-03-08 | Paper |
| A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators. *Lecture Notes in Computer Science* | 2011-03-08 | Paper |
| Accelerating GPU kernels for dense linear algebra. *Lecture Notes in Computer Science* | 2011-03-08 | Paper |
| Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing. *Parallel Computing* | 2010-11-26 | Paper |
| Accelerating scientific computations with mixed precision algorithms. *Computer Physics Communications* | 2010-10-28 | Paper |
| Towards dense linear algebra for hybrid GPU accelerated manycore systems. *Parallel Computing* | 2010-09-02 | Paper |
| Revisiting Matrix Product on Master-Worker Platforms. *International Journal of Foundations of Computer Science* | 2009-02-26 | Paper |
| Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy. *ACM Transactions on Mathematical Software* | 2008-12-21 | Paper |
| State-of-the-art eigensolvers for electronic structure calculations of large scale nano-systems. *Journal of Computational Physics* | 2008-07-29 | Paper |
| The Problem with the Linpack Benchmark Matrix Generator | 2008-06-30 | Paper |
| The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot. *Journal of Computational Physics* | 2007-05-23 | Paper |
| Large-Scale Scientific Computing. *Lecture Notes in Computer Science* | 2006-11-21 | Paper |
| Condition Numbers of Gaussian Random Matrices. *SIAM Journal on Matrix Analysis and Applications* | 2006-05-31 | Paper |
| Computational Science - ICCS 2004. *Lecture Notes in Computer Science* | 2005-12-23 | Paper |
| Computational Science – ICCS 2005. *Lecture Notes in Computer Science* | 2005-11-30 | Paper |
| Computational Science – ICCS 2005. *Lecture Notes in Computer Science* | 2005-11-30 | Paper |
| Euro-Par 2004 Parallel Processing. *Lecture Notes in Computer Science* | 2005-08-23 | Paper |
| scientific article; zbMATH DE number 2089146 | 2004-08-12 | Paper |
| scientific article; zbMATH DE number 2013552 | 2003-12-04 | Paper |
| scientific article; zbMATH DE number 1966253 | 2003-08-18 | Paper |
| scientific article; zbMATH DE number 1966257 | 2003-08-18 | Paper |
| scientific article; zbMATH DE number 1948419 | 2003-07-13 | Paper |
| Automatic translation of Fortran to JVM bytecode. *Concurrency and Computation: Practice and Experience* | 2003-03-25 | Paper |
| Key concepts for parallel out-of-core LU factorization. *Computers & Mathematics with Applications* | 2003-03-19 | Paper |
| NetBuild: transparent cross‐platform access to computational software libraries. *Concurrency and Computation: Practice and Experience* | 2003-02-20 | Paper |
| Innovations of the NetSolve Grid Computing System. *Concurrency and Computation: Practice and Experience* | 2003-02-20 | Paper |
| The design and implementation of the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization routines | 2003-02-04 | Paper |
| HARNESS fault tolerant MPI design, usage and performance issues. *Future Generation Computer Systems* | 2003-01-21 | Paper |
| Middleware for the use of storage in communication. *Parallel Computing* | 2003-01-21 | Paper |
| scientific article; zbMATH DE number 1849116 | 2003-01-06 | Paper |
| scientific article; zbMATH DE number 1844537 | 2002-12-12 | Paper |
| scientific article; zbMATH DE number 1792117 | 2002-08-28 | Paper |
| Static tiling for heterogeneous computing platforms. *Parallel Computing* | 2002-07-25 | Paper |
| Clusters and computational grids for scientific computing. *Parallel Computing* | 2002-07-14 | Paper |
| Telescoping languages: A strategy for automatic generation of scientific problem-solving systems from annotated libraries. *Journal of Parallel and Distributed Computing* | 2002-07-04 | Paper |
| scientific article; zbMATH DE number 1729262 | 2002-07-02 | Paper |
| scientific article; zbMATH DE number 1760106 | 2002-06-25 | Paper |
| scientific article; zbMATH DE number 1728286 | 2002-04-15 | Paper |
| HARNESS and fault tolerant MPI. *Parallel Computing* | 2002-03-03 | Paper |
| LAPACK95 user's guide. *Software - Environments - Tools* | 2002-02-18 | Paper |
| Automated empirical optimizations of software and the ATLAS project. *Parallel Computing* | 2001-08-20 | Paper |
| Numerical linear algebra algorithms and software. *Journal of Computational and Applied Mathematics* | 2000-12-19 | Paper |
| scientific article; zbMATH DE number 1419220 | 2000-07-20 | Paper |
| scientific article; zbMATH DE number 1404622 | 2000-06-25 | Paper |
| scientific article; zbMATH DE number 1424358 | 2000-03-23 | Paper |
| scientific article; zbMATH DE number 1810081 | 2000-01-01 | Paper |
| A Parallel Divide and Conquer Algorithm for the Symmetric Eigenvalue Problem on Distributed Memory Architectures. *SIAM Journal on Scientific Computing* | 1999-11-24 | Paper |
| Using agent-based software for scientific computing in the NetSolve system. *Parallel Computing* | 1999-01-12 | Paper |
| Numerical Linear Algebra for High-Performance Computers | 1998-10-05 | Paper |
| Key concepts for parallel out-of-core LU factorization. *Parallel Computing* | 1998-07-23 | Paper |
| A set of level 3 basic linear algebra subprograms. *ACM Transactions on Mathematical Software* | 1998-03-23 | Paper |
| Algorithm 679: A set of level 3 basic linear algebra subprograms: model implementation and test programs. *ACM Transactions on Mathematical Software* | 1998-03-23 | Paper |
| Algorithm 710: FORTRAN subroutines for computing the eigenvalues and eigenvectors of a general matrix by reduction to general tridiagonal form. *ACM Transactions on Mathematical Software* | 1998-02-09 | Paper |
| Software distribution using Xnetlib. *ACM Transactions on Mathematical Software* | 1998-01-26 | Paper |
| Software Libraries for Linear Algebra Computations on High Performance Computers. *SIAM Review* | 1997-11-02 | Paper |
| Chebyshev tau-QZ algorithm methods for calculating spectra of hydrodynamic stability problems. *Applied Numerical Mathematics* | 1997-08-14 | Paper |
| A parallel algorithm for the reduction of a nonsymmetric matrix to block upper-Hessenberg form. *Parallel Computing* | 1997-02-28 | Paper |
| Parallel matrix transpose algorithms on distributed memory concurrent computers. *Parallel Computing* | 1997-02-28 | Paper |
| Algorithmic bombardment for the iterative solution of linear systems: A poly-iterative approach. *Journal of Computational and Applied Mathematics* | 1997-01-07 | Paper |
| scientific article; zbMATH DE number 924427 | 1996-09-05 | Paper |
| scientific article; zbMATH DE number 833758 | 1996-06-17 | Paper |
| The design of a parallel dense linear algebra software library: Reduction to Hessenberg, tridiagonal, and bidiagonal form. *Numerical Algorithms* | 1996-06-16 | Paper |
| scientific article; zbMATH DE number 819139 | 1996-03-05 | Paper |
| scientific article; zbMATH DE number 733664 | 1995-03-13 | Paper |
| The PVM concurrent computing system: Evolution, experiences, and trends. *Parallel Computing* | 1995-01-29 | Paper |
| scientific article; zbMATH DE number 556482 | 1994-12-04 | Paper |
| scientific article; zbMATH DE number 556483 | 1994-10-09 | Paper |
| scientific article; zbMATH DE number 434602 | 1993-11-15 | Paper |
| A Parallel Algorithm for the Nonsymmetric Eigenvalue Problem. *SIAM Journal on Scientific Computing* | 1993-11-11 | Paper |
| Reduction to condensed form for the eigenvalue problem on distributed memory architectures. *Parallel Computing* | 1993-01-17 | Paper |
| scientific article; zbMATH DE number 48972 | 1992-09-17 | Paper |
| Numerical Considerations in Computing Invariant Subspaces. *SIAM Journal on Matrix Analysis and Applications* | 1992-06-28 | Paper |
| scientific article; zbMATH DE number 4184914 | 1990-01-01 | Paper |
| Block reduction of matrices to condensed forms for eigenvalue computations. *Journal of Computational and Applied Mathematics* | 1989-01-01 | Paper |
| scientific article; zbMATH DE number 4153773 | 1988-01-01 | Paper |
| Algorithm 656: an extended set of basic linear algebra subprograms: model implementation and test programs. *ACM Transactions on Mathematical Software* | 1988-01-01 | Paper |
| An extended set of FORTRAN basic linear algebra subprograms. *ACM Transactions on Mathematical Software* | 1988-01-01 | Paper |
| Corrigenda: “An Extended Set of FORTRAN Basic Linear Algebra Subprograms”. *ACM Transactions on Mathematical Software* | 1988-01-01 | Paper |
| scientific article; zbMATH DE number 4058726 | 1988-01-01 | Paper |
| Programming methodology and performance issues for advanced computer architectures. *Parallel Computing* | 1988-01-01 | Paper |
| Tools to aid in the analysis of memory access patterns for FORTRAN programs. *Parallel Computing* | 1988-01-01 | Paper |
| scientific article; zbMATH DE number 4092648 | 1987-01-01 | Paper |
| A Fully Parallel Algorithm for the Symmetric Eigenvalue Problem. *SIAM Journal on Scientific and Statistical Computing* | 1987-01-01 | Paper |
| Solving banded systems on a parallel processor. *Parallel Computing* | 1987-01-01 | Paper |
| Implementation of some concurrent algorithms for matrix factorization. *Parallel Computing* | 1986-01-01 | Paper |
| Implementing Dense Linear Algebra Algorithms Using Multitasking on the CRAY X-MP-4 (or Approaching the Gigaflop). *SIAM Journal on Scientific and Statistical Computing* | 1986-01-01 | Paper |
| Squeezing the most out of eigenvalue solvers on high-performance computers. *Linear Algebra and its Applications* | 1986-01-01 | Paper |
| Linear algebra on high performance computers. *Applied Mathematics and Computation* | 1986-01-01 | Paper |
| On some parallel banded system solvers. *Parallel Computing* | 1985-01-01 | Paper |
| A collection of parallel linear equations routines for the Denelcor HEP. *Parallel Computing* | 1984-01-01 | Paper |
| scientific article; zbMATH DE number 3883499 | 1984-01-01 | Paper |
| Improving the Accuracy of Computed Eigenvalues and Eigenvectors. *SIAM Journal on Numerical Analysis* | 1983-01-01 | Paper |
| Improving the Accuracy of Computed Singular Values. *SIAM Journal on Scientific and Statistical Computing* | 1983-01-01 | Paper |
| Algorithm 589: SICEDR: A FORTRAN Subroutine for Improving the Accuracy of Computed Matrix Eigenvalues. *ACM Transactions on Mathematical Software* | 1982-01-01 | Paper |
| scientific article; zbMATH DE number 3748409 | 1980-01-01 | Paper |
| Unrolling loops in FORTRAN. *Software: Practice and Experience* | 1979-01-01 | Paper |
| Matrix eigensystem routines. EISPACK guide extension. *Lecture Notes in Computer Science* | 1977-01-01 | Paper |
| A hybrid Hermitian general eigenvalue solver (available as arXiv preprint) | N/A | Paper |