Publication | Date of Publication | Type |
---|---|---|
Generalizing Random Butterfly Transforms to Arbitrary Matrix Sizes | 2023-12-14 | Paper |
Mixed-Precision Orthogonalization Scheme and Adaptive Step Size for Improving the Stability and Performance of CA-GMRES on GPUs | 2022-12-09 | Paper |
Heterogenous Acceleration for Linear Algebra in Multi-coprocessor Environments | 2022-12-09 | Paper |
Accelerating Computation of Eigenvectors in the Dense Nonsymmetric Eigenvalue Problem | 2022-12-09 | Paper |
A Set of Batched Basic Linear Algebra Subprograms and LAPACK Routines | 2022-02-01 | Paper |
Mixed-precision iterative refinement using tensor cores on GPUs to accelerate solution of linear systems | 2021-10-29 | Paper |
Numerical algorithms for high-performance computational science | 2021-06-15 | Paper |
Improving the Performance of the GMRES Method using Mixed-Precision Techniques | 2020-11-03 | Paper |
Linear Systems Solvers for Distributed-Memory Machines with GPU Accelerators | 2020-07-20 | Paper |
The Singular Value Decomposition: Anatomy of Optimizing an Algorithm for Extreme Scale | 2018-11-12 | Paper |
Stability and Performance of Various Singular Value QR Implementations on Multicore CPU with a GPU | 2018-07-20 | Paper |
ParILUT---A New Parallel Threshold ILU Factorization | 2018-07-18 | Paper |
High-performance matrix-matrix multiplications of very small matrices | 2018-01-11 | Paper |
Highly Scalable Self-Healing Algorithms for High Performance Scientific Computing | 2017-08-08 | Paper |
Rectangular full packed format for Cholesky's algorithm | 2017-05-19 | Paper |
Updating incomplete factorization preconditioners for model order reduction | 2016-11-18 | Paper |
Linear algebra software for large-scale accelerated multicore computing | 2016-07-08 | Paper |
Accelerating Numerical Dense Linear Algebra Calculations with GPUs | 2015-07-03 | Paper |
Mixed-Precision Cholesky QR Factorization and Its Case Studies on Multicore CPU with Multiple GPUs | 2015-06-10 | Paper |
Communication-Avoiding Symmetric-Indefinite Factorization | 2015-04-21 | Paper |
Accelerating Linear System Solutions Using Randomization Techniques | 2014-09-12 | Paper |
Level-3 Cholesky Factorization Routines Improve Performance of Many Cholesky Algorithms | 2014-09-12 | Paper |
High-performance bidiagonal reduction using tile algorithms on homogeneous multicore architectures | 2014-09-12 | Paper |
Designing LU-QR hybrid solvers for performance and stability | 2014-01-21 | Paper |
Changes in Dense Linear Algebra Kernels: Decades-Long Perspective | 2013-09-26 | Paper |
Toward a High Performance Tile Divide and Conquer Algorithm for the Dense Symmetric Eigenvalue Problem | 2013-03-06 | Paper |
https://portal.mardi4nfdi.de/entity/Q3145773 | 2012-12-23 | Paper |
High-performance computing systems: Status and outlook | 2012-10-12 | Paper |
Divide and Conquer on Hybrid GPU-Accelerated Multicore Systems | 2012-08-23 | Paper |
High-performance high-resolution semi-Lagrangian tracer transport on a sphere | 2011-12-28 | Paper |
Computing the conditioning of the components of a linear least-squares solution | 2011-06-29 | Paper |
Accelerating GPU Kernels for Dense Linear Algebra | 2011-03-08 | Paper |
A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators | 2011-03-08 | Paper |
Towards an Efficient Tile Matrix Inversion of Symmetric Positive Definite Matrices on Multicore Architectures | 2011-03-08 | Paper |
Accelerating the reduction to upper Hessenberg, tridiagonal, and bidiagonal forms through hybrid GPU-based computing | 2010-11-26 | Paper |
Accelerating scientific computations with mixed precision algorithms | 2010-10-28 | Paper |
Towards dense linear algebra for hybrid GPU accelerated manycore systems | 2010-09-02 | Paper |
Revisiting Matrix Product on Master-Worker Platforms | 2009-02-26 | Paper |
Using Mixed Precision for Sparse Matrix Computations to Enhance the Performance while Achieving 64-bit Accuracy | 2008-12-21 | Paper |
State-of-the-art eigensolvers for electronic structure calculations of large scale nano-systems | 2008-07-29 | Paper |
The Problem with the Linpack Benchmark Matrix Generator | 2008-06-30 | Paper |
The use of bulk states to accelerate the band edge state calculation of a semiconductor quantum dot | 2007-05-23 | Paper |
Large-Scale Scientific Computing | 2006-11-21 | Paper |
Condition Numbers of Gaussian Random Matrices | 2006-05-31 | Paper |
Computational Science – ICCS 2004 | 2005-12-23 | Paper |
Computational Science – ICCS 2005 | 2005-11-30 | Paper |
Computational Science – ICCS 2005 | 2005-11-30 | Paper |
Euro-Par 2004 Parallel Processing | 2005-08-23 | Paper |
https://portal.mardi4nfdi.de/entity/Q3046367 | 2004-08-12 | Paper |
https://portal.mardi4nfdi.de/entity/Q4436930 | 2003-12-04 | Paper |
https://portal.mardi4nfdi.de/entity/Q4420596 | 2003-08-18 | Paper |
https://portal.mardi4nfdi.de/entity/Q4420604 | 2003-08-18 | Paper |
https://portal.mardi4nfdi.de/entity/Q4411983 | 2003-07-13 | Paper |
Automatic translation of Fortran to JVM bytecode | 2003-03-25 | Paper |
Key concepts for parallel out-of-core LU factorization | 2003-03-19 | Paper |
NetBuild: transparent cross‐platform access to computational software libraries | 2003-02-20 | Paper |
Innovations of the NetSolve Grid Computing System | 2003-02-20 | Paper |
The design and implementation of the parallel out-of-core ScaLAPACK LU, QR, and Cholesky factorization routines | 2003-02-04 | Paper |
Middleware for the use of storage in communication | 2003-01-21 | Paper |
HARNESS fault tolerant MPI design, usage and performance issues | 2003-01-21 | Paper |
https://portal.mardi4nfdi.de/entity/Q4787342 | 2003-01-06 | Paper |
https://portal.mardi4nfdi.de/entity/Q4784913 | 2002-12-12 | Paper |
https://portal.mardi4nfdi.de/entity/Q4552060 | 2002-08-28 | Paper |
Static tiling for heterogeneous computing platforms | 2002-07-25 | Paper |
Clusters and computational grids for scientific computing | 2002-07-14 | Paper |
Telescoping languages: A strategy for automatic generation of scientific problem-solving systems from annotated libraries | 2002-07-04 | Paper |
https://portal.mardi4nfdi.de/entity/Q2780668 | 2002-07-02 | Paper |
https://portal.mardi4nfdi.de/entity/Q4537068 | 2002-06-25 | Paper |
https://portal.mardi4nfdi.de/entity/Q2779333 | 2002-04-15 | Paper |
HARNESS and fault tolerant MPI | 2002-03-03 | Paper |
LAPACK95 Users' Guide | 2002-02-18 | Paper |
Automated empirical optimizations of software and the ATLAS project | 2001-08-20 | Paper |
Numerical linear algebra algorithms and software | 2000-12-19 | Paper |
https://portal.mardi4nfdi.de/entity/Q4942238 | 2000-07-20 | Paper |
https://portal.mardi4nfdi.de/entity/Q4938107 | 2000-06-25 | Paper |
https://portal.mardi4nfdi.de/entity/Q4945576 | 2000-03-23 | Paper |
https://portal.mardi4nfdi.de/entity/Q3147911 | 2000-01-01 | Paper |
A Parallel Divide and Conquer Algorithm for the Symmetric Eigenvalue Problem on Distributed Memory Architectures | 1999-11-24 | Paper |
Using agent-based software for scientific computing in the NetSolve system | 1999-01-12 | Paper |
Numerical Linear Algebra for High-Performance Computers | 1998-10-05 | Paper |
Key concepts for parallel out-of-core LU factorization | 1998-07-23 | Paper |
A set of level 3 basic linear algebra subprograms | 1998-03-23 | Paper |
Algorithm 679: A set of level 3 basic linear algebra subprograms: model implementation and test programs | 1998-03-23 | Paper |
Algorithm 710: FORTRAN subroutines for computing the eigenvalues and eigenvectors of a general matrix by reduction to general tridiagonal form | 1998-02-09 | Paper |
Software distribution using Xnetlib | 1998-01-26 | Paper |
Software Libraries for Linear Algebra Computations on High Performance Computers | 1997-11-02 | Paper |
Chebyshev tau-QZ algorithm methods for calculating spectra of hydrodynamic stability problems | 1997-08-14 | Paper |
A parallel algorithm for the reduction of a nonsymmetric matrix to block upper-Hessenberg form | 1997-02-28 | Paper |
Parallel matrix transpose algorithms on distributed memory concurrent computers | 1997-02-28 | Paper |
Algorithmic bombardment for the iterative solution of linear systems: A poly-iterative approach | 1997-01-07 | Paper |
https://portal.mardi4nfdi.de/entity/Q4891458 | 1996-09-05 | Paper |
https://portal.mardi4nfdi.de/entity/Q4860266 | 1996-06-17 | Paper |
The design of a parallel dense linear algebra software library: Reduction to Hessenberg, tridiagonal, and bidiagonal form | 1996-06-16 | Paper |
https://portal.mardi4nfdi.de/entity/Q4855980 | 1996-03-05 | Paper |
https://portal.mardi4nfdi.de/entity/Q4325973 | 1995-03-13 | Paper |
The PVM concurrent computing system: Evolution, experiences, and trends | 1995-01-29 | Paper |
https://portal.mardi4nfdi.de/entity/Q4288941 | 1994-12-04 | Paper |
https://portal.mardi4nfdi.de/entity/Q4288943 | 1994-10-09 | Paper |
https://portal.mardi4nfdi.de/entity/Q3139427 | 1993-11-15 | Paper |
A Parallel Algorithm for the Nonsymmetric Eigenvalue Problem | 1993-11-11 | Paper |
Reduction to condensed form for the eigenvalue problem on distributed memory architectures | 1993-01-17 | Paper |
https://portal.mardi4nfdi.de/entity/Q3997799 | 1992-09-17 | Paper |
Numerical Considerations in Computing Invariant Subspaces | 1992-06-28 | Paper |
https://portal.mardi4nfdi.de/entity/Q5750327 | 1990-01-01 | Paper |
Block reduction of matrices to condensed forms for eigenvalue computations | 1989-01-01 | Paper |
Programming methodology and performance issues for advanced computer architectures | 1988-01-01 | Paper |
Tools to aid in the analysis of memory access patterns for FORTRAN programs | 1988-01-01 | Paper |
https://portal.mardi4nfdi.de/entity/Q3482749 | 1988-01-01 | Paper |
An extended set of FORTRAN basic linear algebra subprograms | 1988-01-01 | Paper |
Algorithm 656: an extended set of basic linear algebra subprograms: model implementation and test programs | 1988-01-01 | Paper |
https://portal.mardi4nfdi.de/entity/Q3793601 | 1988-01-01 | Paper |
Corrigenda: “An Extended Set of FORTRAN Basic Linear Algebra Subprograms” | 1988-01-01 | Paper |
Solving banded systems on a parallel processor | 1987-01-01 | Paper |
A Fully Parallel Algorithm for the Symmetric Eigenvalue Problem | 1987-01-01 | Paper |
https://portal.mardi4nfdi.de/entity/Q3819886 | 1987-01-01 | Paper |
Squeezing the most out of eigenvalue solvers on high-performance computers | 1986-01-01 | Paper |
Implementation of some concurrent algorithms for matrix factorization | 1986-01-01 | Paper |
Linear algebra on high performance computers | 1986-01-01 | Paper |
Implementing Dense Linear Algebra Algorithms Using Multitasking on the CRAY X-MP-4 (or Approaching the Gigaflop) | 1986-01-01 | Paper |
On some parallel banded system solvers | 1985-01-01 | Paper |
A collection of parallel linear equations routines for the Denelcor HEP | 1984-01-01 | Paper |
https://portal.mardi4nfdi.de/entity/Q3217518 | 1984-01-01 | Paper |
Improving the Accuracy of Computed Singular Values | 1983-01-01 | Paper |
Improving the Accuracy of Computed Eigenvalues and Eigenvectors | 1983-01-01 | Paper |
Algorithm 589: SICEDR: A FORTRAN Subroutine for Improving the Accuracy of Computed Matrix Eigenvalues | 1982-01-01 | Paper |
https://portal.mardi4nfdi.de/entity/Q3932291 | 1980-01-01 | Paper |
Unrolling loops in FORTRAN | 1979-01-01 | Paper |
Matrix eigensystem routines. EISPACK guide extension | 1977-01-01 | Paper |
A hybrid Hermitian general eigenvalue solver | 0001-01-03 | Paper |