Automated empirical optimizations of software and the ATLAS project
Publication:5940972
DOI: 10.1016/S0167-8191(00)00087-9
zbMath: 0971.68033
OpenAlex: W1964031104
Wikidata: Q29392259
Scholia: Q29392259
MaRDI QID: Q5940972
Jack J. Dongarra, R. Clint Whaley, Antoine Petitet
Publication date: 20 August 2001
Published in: Parallel Computing
Full work available at URL: https://doi.org/10.1016/s0167-8191(00)00087-9
Related Items
A Cache-Oblivious Sparse Matrix–Vector Multiplication Scheme Based on the Hilbert Curve
Finding graph embeddings by incremental low-rank semidefinite programming
The shifted number system for fast linear algebra on integer matrices
On the accuracy of finite-difference solutions for nonlinear water waves
Fast Parallel Algorithm for Polynomial Evaluation
The aggregation and cancellation techniques as a practical tool for faster matrix multiplication
An efficient parallel and fully implicit algorithm for the simulation of transient free-surface flows of multimode viscoelastic liquids
Burnett spectral method for the spatially homogeneous Boltzmann equation
The cache-oblivious Gaussian elimination paradigm: Theoretical framework, parallelization and Experimental evaluation
Floating-point arithmetic
Automated tuning for the parameters of linear solvers
Hyperparameter autotuning of programs with HybridTuner
Analytical modeling of matrix–vector multiplication on multicore processors
Computer algebra systems - new strategies and techniques
A methodology pruning the search space of six compiler transformations by addressing them together as one problem and by exploiting the hardware architecture details
Two-stage least squares and indirect least squares algorithms for simultaneous equations models
Parallel direct methods for solving the system of linear equations with pipelining on a multicore using OpenMP
Implementing High-Performance Complex Matrix Multiplication via the 1M Method
A note on probabilistic models over strings: the linear algebra approach
Error-free transformations of matrix multiplication by using fast routines of matrix multiplication and its applications
A generalized approach to linear transform approximations with applications to the discrete cosine transform
Optimization of algorithms with OPAL
A CUDA-based implementation of an improved SPH method on GPU
Benchmarking in data envelopment analysis: an approach based on genetic algorithms and parallel programming
Multi-stage programming with functors and monads: eliminating abstraction overhead from generic code
Optimization techniques for small matrix multiplication
Fast inclusion of interval matrix multiplication
Geostatistical hierarchical model for temporally integrated radon measurements
Newton-Krylov continuation of periodic orbits for Navier-Stokes flows
Exploiting semidefinite relaxations in constraint programming
Parallel direct Poisson solver for discretisations with one Fourier diagonalisable direction
A new numerical method for nonlocal electrostatics in biomolecular simulations
Quantum Monte Carlo on graphical processing units
Cache oblivious matrix multiplication using an element ordering based on a Peano curve
Nodal discontinuous Galerkin methods on graphics processors
A method for the automatic selection and tuning of the parameters of a sparse SLE solver
The mimetic methods toolkit: an object-oriented API for mimetic finite differences
Performance and Numerical Accuracy Evaluation of Heterogeneous Multicore Systems for Krylov Orthogonal Basis Computation
ATLAS
On the effect of linear algebra implementations in real-time multibody system dynamics
Performance-Based Numerical Solver Selection in the Lighthouse Framework
Optimal block-tridiagonalization of matrices for coherent charge transport
Numerical algorithms for high-performance computational science
Analytical Modeling Is Enough for High-Performance BLIS
Uses Software