swMATH 4898, MaRDI QID Q17049, FDO Q17049
Author name not available
Official website: http://www.icsi.berkeley.edu/~bilmes/phipac/
Cited In (69)
- GPTune
- A framework for high-performance matrix multiplication based on hierarchical abstractions, algorithms and optimized low-level kernels
- Title not available
- Title not available
- Recursive Blocked Algorithms and Hybrid Data Structures for Dense Matrix Library Software
- An Asynchronous Parallel Supernodal Algorithm for Sparse Gaussian Elimination
- A recursive formulation of Cholesky factorization of a matrix in packed storage
- Title not available
- Combined selection of tile sizes and unroll factors using iterative compilation
- Title not available
- Design, implementation and testing of extended and mixed precision BLAS
- Lowest common ancestors in trees and directed acyclic graphs
- BLIS: a framework for rapidly instantiating BLAS functionality
- Title not available
- Title not available
- Title not available
- Title not available
- Title not available
- Title not available
- Title not available
- Title not available
- A Supernodal Approach to Sparse Partial Pivoting
- ScaLAPACK Users' Guide
- Cache optimization for structured and unstructured grid multigrid
- Large-Scale Scientific Computing
- Title not available
- Optimizing locality and scalability of embedded Runge-Kutta solvers using block-based pipelining
- Computational Science - ICCS 2004
- Emmerald: a fast matrix–matrix multiply using Intel's SSE instructions
- FLAME
- Towards performance evaluation of high-performance computing on multiple Java platforms
- PSPASES
- Title not available
- PAPI
- OSKI
- Finding least common ancestors in directed acyclic graphs
- Formal derivation of algorithms
- PUMMA
- DynTile
- BLIS
- PLuTo
- Algorithm 679
- GotoBLAS
- BLISlab
- Emmerald
- OProfile
- Algorithm 784
- Algorithm 656
- SuperMatrix
- ADAPT
- ESSL
- PMLP
- Adaptive Winograd's matrix multiplications
- Communication lower bounds for distributed-memory matrix multiplication
- Reliable generation of high-performance matrix algebra
- Optimization of algorithms with OPAL
- When cache blocking of sparse matrix vector multiply works and why
- Automated empirical optimizations of software and the ATLAS project
- Analytical modeling is enough for high-performance BLIS
- Title not available
- Title not available
- Title not available
- Title not available
- An efficient time-step-based self-adaptive algorithm for predictor-corrector methods of Runge-Kutta type
- rchol
- Accurate Symmetric Indefinite Linear Equation Solvers
- A methodology for speeding up loop kernels by exploiting the software information and the memory architecture
- Cache-aware multigrid methods for solving Poisson's equation in two dimensions
- Distribution of a class of divide and conquer recurrences arising from the computation of the Walsh-Hadamard transform
This page was built for software: PHiPAC