PAPI
From MaRDI portal
swMATH: 5951, MaRDI QID: Q18089, FDO: Q18089
Author name not available
Official website: http://icl.cs.utk.edu/papi/
Cited In (65)
- Instrumentation database system for performance analysis of parallel scientific applications
- Scalarization using loop alignment and loop skewing
- Title not available
- Capturing and analyzing the execution control flow of OpenMP applications
- An optimized sparse approximate matrix multiply for matrices with decay
- Engineering a combinatorial Laplacian solver: lessons learned
- Cube
- Fine-grained multithreading for the multifrontal \(QR\) factorization of sparse matrices
- Instruction-throughput regulation in computer processors with data-center applications
- Implementation and evaluation of global and partitioned scheduling in a real-time OS
- Towards an accurate performance modeling of parallel sparse factorization
- Performance comparison of HPX versus traditional parallelization strategies for the discontinuous Galerkin method
- PySPH: A Python-based Framework for Smoothed Particle Hydrodynamics
- Detecting secondary bottlenecks in parallel quantum chemistry applications using MPI
- A distributed and incremental SVD algorithm for agglomerative data analysis on large networks
- PHiPAC
- OSKI
- New fast divide-and-conquer algorithms for the symmetric tridiagonal eigenvalue problem
- PARSEC
- TAU
- HPCTOOLKIT
- DynTile
- Feather-Trace
- Mesquite
- Parallel simulations of dynamic fracture using extrinsic cohesive elements
- Scalasca
- AQuoSA
- STREAM2
- SimpleScalar
- Paralution
- hwloc
- Semi-stencil
- ALPBench
- ParVec
- Rodinia
- crs
- TASCEL
- Code modernization strategies to 3-D stencil-based applications on Intel Xeon Phi: KNC and KNL
- OProfile
- CLTune
- RichardsFOAM
- Green
- GraphBIG
- XSadd
- gprof
- ParSWMS
- ADAPT
- SAGE
- mARGOt: A Dynamic Autotuning Framework for Self-Aware Approximate Computing
- LibGeoDecomp
- TOUGH2-MP
- Performance modeling of serial and parallel implementations of the fractional Adams-Bashforth-Moulton method
- When cache blocking of sparse matrix vector multiply works and why
- ParVec: vectorizing the PARSEC benchmark suite
- Data page layouts for relational databases on deep memory hierarchies
- An efficient time-step-based self-adaptive algorithm for predictor-corrector methods of Runge-Kutta type
- High order finite volume methods on wavelet-adapted grids with local time-stepping on multicore architectures for the simulation of shock-bubble interactions
- Compyle
- PyZoltan
- Optimized code generation for finite element local assembly using symbolic manipulation
- Algorithm 942
- Improving resource-unaware SAT solvers
- Enhancing speed and scalability of the ParFlow simulation code
- Performance comparison and workload analysis of mesh untangling and smoothing algorithms
- SCALEA: a performance analysis tool for parallel programs