NAS Parallel Benchmarks
swMATH: 8853 · MaRDI QID: Q20852 · FDO: Q20852
Official website: https://www.nas.nasa.gov/software/npb.html
The NAS Parallel Benchmarks (NPB) are a small set of programs designed to help evaluate the performance of parallel supercomputers. The benchmarks are derived from computational fluid dynamics (CFD) applications and, in the original "pencil-and-paper" specification (NPB 1), consist of five kernels and three pseudo-applications. The suite has since been extended with benchmarks for unstructured adaptive meshes, parallel I/O, multi-zone applications, and computational grids. Problem sizes are predefined and grouped into classes. Reference implementations are available in commonly used programming models such as MPI and OpenMP (NPB 2 and NPB 3).
Cited In (72)
- Performance modeling of hybrid MPI/OpenMP scientific applications on large-scale multicore supercomputers
- Self-similarity of parallel machines
- Topology-aware strategy for MPI-IO operations in clusters
- Experience in using SIMD and MIMD parallelism for computational fluid dynamics
- Performance evaluation of mixed-mode OpenMP/MPI implementations
- A parallel finite element method for the analysis of crystalline solids
- Parallel iterative solvers for unstructured grids using a directive/MPI hybrid programming model for the GeoFEM platform on SMP cluster architectures
- Title not available
- Parallel simulation of electron-solid interactions for electron microscopy modeling
- Capturing and analyzing the execution control flow of OpenMP applications
- MPI correctness checking for OpenMP/MPI applications
- Design and performance of a scheduling framework for resizable parallel applications
- A speculative and adaptive MPI rendezvous protocol over RDMA-enabled interconnects
- Interconnection network simulation using traces of MPI applications
- LogGPO: an accurate communication model for performance prediction of MPI programs
- Circular-arc graph coloring: On chords and circuits in the meeting graph
- BSP2OMP: a compiler for translating BSP programs to OpenMP
- Algorithms for the parallel alternating direction access machine
- Implementation of parallel plasma particle-in-cell codes on PC cluster
- VXDL: Virtual Resources and Interconnection Networks Description Language
- A proposal for error handling in OpenMP
- Parallelization and optimization of Mfold on shared memory system
- A two-stage hardware scheduler combining greedy and optimal scheduling
- Algorithm-system scalability of heterogeneous computing
- Parallelization of a multiblock flow code: An engineering implementation
- Using cost to control instrumentation overhead
- Computational fluid dynamics applications on parallel-vector computers: Computations of stirred vessel flows
- Model-based fault localization: finding behavioral outliers in large-scale computing systems
- Unstructured adaptive meshes: Bad for your memory?
- A parallelized ENO procedure for direct numerical simulation of compressible turbulence
- A detailed analysis of communication load balance on BlueGene supercomputer
- HPF/JA: extensions of High Performance Fortran for accelerating real‐world applications
- MPI-CHECK: a tool for checking Fortran 90 MPI programs
- High-scalability parallelization of a molecular modeling application: Performance and productivity comparison between OpenMP and MPI implementations
- Reducing division latency with reciprocal caches
- Parallel CFD benchmarks on Cray computers
- SAC -- a functional array language for efficient multi-threaded execution
- Improved upper bounds for online malleable job scheduling
- Online scheduling of malleable parallel jobs with setup times on two identical machines
- Code modernization strategies to 3-D stencil-based applications on Intel Xeon Phi: KNC and KNL
- Title not available
- Dynamic data prefetching in home-based software DSMs
- An unsteady incompressible Navier-Stokes solver for large eddy simulation of turbulent flows
- Design and implementation of an agent home scheme strategy for prefetch-based DSM systems
- Porting and performance evaluation of irregular codes using OpenMP
- Direct and inverse problems of high-viscosity fluid dynamics
- Performance characteristics of the multi-zone NAS parallel benchmarks
- Online malleable job scheduling for \(m\leq 3\)
- An object-oriented parallel programming language for distributed-memory parallel computing platforms
- Performance evaluation of a multi-zone application in different OpenMP approaches
- Performance advantage of reconfigurable cache design on multicore processor systems
- Supporting OpenMP on Cell
- Parallel benchmarks of turbulence in complex geometries
- Comments on PVPs, MPPs, NOWS, and future computer architectures
- Deadlock detection in MPI programs
- Adaptive execution techniques of parallel programs for multiprocessors
- Failure-aware resource management for high-availability computing clusters with distributed virtual machines
- A session key caching and prefetching scheme for secure communication in cluster systems
- Caching in with multigrid algorithms: problems in two dimensions
- Comments on PVPs, MPPs, NOWS, and future computer architectures
- Advanced optimization strategies in the Rice dHPF compiler
- Efficient communication using message prediction for clusters of multiprocessors
- Implementation and evaluation of HPF/SX V2
- Redistribution strategies for portable parallel FFT: a case study
- Data optimizations for constraint automata
- Implementation and evaluation of a communication intensive application on the EARTH multithreaded system
- Parallel 3D mortar element method for adaptive nonconforming meshes
- Title not available
- Title not available
- Techniques for compiling and implementing all NAS parallel benchmarks in HPF
- VPP Fortran and the design of HPF/JA extensions
- Title not available