The following pages link to CUBLAS (Q18949):
Displaying 50 items.
- Efficient and accurate algorithms for computing matrix trigonometric functions (Q313622) (← links)
- Performance models and workload distribution algorithms for optimizing a hybrid CPU-GPU multifrontal solver (Q316653) (← links)
- Updating incomplete factorization preconditioners for model order reduction (Q342863) (← links)
- Acceleration of early-photon fluorescence molecular tomography with graphics processing units (Q382551) (← links)
- On the GPGPU parallelization issues of finite element approximate inverse preconditioning (Q645719) (← links)
- Efficient GPU-based implementations of simplex type algorithms (Q902763) (← links)
- Graphics processing units and high-dimensional optimization (Q906530) (← links)
- Discrete particle swarm optimization for constructing uniform design on irregular regions (Q1623418) (← links)
- MPI-CUDA sparse matrix-vector multiplication for the conjugate gradient method with an approximate inverse preconditioner (Q1641244) (← links)
- GPU accelerated computational homogenization based on a variational approach in a reduced basis framework (Q1667319) (← links)
- A parallel computing method using blocked format with optimal partitioning for SpMV on GPU (Q1678174) (← links)
- GPU accelerated intensities MPI (GAIN-MPI): a new method of computing Einstein-\(A\) coefficients (Q1685835) (← links)
- Redesigning triangular dense matrix computations on GPUs (Q1693226) (← links)
- Efficient determination of the Markovian time-evolution towards a steady-state of a complex open quantum system (Q1737434) (← links)
- Cucheb: a GPU implementation of the filtered Lanczos procedure (Q1737460) (← links)
- GPU-accelerated algorithms for many-particle continuous-time quantum walks (Q1739618) (← links)
- A new efficient and accurate spline algorithm for the matrix exponential computation (Q1747317) (← links)
- An efficient and accurate algorithm for computing the matrix cosine based on new Hermite approximations (Q1757341) (← links)
- GPGPU-based parallel computing applied in the FEM using the conjugate gradient algorithm: a review (Q1789172) (← links)
- Higher order finite elements in space and time for anisotropic simulations with variational integrators. Application of an efficient GPU implementation (Q1997921) (← links)
- GPU optimization of large-scale eigenvalue solver (Q2008665) (← links)
- Efficient \(L_0\) resampling of point sets (Q2010365) (← links)
- High-performance statistical computing in the computing environments of the 2020s (Q2092893) (← links)
- HPMaX: heterogeneous parallel matrix multiplication using CPUs and GPUs (Q2212501) (← links)
- Parallel reduction of four matrices to condensed form for a generalized matrix eigenvalue algorithm (Q2219440) (← links)
- A GPU application for high-order compact finite difference scheme (Q2249551) (← links)
- GPU-based block-wise nonlocal means denoising for 3D ultrasound images (Q2262375) (← links)
- 3D data denoising via nonlocal means filter by using parallel GPU strategies (Q2262528) (← links)
- A heterogeneous parallel LU factorization algorithm based on a basic column block uniform allocation strategy (Q2298336) (← links)
- Fast and robust flow simulations in discrete fracture networks with GPGPUs (Q2320411) (← links)
- Fast Taylor polynomial evaluation for the computation of the matrix cosine (Q2423580) (← links)
- Rapid re-meshing and re-solution of three-dimensional boundary element problems for interactive stress analysis (Q2520193) (← links)
- Compressed hierarchical Schur algorithm for frequency-domain analysis of photonic structures (Q2631769) (← links)
- New Hermite series expansion for computing the matrix hyperbolic cosine (Q2668029) (← links)
- Optimal size of the block in block GMRES on GPUs: computational model and experiments (Q2679654) (← links)
- Solving time-fractional reaction-diffusion systems through a tensor-based parallel algorithm (Q2683288) (← links)
- A \(\mu\)-mode BLAS approach for multidimensional tensor-structured problems (Q2691912) (← links)
- Development of a parallel CUDA algorithm for solving 3D guiding center problems (Q2695579) (← links)
- Highly efficient GPU eigensolver for three-dimensional photonic crystal band structures with any Bravais lattice (Q2696559) (← links)
- SCELib4.0: the new program version for computing molecular properties in the single center approach (Q2698820) (← links)
- Accelerated Dimension-Independent Adaptive Metropolis (Q2830629) (← links)
- A Fast Dense Triangular Solve in CUDA (Q2847757) (← links)
- Parallel and Heterogeneous \(m\)-Hessenberg-Triangular-Triangular Reduction (Q2954487) (← links)
- A Runtime System for Programming Out-of-Core Matrix Algorithms-by-Tiles on Multithreaded Architectures (Q2989167) (← links)
- Performance and Numerical Accuracy Evaluation of Heterogeneous Multicore Systems for Krylov Orthogonal Basis Computation (Q3081341) (← links)
- An Error Correction Solver for Linear Systems: Evaluation of Mixed Precision Implementations (Q3081343) (← links)
- Accelerating GPU Kernels for Dense Linear Algebra (Q3081346) (← links)
- A Scalable High Performant Cholesky Factorization for Multicore with GPU Accelerators (Q3081347) (← links)
- Accelerating the Explicitly Restarted Arnoldi Method with GPUs Using an Autotuned Matrix Vector Product (Q3116472) (← links)
- Exposing Fine-Grained Parallelism in Algebraic Multigrid Methods (Q3165439) (← links)