MKL
From MaRDI portal
Cited in
- Multi-core CPUs, clusters, and grid computing: A tutorial
- Forward stable eigenvalue decomposition of rank-one modifications of diagonal matrices
- Solving the Faddeev-Merkuriev equations in total orbital momentum representation via spline collocation and tensor product preconditioning
- Manycore algorithms for batch scalar and block tridiagonal solvers
- A multishift, multipole rational QZ method with aggressive early deflation
- Acceleration of three-dimensional tokamak magnetohydrodynamical code with graphics processing unit and OpenACC heterogeneous parallel programming
- BiqBin: moving boundaries for NP-hard problems by HPC
- AMPS: real-time mesh cutting with augmented matrices for surgical simulations.
- A hybrid high-order method for flow simulations in discrete fracture networks
- Generation of large finite-element matrices on multiple graphics processors
- Tests with FALKSOL. A massively parallel multi-level domain decomposing direct solver
- Efficient alternating least squares algorithms for low multilinear rank approximation of tensors
- Employing AVX vectorization to improve the performance of random number generators
- HexGen and Hex2Spline: polycube-based hexahedral mesh generation and spline modeling for isogeometric analysis applications in LS-DYNA
- Multi-preconditioned domain decomposition methods in the Krylov subspaces
- Combined co-rotational beam/shell elements for fluid-structure interaction analysis of insect-like flapping wing
- JuSFEM: a Julia-based open-source package of parallel smoothed finite element method (S-FEM) for elastic problems
- PTEBEM for wave drift forces based on hydrodynamic pressure integration
- Implementing High-performance Complex Matrix Multiplication via the 3m and 4m Methods
- LPSE: a 3-D wave-based model of cross-beam energy transfer in laser-irradiated plasmas
- Scientific computations on multi-core systems using different programming frameworks
- Towards an efficient use of the BLAS library for multilinear tensor contractions
- Load balance and parallel I/O: optimising COSA for large simulations
- Numerical benchmarking of fluid-rigid body interactions
- FFT, FMM, or multigrid? A comparative study of state-of-the-art Poisson solvers for uniform and nonuniform grids in the unit cube
- BLIS: a framework for rapidly instantiating BLAS functionality
- An approach for large-scale gyroscopic eigenvalue problems with application to high-frequency response of rolling tires
- A high order discontinuous Galerkin-Fourier incompressible 3D Navier-Stokes solver with rotating sliding meshes
- A computational investigation of a model of single-crystal gradient thermoplasticity that accounts for the stored energy of cold work and thermal annealing
- Toward a high performance tile divide and conquer algorithm for the dense symmetric eigenvalue problem
- Aerodynamic force evaluation for ice shedding phenomenon using vortex in cell scheme, penalisation and level set approaches
- Solving random ordinary differential equations on GPU clusters using multiple levels of parallelism
- Using Nesterov's method to accelerate multibody dynamics with friction and contact
- Adaptive FETI-DP and BDDC methods with a generalized transformation of basis for heterogeneous problems
- vibro-Lanczos, a symmetric Lanczos solver for vibro-acoustic simulations
- Partitioning and reordering for spike-based distributed-memory parallel Gauss-Seidel
- An efficient hybrid tridiagonal divide-and-conquer algorithm on distributed memory architectures
- Algorithms for efficient reproducible floating point summation
- Efficient algorithm for proper orthogonal decomposition of block-structured adaptively refined numerical simulations
- Iterative representing set selection for nested cross approximation.
- Vector and multithread computation of silencer performance prediction on a dual-processor PC workstation
- On the usage of tetrahedral background cells in nodal integration of RPIM for 3D elasto-static problems
- An immersed \(CR\)-\(P_0\) element for Stokes interface problems and the optimal convergence analysis
- Algorithm 1026: Concurrent Alternating Least Squares for Multiple Simultaneous Canonical Polyadic Decompositions
- Design of a high-performance GEMM-like tensor-tensor multiplication
- A parallel computing method using blocked format with optimal partitioning for SpMV on GPU
- An ellipsoidal bounding scheme for the quasi-clique number of a graph
- Robust viscous-inviscid interaction scheme for application on unstructured meshes
- Mathematical substantiation of pulsed electromagnetic soundings for new problems of petroleum geophysics
- Quantum circuits synthesis using Householder transformations
- The BLAS API of BLASFEO: optimizing performance for small matrices
- On BLAS Level-3 Implementations of Common Solvers for (Quasi-) Triangular Generalized Lyapunov Equations
- Improved convex and concave relaxations of composite bilinear forms
- Numerical methods and parallel algorithms for computation of periodic responses of plates
- A decomposition method with minimum communication amount for parallelization of multi-dimensional FFTs
- Performance models and workload distribution algorithms for optimizing a hybrid CPU-GPU multifrontal solver
- Stress-aware large-scale mesh editing using a domain-decomposed multigrid solver
- Improved hyper-reduction approach for the forced vibration analysis of rotating components
- Numerical solution of 3D exterior unsteady wave propagation problems using boundary operators
- Two-stage least squares and indirect least squares algorithms for simultaneous equations models
- A dissection solver with kernel detection for symmetric finite element matrices on shared memory computers
- An interior penalty stabilised incompressible discontinuous Galerkin-Fourier solver for implicit large eddy simulations
- Regularized symmetric positive definite matrix factorizations for linear systems arising from RBF interpolation and differentiation
- Accelerated dimension-independent adaptive metropolis
- Performance of the low-rank TT-SVD for large dense tensors on modern multicore CPUs
- Domain decomposition methods for 3D crack propagation problems using XFEM
- Iterative solver for systems of linear equations with a sparse stiffness matrix for clusters
- PARFES: A method for solving finite element linear equations on multi-core computers
- ATLAS
- BCYCLIC
- FLAME
- LAPACK
- PARDISO
- SPIRAL
- CartaBlanca
- McCormick
- SDPHA
- TOPOS
- SPIKE
- BLAS
- CUDA
- SCASY
- AIM@SHAPE
- EinSum
- Algorithm 844
- Algorithm 830
- TINKER
- SUMMA
- ARMCI
- SuiteSparse
- PHiPAC
- Eigen
- Armadillo
- BLIS
- geoRglm
- ViennaCL
- libflame
- RScaLAPACK
- CUBLAS
- MCSTL