Batched computation of the singular value decompositions of order two by the AVX-512 vectorization

Publication:5087080

DOI: 10.1142/S0129626420500152; zbMATH Open: 1490.65063; arXiv: 2005.07403; OpenAlex: W3118611730; MaRDI QID: Q5087080; FDO: Q5087080


Authors: Vedran Novaković


Publication date: 8 July 2022

Published in: Parallel Processing Letters

Abstract: In this paper a vectorized algorithm for simultaneously computing up to eight singular value decompositions (SVDs, each of the form $A=U\Sigma V^\ast$) of real or complex matrices of order two is proposed. The algorithm extends to a batch of matrices of an arbitrary length $n$, that arises, for example, in the annihilation part of the parallel Kogbetliantz algorithm for the SVD of a square matrix of order $2n$. The SVD algorithm for a single matrix of order two is derived first. It scales, in most instances error-free, the input matrix $A$ such that its singular values $\Sigma_{ii}$ cannot overflow whenever its elements are finite, and then computes the URV factorization of the scaled matrix, followed by the SVD of a non-negative upper-triangular middle factor. A vector-friendly data layout for the batch is then introduced, where the same-indexed elements of each of the input and the output matrices form vectors, and the algorithm's steps over such vectors are described. The vectorized approach is then shown to be about three times faster than processing each matrix in isolation, while slightly improving accuracy over the straightforward method for the $2\times 2$ SVD.
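
To make the data layout and the scaling idea from the abstract concrete, here is a minimal sketch in plain Python, restricted to real matrices and to singular values only. It is not the paper's algorithm: the URV factorization step is replaced by a well-known closed-form expression for the singular values of a real 2×2 matrix, and an ordinary loop stands in for the AVX-512 vector lanes. The function name `batched_singular_values_2x2` and its return convention (scaled singular values plus a per-matrix power-of-two exponent) are assumptions made for illustration, and all inputs are assumed finite.

```python
import math

def batched_singular_values_2x2(a11, a12, a21, a22):
    """Singular values of a batch of real 2x2 matrices (illustrative sketch).

    Layout: each argument is a list holding the same-indexed element of
    every matrix in the batch, so matrix k is [[a11[k], a12[k]],
    [a21[k], a22[k]]].

    Returns three lists (smax, smin, exps). The singular values of matrix k
    are smax[k] * 2**exps[k] and smin[k] * 2**exps[k]; they are left in
    scaled form so that they cannot overflow for finite inputs.
    """
    smax, smin, exps = [], [], []
    for p, q, r, s in zip(a11, a12, a21, a22):
        m = max(abs(p), abs(q), abs(r), abs(s))
        if m == 0.0:
            # Zero matrix: both singular values are zero, no scaling needed.
            smax.append(0.0)
            smin.append(0.0)
            exps.append(0)
            continue
        # m = f * 2**e with 0.5 <= f < 1, so dividing by 2**e brings every
        # element below 1 in magnitude. Power-of-two scaling is exact unless
        # a much smaller element underflows.
        e = math.frexp(m)[1]
        p, q, r, s = (math.ldexp(x, -e) for x in (p, q, r, s))
        # Closed form (a substitute for the paper's URV-based step):
        # sigma_max = (h1 + h2) / 2, sigma_min = |h1 - h2| / 2.
        h1 = math.hypot(p + s, r - q)
        h2 = math.hypot(p - s, r + q)
        smax.append((h1 + h2) / 2)
        smin.append(abs(h1 - h2) / 2)
        exps.append(e)
    return smax, smin, exps
```

For example, calling the function with `a11=[3.0], a12=[0.0], a21=[0.0], a22=[4.0]` describes a batch holding the single matrix diag(3, 4). Keeping the exponent separate means that a matrix whose elements are all near the largest finite double still yields finite scaled results, even though its largest singular value is not representable.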


Full work available at URL: https://arxiv.org/abs/2005.07403








