Communication-optimal parallel and sequential Cholesky decomposition

From MaRDI portal
Publication:5200261

DOI: 10.1137/090760969 · zbMATH Open: 1238.65018 · arXiv: 0902.2537 · OpenAlex: W3103849684 · MaRDI QID: Q5200261 · FDO: Q5200261


Authors: Grey Ballard, Oded Schwartz, James Demmel, Olga Holtz


Publication date: 1 August 2011

Published in: SIAM Journal on Scientific Computing

Abstract: Numerical algorithms have two kinds of costs: arithmetic and communication, by which we mean either moving data between levels of a memory hierarchy (in the sequential case) or over a network connecting processors (in the parallel case). Communication costs often dominate arithmetic costs, so it is of interest to design algorithms minimizing communication. In this paper we first extend known lower bounds on the communication cost (both for bandwidth and for latency) of conventional O(n^3) matrix multiplication to Cholesky factorization, which is used for solving dense symmetric positive definite linear systems. Second, we compare the costs of various Cholesky decomposition implementations to these lower bounds and identify the algorithms and data structures that attain them. In the sequential case, we consider both the two-level and hierarchical memory models. Combined with prior results in [13, 14, 15], this gives a set of communication-optimal algorithms for O(n^3) implementations of the three basic factorizations of dense linear algebra: LU with pivoting, QR, and Cholesky. But it goes beyond this prior work on sequential LU by optimizing communication for any number of levels of memory hierarchy.
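The kind of blocking the abstract refers to, which lets each panel of the matrix stay in fast memory and thereby reduces data movement, can be illustrated with a minimal right-looking blocked Cholesky sketch. The function name `blocked_cholesky` and the block size `b` are illustrative choices, not taken from the paper, and this sketch makes no attempt to reproduce the paper's communication-optimal data structures:

```python
import numpy as np

def blocked_cholesky(A, b=2):
    """Right-looking blocked Cholesky sketch: returns lower-triangular L
    with A = L @ L.T. Working on b-by-b blocks is what allows a panel to
    stay resident in fast memory, reducing communication (illustration only)."""
    n = A.shape[0]
    L = np.tril(A.copy())  # only the lower triangle is ever read
    for k in range(0, n, b):
        ke = min(k + b, n)
        # Factor the diagonal block with an unblocked Cholesky.
        L[k:ke, k:ke] = np.linalg.cholesky(L[k:ke, k:ke])
        if ke < n:
            # Triangular solve for the panel below the diagonal block:
            # L_panel <- L_panel @ inv(L_kk).T
            L[ke:, k:ke] = np.linalg.solve(L[k:ke, k:ke], L[ke:, k:ke].T).T
            # Symmetric rank-b update of the trailing submatrix
            # (only its lower triangle is kept).
            L[ke:, ke:] -= np.tril(L[ke:, k:ke] @ L[ke:, k:ke].T)
    return L
```

A quick usage check against NumPy's built-in factorization, on a matrix made symmetric positive definite by construction:

```python
rng = np.random.default_rng(0)
M = rng.standard_normal((6, 6))
A = M @ M.T + 6 * np.eye(6)
L = blocked_cholesky(A, b=2)
assert np.allclose(L @ L.T, A)
```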


Full work available at URL: https://arxiv.org/abs/0902.2537




Cited In (11)


