Efficient parallel factorization and solution of structured and unstructured linear systems (Q2486566)

From MaRDI portal
scientific article
Language Label Description Also known as
English
Efficient parallel factorization and solution of structured and unstructured linear systems
scientific article

    Statements

    Efficient parallel factorization and solution of structured and unstructured linear systems (English)
    0 references
    5 August 2005
    0 references
    The paper provides efficient parallel algorithms for exactly factoring some classes of \(n\)-dimensional symmetric positive definite matrices (SPD) in time \(O(\text{log}^2\,\,n)\) with a near optimal number of processors. A parallel random access machine (PRAM) model of parallel computation with unit cost arithmetic operations including division over a finite field is supposed. Prior work did not generally require unit cost division over a finite field. The considered input matrices have entries that are rational numbers given as a ratio of integers with at most a polynomial number of bits \(\beta\). Only bit precision \(O(n(\beta +\text{log n}))\) is required. It is the asymptotically optimal bit precision for \(\beta\geq\text{log}\,\,n\) since the determinant, exact \(LU\) factorization, and matrix inverse require bit precision at least \(\Omega (n\beta)\). The algorithms are randomized. They give outputs within the stated bounds with high likelihood \(\geq 1-{1 \over n^{\Omega (1)}}\) using a constant number of random variables ranging over a domain of size \((n\| A\| )^{O(1)}\). Recursive factorization (RF) of SPD matrices is computed using the Newton's iteration, the Newton-Hensel lifting, and the variable diagonal technique. These techniques are extended into the multilevel pipelined framework using the generalization of the stream contraction method by \textit{V. Pan} and \textit{J. Reif} [Inf. Process. Lett. 40, 79--83 (1991; Zbl 0748.68025)]. \(LU\) and \(QR\) factorizations for dense matrices and \(LU\) factorizations for sparse matrices which are \(s(n)\)-separable are presented. They reduce the known parallel time bounds from \(\Omega (\log^3 n)\) to \(O(\text{log}^2n)\) without increase of processors. The algorithms are further specialized to structured matrices. \(LU\) factorizations for Toeplitz matrices and matrices of bounded displacement rank in time \(O(\text{log}^2\,\,n)\) using \(P(n)\) processors are developed. Here \(P(n)\) denotes the number of arithmetic processors used to multiply two polynomials of degree \(n\) in \(O(\text{log}\,n)\) parallel time. Thus the processor reduction from \(n^2\) to \((P(n)\) is achieved. These results are applied with the same parallel time and processor bound to solve the problems of polynomial resultant, Padé approximants of rational functions, and with a factor \(O(\text{log}\,n)\) more time polynomial greatest common divisors (GCD) and extended GCD.
    0 references
    parallel algorithms
    0 references
    linear systems
    0 references
    \(LU\) factorization
    0 references
    dense matrices
    0 references
    sparse matrices
    0 references
    Newton iteration
    0 references
    structured matrices
    0 references
    Toeplitz matrices
    0 references
    displacement rank
    0 references
    polynomial greatest common divisor
    0 references
    resultant
    0 references
    Padé approximation
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references

    Identifiers

    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references