On the efficient implementation of preconditioned s-step conjugate gradient methods on multiprocessors with memory hierarchy (Q1822895)

From MaRDI portal
Revision as of 00:15, 15 February 2024 by RedirectionBot (talk | contribs) (‎Changed an Item)
scientific article
Language Label Description Also known as
English
On the efficient implementation of preconditioned s-step conjugate gradient methods on multiprocessors with memory hierarchy
scientific article

    Statements

    On the efficient implementation of preconditioned s-step conjugate gradient methods on multiprocessors with memory hierarchy (English)
    0 references
    1989
    0 references
    The use of the preconditioned conjugate gradient (CG) method in approximating the solution of a sparse symmetric and positive definite linear system of equations \(Ax=f\) is considered. The main goal of the work is to prove that the preconditioned s-CG method can be efficiently implemented so that only one global memory sweep for s-CG iterations is needed. Polynomial and incomplete Cholesky preconditionings are used. The preconditioned CG algorithm consists in applying the CG method to the transformed system \((K^{1/2}AK^{1/2})K^{-1/2}x=K^{1/2}f,\) where K is the preconditioning matrix and is selected as an approximation to the inverse of A. The s-step CG generalizes the CG method and has improved data locality and parallel properties. It can be organized so that only one sweep through the data is needed to complete one step of s-CG which is equivalent to s successive steps of CG. Two manners to choose the matrix K in order to produce efficient theoretical implementation of the preconditioned s-CG are presented. For the incomplete Cholesky preconditioning a completely parallelizable form of the preconditioner is obtained. The authors describe the manner in which different parts of s-CG can be implemented efficiently on vector processor with local memory having a register-to-register or memory-to- memory organization. Some experimental results obtained on the ALLIANT FX/8 multiprocessor system are presented. For some linear systems obtained from the discretization of elliptic PDEs the obtained computational results show that on a multiprocessor system with memory hierarchy the 5-step preconditioned CG is faster than the standard preconditioned CG.
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    0 references
    s-step conjugate gradient methods
    0 references
    parallel algorithms
    0 references
    numerical examples
    0 references
    preconditioned conjugate gradient method
    0 references
    incomplete Cholesky preconditionings
    0 references
    multiprocessor system
    0 references