GPU acceleration of splitting schemes applied to differential matrix equations (Q2287871)

From MaRDI portal
scientific article

    Statements

    GPU acceleration of splitting schemes applied to differential matrix equations (English)
    22 January 2020
    This work discusses a parallel GPU implementation of splitting schemes for matrix differential equations of Lyapunov and Riccati type, of the general form \(\dot{P} = A^{\mathsf{T}}P + PA + Q + G(P)\). A comparison of different variants of splitting schemes (Lie and Strang splitting), based on Leja point interpolation for computing matrix exponential actions, is provided. In these schemes, the original problem is divided into simpler sub-problems \(\dot{P} = F_1(P)\) and \(\dot{P} = F_2(P)\), which are then solved separately, one after the other. To make computation efficient in the large-scale setting, \(P\) is assumed to exhibit low-rank structure. The methods are implemented in MATLAB, exploiting its built-in GPU support via NVIDIA's CUDA library. Only the autonomous case is considered; consequently, the costly re-computations that time-dependent coefficients would require are avoided. The methodology is tested on four numerical examples. For two of them, experimental convergence results against solutions obtained with the MATLAB routine \texttt{ode15s} are reported. For the remaining two, the problem size is increased and self-convergence results are provided (i.e., convergence against a solution computed with the same method at a finer resolution level). The work concludes with a performance analysis comparing the GPU implementation against the CPU version; the speedup is between \(3\times\) and \(10\times\), depending on the matrix size.
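The idea behind Lie and Strang splitting can be illustrated on a toy problem. The following is a minimal sketch, not the reviewed implementation: it replaces the matrix equation by a scalar one, assuming \(A = a\), \(Q = q\) and \(G(P) = -b^2 P^2\), and it assumes the split \(F_1(P) = 2aP + q\), \(F_2(P) = -b^2 P^2\). The coefficient values, the function names, and the comparison against a closed-form scalar solution are all illustrative choices; no low-rank structure, Leja interpolation, or GPU code is involved.

```python
import math

# Toy scalar stand-in for dP/dt = A^T P + P A + Q + G(P).
# ASSUMPTIONS (not from the source): A = a, Q = q, G(P) = -b^2 P^2,
# split as F1(P) = 2*a*P + q and F2(P) = -b^2 * P^2.
A_COEF, Q_COEF, B_COEF = -1.0, 1.0, 1.0


def flow_affine(p, h):
    """Exact flow over time h of dp/dt = 2*a*p + q (sub-problem F1, a != 0)."""
    e = math.exp(2.0 * A_COEF * h)
    return e * p + Q_COEF * (e - 1.0) / (2.0 * A_COEF)


def flow_quadratic(p, h):
    """Exact flow over time h of dp/dt = -b^2 * p^2 (sub-problem F2)."""
    return p / (1.0 + B_COEF * B_COEF * h * p)


def lie_splitting(p, h, n):
    """n Lie steps: full F1 step followed by full F2 step (first order)."""
    for _ in range(n):
        p = flow_quadratic(flow_affine(p, h), h)
    return p


def strang_splitting(p, h, n):
    """n Strang steps: half F1, full F2, half F1 (second order)."""
    for _ in range(n):
        p = flow_affine(p, 0.5 * h)
        p = flow_quadratic(p, h)
        p = flow_affine(p, 0.5 * h)
    return p


def exact(p0, t):
    """Closed-form solution of the scalar equation dp/dt = -b^2 p^2 + 2 a p + q."""
    s = math.sqrt(A_COEF ** 2 + B_COEF ** 2 * Q_COEF)
    p_plus = (A_COEF + s) / B_COEF ** 2    # stable equilibrium
    p_minus = (A_COEF - s) / B_COEF ** 2   # unstable equilibrium
    u = (p0 - p_plus) / (p0 - p_minus) * math.exp(-2.0 * s * t)
    return (p_plus - u * p_minus) / (1.0 - u)


def error(method, n, p0=0.0, t_end=1.0):
    """Absolute error at t_end when using n equal steps."""
    return abs(method(p0, t_end / n, n) - exact(p0, t_end))


if __name__ == "__main__":
    for n in (25, 50, 100):
        print(n, error(lie_splitting, n), error(strang_splitting, n))
```

Halving the step size should roughly halve the Lie error and quarter the Strang error, which is the kind of experimental convergence check the review describes, here against a closed-form solution rather than \texttt{ode15s}.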
    differential Lyapunov equations
    differential Riccati equations
    large scale
    splitting schemes
    GPU acceleration

    Identifiers