GPU acceleration of splitting schemes applied to differential matrix equations (Q2287871)
From MaRDI portal
Revision as of 06:35, 5 March 2024
scientific article
Language | Label | Description | Also known as |
---|---|---|---|
English | GPU acceleration of splitting schemes applied to differential matrix equations | scientific article | |
Statements
GPU acceleration of splitting schemes applied to differential matrix equations (English)
22 January 2020
The present work discusses a parallel GPU implementation of splitting schemes for matrix differential equations of Lyapunov and Riccati type, which have the general form \(\dot{P} = A^{\mathsf{T}}P + PA + Q + G(P)\). It compares different variants of splitting schemes (Lie and Strang splitting) that use Leja point interpolation to compute the actions of matrix exponentials. In these schemes, the original problem is divided into simpler sub-problems \(\dot{P} = F_1(P)\) and \(\dot{P} = F_2(P)\), which are then solved separately, one after the other. To enable efficient computation in the large-scale setting, it is assumed that \(P\) exhibits low-rank structure. The considered methods are implemented in MATLAB, exploiting its built-in GPU support via NVIDIA's CUDA library. The work only considers the autonomous case; consequently, one can avoid the costly re-computations that would arise for time-dependent problems. Four numerical examples are used to test the proposed methodology. For two of these examples, experimental convergence results against the solutions obtained by the MATLAB routine \texttt{ode15s} are reported. For the remaining two examples, the problem size is increased and self-convergence results are provided (i.e., convergence against a solution computed with the same method at a finer resolution level). The work concludes with a performance analysis comparing the GPU implementation against the CPU version. The reported speedup is between \(3\times\) and \(10\times\), depending on the matrix size.
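As a rough illustration of the difference between Lie and Strang splitting, the following Python sketch applies both to a scalar toy analogue of the equation above, \(\dot{p} = 2ap + q - p^2\). Everything in it is an assumption made for illustration and is not taken from the reviewed paper: the choice \(G(p) = -p^2\), the split into a linear part \(F_1(p) = 2ap + q\) and a quadratic part \(F_2(p) = -p^2\), the coefficient values, and all function names. The sub-problems are solved with closed-form flows rather than with Leja point interpolation, and no low-rank structure or GPU is involved.

```python
import math

# Toy scalar analogue of dP/dt = A^T P + P A + Q + G(P), assuming G(P) = -P^2:
#   dp/dt = 2*a*p + q - p**2
# The coefficient values are arbitrary illustrative choices.
A_COEF = -1.0
Q_COEF = 1.0


def flow_linear(p, h):
    """Exact flow of the linear sub-problem dp/dt = 2*a*p + q over a step h."""
    e = math.exp(2.0 * A_COEF * h)
    return e * p + Q_COEF * (e - 1.0) / (2.0 * A_COEF)


def flow_quadratic(p, h):
    """Exact flow of the quadratic sub-problem dp/dt = -p**2 over a step h."""
    return p / (1.0 + h * p)


def lie_step(p, h):
    """Lie splitting: full step of one flow, then full step of the other."""
    return flow_quadratic(flow_linear(p, h), h)


def strang_step(p, h):
    """Strang splitting: half step, full step of the other flow, half step."""
    p = flow_linear(p, 0.5 * h)
    p = flow_quadratic(p, h)
    return flow_linear(p, 0.5 * h)


def integrate(step, p0, t_end, n_steps):
    """Apply a one-step splitting scheme n_steps times on [0, t_end]."""
    h = t_end / n_steps
    p = p0
    for _ in range(n_steps):
        p = step(p, h)
    return p


def exact(p0, t):
    """Closed-form solution of the scalar equation, valid for |p0 - a| < s."""
    s = math.sqrt(A_COEF ** 2 + Q_COEF)
    c = math.atanh((p0 - A_COEF) / s)
    return A_COEF + s * math.tanh(s * t + c)


def observed_order(step, p0=0.0, t_end=1.0, n_steps=64):
    """Estimate the convergence order by halving the step size once."""
    ref = exact(p0, t_end)
    err_coarse = abs(integrate(step, p0, t_end, n_steps) - ref)
    err_fine = abs(integrate(step, p0, t_end, 2 * n_steps) - ref)
    return math.log2(err_coarse / err_fine)


if __name__ == "__main__":
    print("Lie splitting, observed order:   ", observed_order(lie_step))
    print("Strang splitting, observed order:", observed_order(strang_step))
```

On this toy problem the observed orders should come out close to the classical values of one for Lie splitting and two for Strang splitting. Replacing `exact` with a run of the same scheme at a much finer step size would mimic the self-convergence test described in the abstract.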
differential Lyapunov equations
differential Riccati equations
large scale
splitting schemes
GPU acceleration