A minimally intrusive low-memory approach to resilience for existing transient solvers (Q1736916)

From MaRDI portal
scientific article

    Statements

    A minimally intrusive low-memory approach to resilience for existing transient solvers (English)
    26 March 2019
    A novel, minimally intrusive approach to adding fault tolerance to existing complex scientific simulation codes is introduced. The approach combines the proposed user-level failure mitigation (ULFM) extensions to the Message Passing Interface (MPI) with the concepts of message-logging and remote in-memory checkpointing. A prototype implementation is applied to Nektar++.
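The combination named in the abstract can be illustrated with a toy, single-process simulation. This is only a sketch under stated assumptions, not the paper's implementation: it makes no MPI or ULFM calls, and the `Rank` class, the ring-averaging "solver", and the helpers `advance`, `checkpoint`, `recover`, and `run` are all hypothetical names introduced here. Each simulated rank stores its checkpoint in the memory of its right neighbour, each sender keeps a log of the messages it has sent since the receiver's last checkpoint, and a failed rank is rebuilt from the remote checkpoint plus the replayed log without rolling back the surviving ranks.

```python
class Rank:
    """Toy stand-in for one MPI process (hypothetical, no real MPI involved)."""
    def __init__(self, rid):
        self.rid = rid
        self.state = float(rid)  # toy solver state
        self.step = 0            # number of time steps completed
        self.remote_ckpt = {}    # checkpoints held for OTHER ranks: owner -> (step, state)
        self.send_log = []       # (step, dest, payload) sent since dest's last checkpoint


def advance(ranks, step):
    """One time step: every rank sends its state to its right neighbour and logs it."""
    n = len(ranks)
    inbox = {}
    for r in ranks:
        dest = (r.rid + 1) % n
        r.send_log.append((step, dest, r.state))  # sender-based message log
        inbox[dest] = r.state
    for r in ranks:
        r.state = 0.5 * (r.state + inbox[r.rid])
        r.step = step + 1


def checkpoint(ranks):
    """Store each rank's state in its right neighbour's memory, then trim the logs."""
    n = len(ranks)
    for r in ranks:
        ranks[(r.rid + 1) % n].remote_ckpt[r.rid] = (r.step, r.state)
        sender = ranks[(r.rid - 1) % n]
        # Messages to r from before its checkpoint are no longer needed for replay.
        sender.send_log = [m for m in sender.send_log
                           if m[1] != r.rid or m[0] >= r.step]


def recover(ranks, failed):
    """Rebuild a lost rank from its remote checkpoint plus replayed logged messages."""
    n = len(ranks)
    ckpt_step, ckpt_state = ranks[(failed + 1) % n].remote_ckpt[failed]
    sender = ranks[(failed - 1) % n]
    new = Rank(failed)  # fresh process: all local memory of the old rank is gone
    new.state, new.step = ckpt_state, ckpt_step
    for step, dest, payload in sender.send_log:
        if dest == failed and step >= ckpt_step:
            # Re-create the rank's own send log while replaying.
            new.send_log.append((step, (failed + 1) % n, new.state))
            new.state = 0.5 * (new.state + payload)
            new.step = step + 1
    # Checkpoints this rank held for its left neighbour are lost until the
    # next checkpoint, so this sketch tolerates one failure per interval.
    ranks[failed] = new


def run(n, steps, ckpt_every, fail_at=None, fail_rank=None):
    """Run the toy solver, optionally losing one rank just before step `fail_at`."""
    ranks = [Rank(i) for i in range(n)]
    checkpoint(ranks)  # initial checkpoint at step 0
    for step in range(steps):
        if step == fail_at:
            recover(ranks, fail_rank)
        advance(ranks, step)
        if (step + 1) % ckpt_every == 0:
            checkpoint(ranks)
    return [r.state for r in ranks]
```

In this sketch, `run(4, 7, 3)` and `run(4, 7, 3, fail_at=5, fail_rank=2)` return the same states, because the recovered rank repeats exactly the arithmetic it performed before the simulated failure. The sketch assumes at least three ranks, so that the checkpoint holder and the message logger of a given rank are different processes.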
    exascale
    fault tolerance
    message-logging
    MPI
    transient solvers
    parallel computing