Checkpointing and Rollback-Recovery for Distributed Systems

From MaRDI portal
Publication:3740209

DOI: 10.1109/TSE.1987.232562
zbMath: 0603.68018
MaRDI QID: Q3740209

Richard Koo, Sam Toueg

Publication date: 1987

Published in: IEEE Transactions on Software Engineering




Related Items (26)

Consistent global checkpoints based on direct dependency tracking
A compositional framework for fault tolerance by specification transformation
Evaluations of domino-free communication-induced checkpointing protocols
Resolving error propagation in distributed systems
Optimal checkpointing interval of a communication system with rollback recovery
An efficient backup warning policy for a hard disk
FNB: fast non-blocking coordinated checkpointing protocol for distributed systems
An optimistic checkpointing and message logging approach for consistent global checkpoint collection in distributed systems
Garbage collection in uncoordinated checkpointing algorithms
Efficient algorithms for optimistic crash recovery
The inhibition spectrum and the achievement of causal consistency
Communication-based prevention of useless checkpoints in distributed computations
An efficient approach for constructing reliable distributed applications
An optimality proof for asynchronous recovery algorithms in distributed systems
Concurrent common knowledge: Defining agreement for asynchronous systems
Transformation of programs for fault-tolerance
Second-level algorithms, superrecursivity, and recovery problem in distributed systems
On the no-Z-cycle property in distributed executions
Adaptive checkpointing in message passing distributed systems
Optimised Recovery with a Coordinated Checkpoint/Rollback Protocol for Domain Decomposition Applications
GUARANTEED MUTUALLY CONSISTENT CHECKPOINTING IN DISTRIBUTED COMPUTATIONS
Checkpointing with mutable checkpoints.
Rollback-dependency trackability: A minimal characterization and its protocol
A distributed error recovery technique and its implementation and application on UNIX
Virus tests to maximize availability of software systems
Interval consistency of asynchronous distributed computations



