Communication-Induced Determination of Consistent Snapshots
IEEE Transactions on Parallel and Distributed Systems
Abstract: This paper presents a distributed coordinated checkpointing algorithm. A consistent global checkpoint is a set of states in which no message is recorded as received by one process but as not yet sent by another. The algorithm obtains a consistent global checkpoint for any checkpoint initiation by any process. Under Chandy and Lamport's assumption that one consistent global checkpoint is obtained for a set of concurrent checkpoint initiations, the algorithm minimizes the total number of checkpoints. The paper then modifies this assumption to reduce the number of checkpoints further.
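
The consistency condition stated in the abstract can be illustrated with a minimal sketch. This is not the paper's algorithm; it only checks whether a given set of local checkpoints is consistent. The names `Message`, `orphan_messages`, and `is_consistent`, and the modelling of each process's history as integer event indices, are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    sender: str
    receiver: str
    send_index: int  # position of the send event in the sender's local history
    recv_index: int  # position of the receive event in the receiver's local history


def orphan_messages(cut, messages):
    """Return the messages recorded as received but not yet sent.

    `cut` maps each process name to the index of the last local event
    included in its checkpoint (an assumed representation).
    """
    return [
        m for m in messages
        if m.recv_index <= cut[m.receiver] and m.send_index > cut[m.sender]
    ]


def is_consistent(cut, messages):
    """A global checkpoint is consistent when it has no orphan messages."""
    return not orphan_messages(cut, messages)


# One message: sent as p's 3rd local event, received as q's 2nd local event.
msgs = [Message("p", "q", send_index=3, recv_index=2)]

early_sender = {"p": 2, "q": 2}  # q records the receive, p has not recorded the send
both_recorded = {"p": 3, "q": 2}  # send and receive are both recorded
in_transit = {"p": 3, "q": 1}     # send recorded, receive not yet recorded
```

In this sketch only `early_sender` violates the condition. A message that is recorded as sent but not yet received (`in_transit`) is merely in flight and does not make the checkpoint inconsistent.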