Consistent Logical Checkpointing

  • Authors: Nitin H. Vaidya
  • Venue: Consistent Logical Checkpointing
  • Year: 1994

Abstract

A "consistent checkpointing" algorithm saves a consistent view of the distributed system state on stable storage. The loss of computation upon a failure can be bounded by taking consistent checkpoints with adequate frequency. The traditional consistent checkpointing algorithms require the different processes to save their state at about the same time. This causes contention for the stable storage, potentially resulting in large overheads. Staggering the checkpoints taken by various processes can reduce the overhead. Some techniques for staggering the checkpoints have been proposed previously. However, these techniques result in "limited staggering" in that not all processes' checkpoints can be staggered. Ideally, one would like to stagger the checkpoints arbitrarily. This report presents a simple approach to arbitrarily stagger the checkpoints. Our approach requires that the processes take consistent logical checkpoints, as compared to consistent physical checkpoints enforced by existing algorithms. This report discusses the proposed approach and the implementation issues. The proposed approach was discussed briefly in [vaidya94tech44]. The proposed algorithm is currently being implemented. The experimental results will be included in a future revision of this report.