Theoretical Analysis for Communication-Induced Checkpointing Protocols with Rollback-Dependency Trackability

  • Authors:
  • Jichiang Tsai;Sy-Yen Kuo;Yi-Min Wang

  • Affiliations:
  • National Taiwan Univ., Taipei, Taiwan, R.O.C.;National Taiwan Univ., Taipei, Taiwan, R.O.C.;Microsoft Corp., Redmond, WA

  • Venue:
  • IEEE Transactions on Parallel and Distributed Systems
  • Year:
  • 1998

Quantified Score

Hi-index 0.00

Visualization

Abstract

Rollback-Dependency Trackability (RDT) is a property that states that all rollback dependencies between local checkpoints are on-line trackable by using a transitive dependency vector. In this paper, we address three fundamental issues in the design of communication-induced checkpointing protocols that ensure RDT. First, we prove that the following intuition commonly assumed in the literature is in fact false: If a protocol forces a checkpoint only at a stronger condition, then it must take, at most, as many forced checkpoints as a protocol based on a weaker condition. This result implies that the common approach of sharpening the checkpoint-inducing condition by piggybacking more control information on each message may not always yield a more efficient protocol. Next, we prove that there is no optimal on-line RDT protocol that takes fewer forced checkpoints than any other RDT protocol for all possible communication patterns. Finally, since comparing checkpoint-inducing conditions is not sufficient for comparing protocol performance, we present some formal techniques for comparing the performance of several existing RDT protocols.