Evaluating Distributed Checkpointing Protocols

  • Authors:
  • Adnan Agbaria; Ari Freund; Roy Friedman

  • Venue:
  • ICDCS '03 Proceedings of the 23rd International Conference on Distributed Computing Systems
  • Year:
  • 2003

Abstract

This paper presents an objective measure, called overhead ratio, for evaluating distributed checkpointing protocols. This measure extends previous evaluation schemes by incorporating several additional parameters that are inherent in distributed environments. In particular, we take into account the rollback propagation of the protocol, which impacts the length of the recovery process, and therefore the expected program run-time in executions that involve failures and recoveries. The paper also analyzes several known protocols and compares their overhead ratios.