A global checkpointing model for error recovery

  • Authors: Krishna Kant
  • Affiliations: Northwestern University, Evanston, Illinois
  • Venue: AFIPS '83 Proceedings of the May 16-19, 1983, national computer conference
  • Year: 1983

Abstract

The paper proposes a new concept for providing software fault tolerance in concurrent systems. It combines the traditional global-checkpointing mechanism with the recovery-block concept to obtain an easily implementable error-recovery mechanism. For moderate to high process interaction, this mechanism incurs smaller overhead than the schemes considered in the past, which are based on local checkpointing. A model for computing the optimum checkpointing interval is also presented. A particular distribution is hypothesized for the coverage of the recovery, and the behavior of the model is studied in detail for this case.