Multitolerance in Distributed Reset

  • Authors:
  • Sandeep S. Kulkarni;Anish Arora

  • Affiliations:
  • -;-

  • Venue:
  • Multitolerance in Distributed Reset
  • Year:
  • 1998

Abstract

A reset of a distributed system is safe if it does not complete "prematurely," i.e., without having reset some process in the system. Safe resets are possible in the presence of certain faults, such as process fail-stops and repairs, but are not always possible in the presence of more general faults, such as arbitrary transients. In this paper, we design a bounded-memory distributed-reset program that possesses two tolerances: (1) in the presence of fail-stops and repairs, it always executes resets safely, and (2) in the presence of a finite number of transient faults, it eventually executes resets safely. Designing this multitolerance in the reset program introduces the novel concern of designing a safety detector that is itself multitolerant. A broad application of our multitolerant safety detector is to make any total program likewise multitolerant.
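
The safety condition stated in the abstract can be illustrated with a minimal Python sketch. This is not the paper's reset program or its safety detector. The names `ResetWave` and `is_safe` are hypothetical, and the sketch assumes a fixed, fault-free set of processes and a single reset wave. It only encodes the definition that a reset must not complete before every process has been reset.

```python
from dataclasses import dataclass, field


@dataclass
class ResetWave:
    """Hypothetical record of one reset wave over a fixed set of processes."""
    processes: frozenset
    reset_done: set = field(default_factory=set)
    completed: bool = False

    def reset(self, pid):
        # Record that process `pid` has been reset in this wave.
        if pid not in self.processes:
            raise ValueError(f"unknown process: {pid}")
        self.reset_done.add(pid)

    def complete(self):
        # Declare the wave finished, whether or not every process was reset.
        self.completed = True


def is_safe(wave):
    """Return False only if the wave completed with some process not reset."""
    return (not wave.completed) or wave.reset_done == set(wave.processes)


if __name__ == "__main__":
    wave = ResetWave(frozenset({"p", "q", "r"}))
    wave.reset("p")
    wave.reset("q")
    wave.complete()       # completes prematurely: "r" was never reset
    print(is_safe(wave))
```

Under the abstract's first tolerance, a check of this kind would have to hold at all times despite fail-stops and repairs. Under the second, it would only have to hold eventually, once transient faults stop occurring.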