Enhancing The Fault-Tolerance of Nonmasking Programs

  • Authors:
  • Sandeep S. Kulkarni; Ali Ebnenasir

  • Venue:
  • ICDCS '03 Proceedings of the 23rd International Conference on Distributed Computing Systems
  • Year:
  • 2003

Abstract

In this paper, we focus on automated techniques to enhance the fault-tolerance of a nonmasking fault-tolerant program to masking. A masking program continually satisfies its specification even if faults occur. By contrast, a nonmasking program merely guarantees that after faults stop occurring, the program recovers to states from where it continually satisfies its specification. Until the recovery is complete, however, a nonmasking program can violate its (safety) specification. Thus, the problem of enhancing fault-tolerance from nonmasking to masking requires that safety be added and recovery be preserved. We focus on this enhancement problem for high atomicity programs, where each process can read all variables, and for distributed programs, where restrictions are imposed on what processes can read and write. We present a sound and complete algorithm for high atomicity programs and a sound algorithm for distributed programs. We also argue that our algorithms are simpler than previous algorithms, where masking fault-tolerance is added to a fault-intolerant program. Hence, these algorithms can partially reap the benefits of automation when the cost of adding masking fault-tolerance to a fault-intolerant program is high. To illustrate these algorithms, we show how the masking fault-tolerant programs for triple modular redundancy and Byzantine agreement can be obtained by enhancing the fault-tolerance of the corresponding nonmasking versions. We also discuss how the derivation of these programs is simplified when we begin with a nonmasking fault-tolerant program.
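
The distinction between nonmasking and masking fault-tolerance can be illustrated with a toy version of triple modular redundancy. The Python sketch below is not the synthesis algorithm from the paper; it is a hand-written illustration under assumed names (`nonmasking_step`, `masking_step`, `run`, and the `UNSET` marker are all hypothetical), and it assumes that a fault corrupts at most one of three inputs before the program runs. The nonmasking version may briefly output a corrupted value before recovering, whereas the masking version only ever outputs a majority value.

```python
from collections import Counter

UNSET = None  # hypothetical marker for "output not yet assigned"


def majority(inputs):
    """Return the value held by at least two of the three inputs, else UNSET."""
    value, count = Counter(inputs).most_common(1)[0]
    return value if count >= 2 else UNSET


def nonmasking_step(out, inputs):
    """Recovery without safety: may copy a corrupted input, then corrects it."""
    if out is UNSET:
        return inputs[0]          # blindly trusts the first input
    m = majority(inputs)
    if m is not UNSET and out != m:
        return m                  # recovery action: converge to the majority
    return out


def masking_step(out, inputs):
    """Safety added: the output is assigned only from a majority value."""
    if out is UNSET:
        return majority(inputs)
    return out


def run(step, inputs, rounds=3):
    """Apply `step` repeatedly and record the output after each round."""
    out, trace = UNSET, []
    for _ in range(rounds):
        out = step(out, inputs)
        trace.append(out)
    return trace


# Assumed scenario: the correct value is 1 and the first input was corrupted to 7.
corrupted = [7, 1, 1]
print(run(nonmasking_step, corrupted))  # [7, 1, 1]: safety violated, then recovery
print(run(masking_step, corrupted))     # [1, 1, 1]: safety holds throughout
```

In this sketch, moving from the nonmasking to the masking version amounts to removing the transition that can set the output to a non-majority value while keeping the program able to reach a correct output, which mirrors the "add safety, preserve recovery" framing of the abstract in a very simplified form.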