A Fundamental Failure Model for Fault-Tolerant Protocols

  • Authors:
  • Klaus Echtle; Asif Masum

  • Venue:
  • IPDS '00 Proceedings of the 4th International Computer Performance and Dependability Symposium
  • Year:
  • 2000

Abstract

The application area of a distributed system determines the extent to which its protocols must provide fault detection and/or fault tolerance. The highest level of dependability cannot be obtained without substantial overhead. To reduce the number of messages and the time consumed, protocols should be tailored as closely as possible to application requirements and system properties. This paper presents a novel failure classification as an instrument for limiting fault detection and tolerance features to a reasonable failure set. Evaluation of protocols shows that merely excluding "exotic" failures, which are most unlikely to occur, enables a drastic increase in efficiency. Unlike other approaches, our failure classification is based on a completely functional model and on the definition of so-called failure capabilities. This overcomes the limitations of strictly hierarchical and time/value-based models. The new approach provides a framework for precisely specifying common failure assumptions as well as very specialized scenarios, in particular so-called non-cooperative Byzantine failures.