Definition and Specification of Accrual Failure Detectors

  • Authors:
  • Affiliations:
  • Venue: DSN '05 Proceedings of the 2005 International Conference on Dependable Systems and Networks
  • Year: 2005

Abstract

For many years, people have advocated developing failure detection as a basic service, but so far without much success. We believe this is because important system engineering issues have not yet been adequately addressed, which has prevented the definition of a truly generic service. Ultimately, our goal is to define a service that is both simple and expressive, yet powerful enough to support the requirements of many distributed applications. To this end, we consider an alternative interaction model between the service and the applications, called accrual failure detectors. Roughly, an accrual failure detector associates with each process a real value representing a suspicion level, instead of the traditional binary information (i.e., trust vs. suspect). In this paper, we provide a rigorous definition of accrual failure detectors, demonstrate that changing the interaction model leads to no loss in computational power, discuss quality-of-service issues, and present several possible implementations.
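
To make the interaction model concrete, the following is a minimal sketch of an accrual-style detector, not one of the implementations presented in the paper. It assumes heartbeat-based monitoring and uses a deliberately simple suspicion level: the time elapsed since the last heartbeat divided by the mean observed gap between heartbeats. The class name SimpleAccrualDetector, the function suspects, and the window parameter are all hypothetical names introduced for illustration.

```python
from collections import deque


class SimpleAccrualDetector:
    """Hypothetical accrual-style detector for one monitored process.

    Assumption: the suspicion level is elapsed time since the last
    heartbeat, measured in units of the mean heartbeat gap.
    """

    def __init__(self, window=100):
        # Keep only the most recent heartbeat gaps.
        self.intervals = deque(maxlen=window)
        self.last_arrival = None

    def heartbeat(self, now):
        """Record a heartbeat received at time `now` (in seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def suspicion(self, now):
        """Return a non-negative real suspicion level (0.0 without history)."""
        if self.last_arrival is None or not self.intervals:
            return 0.0
        mean_gap = sum(self.intervals) / len(self.intervals)
        if mean_gap <= 0:
            return 0.0
        return (now - self.last_arrival) / mean_gap


def suspects(detector, now, threshold):
    """Application-side policy: derive trust vs. suspect from the level."""
    return detector.suspicion(now) > threshold


if __name__ == "__main__":
    d = SimpleAccrualDetector()
    for t in (0.0, 1.0, 2.0, 3.0):
        d.heartbeat(t)
    # The level keeps growing while no heartbeat arrives.
    print(d.suspicion(3.5), d.suspicion(6.0))
    # Two applications can apply different thresholds to the same level.
    print(suspects(d, 6.0, threshold=2.0), suspects(d, 6.0, threshold=5.0))
```

In this sketch the detector only reports a number. Each application chooses its own threshold, so an aggressive application and a conservative one can share the same monitoring service while recovering the traditional binary trust vs. suspect view.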