Definition and Specification of Accrual Failure Detectors

Authors:
Affiliations:
Venue: DSN '05 Proceedings of the 2005 International Conference on Dependable Systems and Networks
Year: 2005

Citing 0
Cited 6

Latency and bandwidth-minimizing failure detectors
Proceedings of the 2nd ACM SIGOPS/EuroSys European Conference on Computer Systems 2007

Design of the notification system for failure detectors
International Journal of High Performance Computing and Networking

Comparative analysis of quality of service and memory usage for adaptive failure detectors in healthcare systems
IEEE Journal on Selected Areas in Communications - Special issue on wireless and pervasive communications for healthcare

Failure-aware resource management for high-availability computing clusters with distributed virtual machines
Journal of Parallel and Distributed Computing

FaDe: RESTful service for failure detection in SOA environment
PaCT'11 Proceedings of the 11th international conference on Parallel computing technologies

Failure detection in a RESTful way
PPAM'11 Proceedings of the 9th international conference on Parallel Processing and Applied Mathematics - Volume Part II

Abstract

For many years, people have been advocating the development of failure detection as a basic service, but, unfortunately, without meeting much success so far. We believe that this comes from the fact that important system engineering issues have not yet been addressed adequately, thus preventing the definition of a truly generic service. Ultimately, our goal is to define a service that is both simple and expressive, yet powerful enough to support the requirements of many distributed applications. To this end, we consider an alternative interaction model between the service and the applications, called accrual failure detectors. Roughly, an accrual failure detector associates to each process a real value representing a suspicion level, instead of the traditional binary information (i.e., trust vs. suspect). In this paper, we provide a rigorous definition for accrual failure detectors, demonstrate that changing the interaction model leads to no loss in computational power, discuss quality of service issues, and present several possible implementations.
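
To make the interaction model in the abstract concrete, the following is a minimal sketch of a detector that reports a real-valued suspicion level and leaves the trust/suspect decision to each application. It is not one of the implementations presented in the paper: the class name `SimpleAccrualDetector`, the helper `suspects`, the sliding-window size, and the choice of "elapsed time divided by mean heartbeat interval" as the suspicion level are all assumptions made for illustration.

```python
from collections import deque


class SimpleAccrualDetector:
    """Toy accrual-style detector (illustrative assumption, not the paper's).

    Suspicion level = time since last heartbeat / mean heartbeat interval.
    """

    def __init__(self, window=100):
        # Keep only the most recent inter-arrival times.
        self.intervals = deque(maxlen=window)
        self.last_arrival = None

    def heartbeat(self, now):
        """Record a heartbeat received at time `now` (in seconds)."""
        if self.last_arrival is not None:
            self.intervals.append(now - self.last_arrival)
        self.last_arrival = now

    def suspicion(self, now):
        """Return a non-negative suspicion level; 0.0 until two heartbeats are seen."""
        if not self.intervals:
            return 0.0
        mean = sum(self.intervals) / len(self.intervals)
        if mean <= 0:
            return 0.0
        return max(0.0, (now - self.last_arrival) / mean)


def suspects(detector, now, threshold):
    """Application-side binary view: suspect once the level exceeds `threshold`."""
    return detector.suspicion(now) > threshold


if __name__ == "__main__":
    d = SimpleAccrualDetector()
    for t in (0, 1, 2, 3):          # heartbeats arrive once per second
        d.heartbeat(t)

    print(d.suspicion(3.5))         # 0.5: half an interval overdue
    print(d.suspicion(8))           # 5.0: level keeps growing while silent

    # Two applications read the same level at t = 6 with different thresholds.
    print(suspects(d, 6, threshold=2.0))   # True  (aggressive application)
    print(suspects(d, 6, threshold=4.0))   # False (conservative application)
```

In this sketch the detector itself never outputs "trust" or "suspect". Each application picks its own threshold, so an aggressive application and a conservative one can share a single detector instance, which is the kind of generic service the abstract argues for.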