Rejuvenation and Failure Detection in Partitionable Systems

  • Authors:
  • Christof Fetzer;Karin Högstedt

  • Venue:
  • PRDC '01 Proceedings of the 2001 Pacific Rim International Symposium on Dependable Computing
  • Year:
  • 2001

Abstract

Certain gateways (e.g., some cable or DSL modems) are known to have low reliability and low availability. Most failures of these devices can, however, be "fixed" by rejuvenating the device after a failure has been detected. Such a detection-based rejuvenation strategy permits increasing the availability of these gateways. In the considered scenario, rejuvenation is non-trivial since a failure of such a gateway will leave it partitioned away from the network. In particular, network operators that want to rejuvenate these gateways are in a different network partition and therefore cannot initiate a remote rejuvenation. In this paper we propose a failure-detection-based rejuvenation service and a remote detection service. The rejuvenation service detects and fixes "soft" failures automatically (in one partition), and the detection service detects (in another partition) all rejuvenations exactly once, within a bounded amount of time, even when the gateway is rejuvenated consecutively. The detection service also allows the detection of "hard" failures and the filtering of notifications of soft failures.
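
The abstract does not describe the mechanism behind the two services, so the following is only a minimal sketch of one way the stated properties could be obtained, not the authors' protocol. It assumes the gateway keeps a rejuvenation counter in stable storage and that a remote detector polls it whenever the gateway is reachable. The names `Gateway`, `RemoteDetector`, `watchdog_tick`, `report_epoch`, and `hard_failure_after` are hypothetical, and time is modeled as discrete polls rather than real clocks.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Gateway:
    """Hypothetical gateway; `epoch` stands in for a counter in stable storage."""
    epoch: int = 0
    healthy: bool = True

    def watchdog_tick(self) -> None:
        # Local rejuvenation service: runs inside the gateway's own partition.
        # On a detected soft failure it bumps the counter and restarts.
        if not self.healthy:
            self.epoch += 1
            self.healthy = True

    def report_epoch(self) -> Optional[int]:
        # A failed gateway is partitioned away, so the poll gets no answer.
        return self.epoch if self.healthy else None


@dataclass
class RemoteDetector:
    """Hypothetical detector running in the operator's partition."""
    hard_failure_after: int = 3  # assumed threshold of consecutive missed polls
    last_epoch: int = 0
    missed: int = 0
    events: List[str] = field(default_factory=list)

    def poll(self, gw: Gateway) -> None:
        epoch = gw.report_epoch()
        if epoch is None:
            self.missed += 1
            # Report a suspected hard failure once, when the threshold is hit.
            if self.missed == self.hard_failure_after:
                self.events.append("hard-failure-suspected")
            return
        self.missed = 0
        # One event per counter value not yet seen: consecutive rejuvenations
        # between two polls are each reported exactly once.
        for e in range(self.last_epoch + 1, epoch + 1):
            self.events.append(f"rejuvenation-{e}")
        self.last_epoch = epoch
```

In this sketch, two back-to-back rejuvenations between polls yield the events `rejuvenation-1` and `rejuvenation-2` on the next successful poll, and further polls add nothing. The detection delay is bounded by the polling period plus the time the gateway needs to become reachable again, which the sketch does not model.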