Fail-Stutter Fault Tolerance

  • Authors:
  • Remzi H. Arpaci-Dusseau; Andrea C. Arpaci-Dusseau

  • Venue:
  • HOTOS '01 Proceedings of the Eighth Workshop on Hot Topics in Operating Systems
  • Year:
  • 2001

Abstract

Traditional fault models present system designers with two extremes: the Byzantine fault model, which is general and therefore difficult to apply, and the fail-stop fault model, which is easier to employ but does not accurately capture modern device behavior. To address this gap, we introduce the concept of fail-stutter fault tolerance, a realistic and yet tractable fault model that accounts for both absolute failure and a new range of performance failures common in modern components. Systems built under the fail-stutter model will likely perform well, be highly reliable and available, and be easier to manage when deployed.