The Impact of Recovery Mechanisms on the Likelihood of Saving Corrupted State

  • Authors:
  • Subhachandra Chandra;Peter M. Chen

  • Venue:
  • ISSRE '02 Proceedings of the 13th International Symposium on Software Reliability Engineering
  • Year:
  • 2002

Abstract

Recovery systems must save state before a failure occurs to enable the system to recover from the failure. However, recovery will fail if the recovery system saves any state corrupted by the fault. The frequency and comprehensiveness of how a recovery system saves state has a major effect on how often the recovery system inadvertently saves corrupted state. This paper explores and measures that effect. We measure how often software faults in the application and operating system cause real applications to save corrupted state when using different types of recovery systems. We find that generic recovery techniques, such as checkpointing and logging, work well for faults in the operating system. However, we find that they do not work well for faults in the application because the very actions taken to enable recovery often corrupt the state upon which successful recovery depends.