Rewind, repair, replay: three R's to dependability

  • Authors:
  • Aaron B. Brown; David A. Patterson

  • Affiliations:
  • University of California at Berkeley, Berkeley, CA; University of California at Berkeley, Berkeley, CA

  • Venue:
  • EW 10: Proceedings of the 10th workshop on ACM SIGOPS European workshop
  • Year:
  • 2002

Abstract

Motivated by the growth of web and infrastructure services and their susceptibility to human operator-related failures, we introduce system-level undo as a recovery mechanism designed to improve service dependability. Undo enables system operators to recover from their inevitable mistakes and furthermore enables retroactive repair of problems that were not fixed quickly enough to prevent detrimental effects. We present the "three R's", a model of undo that matches the needs of human error recovery and retroactive repair; discuss several of the issues raised by this undo model; and introduce an initial architectural framework for undoable systems using the example of an undoable e-mail service system.