Treating bugs as allergies: a safe method for surviving software failures

  • Authors: Feng Qin; Joseph Tucek; Yuanyuan Zhou
  • Affiliations: Department of Computer Science, University of Illinois at Urbana-Champaign (all authors)
  • Venue: HOTOS'05 Proceedings of the 10th conference on Hot Topics in Operating Systems - Volume 10
  • Year: 2005

Abstract

Many applications demand availability. Unfortunately, software failures greatly reduce system availability. Previous approaches for surviving software failures suffer from several limitations, including requiring application restructuring, failing to address deterministic software bugs, unsafely speculating on program execution, and requiring a long recovery time. This paper proposes an innovative, safe technique, called Rx, that can quickly recover programs from many types of common software bugs, both deterministic and non-deterministic. Our idea, inspired by allergy treatment in real life, is to roll back the program to a recent checkpoint upon a software failure, and then to re-execute the program in a modified environment. We base this idea on the observation that many bugs are correlated with the execution environment, and therefore can be avoided by removing the "allergen" from the environment. Rx requires few to no modifications to applications and provides programmers with additional feedback for bug diagnosis.
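
The rollback-and-re-execute loop described in the abstract can be illustrated with a toy sketch. This is not the Rx implementation: the names `run_with_rx`, `buggy_copy`, and `SoftwareFailure` are hypothetical, the checkpoint is just a deep copy of a Python dictionary, and buffer padding is an assumed example of an environment change that the abstract itself does not name.

```python
import copy


class SoftwareFailure(Exception):
    """Stands in for a failure detected at run time (hypothetical)."""


def run_with_rx(state, step, environments):
    """Run step(state, env); on failure, roll back and retry in the next env.

    Returns (result, env_that_worked). Raises SoftwareFailure if every
    candidate environment still triggers the failure.
    """
    checkpoint = copy.deepcopy(state)  # checkpoint taken before execution
    for env in environments:
        try:
            return step(state, env), env
        except SoftwareFailure:
            # Roll back: discard partial effects and restore the checkpoint.
            state.clear()
            state.update(copy.deepcopy(checkpoint))
    raise SoftwareFailure("no environment change avoided the failure")


def buggy_copy(state, env):
    """Toy deterministic bug: writes one element past the requested size."""
    buf = [0] * (state["size"] + env["padding"])
    state["writes"] += 1  # side effect that a rollback must undo
    for i in range(state["size"] + 1):  # off-by-one overrun
        if i >= len(buf):
            raise SoftwareFailure("buffer overflow")
        buf[i] = i
    return len(buf)


# Assumed environments: first the unmodified one, then one with padding.
environments = [{"padding": 0}, {"padding": 8}]
state = {"size": 4, "writes": 0}
result, env = run_with_rx(state, buggy_copy, environments)
print(result, env, state)
```

In this sketch the first attempt fails in the unmodified environment, the side effect on `state` is undone, and the second attempt succeeds because the padded buffer absorbs the overrun. The environment that worked (`env`) is also the kind of diagnostic feedback the abstract mentions, since it hints at which class of bug was triggered.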