Self-healing multitier architectures using cascading rescue points

  • Authors:
  • Angeliki Zavou;Georgios Portokalidis;Angelos D. Keromytis

  • Affiliations:
  • Columbia University, New York, NY;Columbia University, New York, NY;Columbia University, New York, NY

  • Venue:
  • Proceedings of the 28th Annual Computer Security Applications Conference
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Software bugs and vulnerabilities cause serious problems to both home users and the Internet infrastructure, limiting the availability of Internet services, causing loss of data, and reducing system integrity. Software self-healing using rescue points (RPs) is a known mechanism for recovering from unforeseen errors. However, applying it on multitier architectures can be problematic because certain actions, like transmitting data over the network, cannot be undone. We propose cascading rescue points (CRPs) to address the state inconsistency issues that can arise when using traditional RPs to recover from errors in interconnected applications. With CRPs, when an application executing within a RP transmits data, the remote peer is notified to also perform a checkpoint, so the communicating entities checkpoint in a coordinated, but loosely coupled way. Notifications are also sent when RPs successfully complete execution, and when recovery is initiated, so that the appropriate action is performed by remote parties. We developed a tool that implements CRPs by dynamically instrumenting binaries and transparently injecting notifications in the already established TCP channels between applications. We tested our tool with various applications, including the MySQL and Apache servers, and show that it allows them to successfully recover from errors, while incurring moderate overhead between 4.54% and 71.56%.