CatchAndRetry: extending exceptions to handle distributed system failures and recovery

  • Authors:
  • Emre Kiciman; Benjamin Livshits; Madanlal Musuvathi

  • Affiliations:
  • Microsoft Research; Microsoft Research; Microsoft Research

  • Venue:
  • Proceedings of the Fifth Workshop on Programming Languages and Operating Systems
  • Year:
  • 2009

Abstract

In this paper, we present CatchAndRetry, an extension of the traditional exception mechanism that provides language-level support for common recovery techniques in distributed systems. We motivate and justify our design by analyzing several case studies taken from the context of Facebook. CatchAndRetry is general enough to apply to multiple tiers of a distributed application; throughout this paper, we illustrate it with examples of its use within both a large-scale distributed server-side application running in a data center and a JavaScript client-side application running within a web browser.