Toward common patterns for distributed, concurrent, fault-tolerant code

  • Authors:
  • Ryan Stutsman;John Ousterhout

  • Affiliations:
  • Stanford University;Stanford University

  • Venue:
  • HotOS'13 Proceedings of the 14th USENIX conference on Hot Topics in Operating Systems
  • Year:
  • 2013

Quantified Score

Hi-index 0.00

Visualization

Abstract

There are no widely accepted design patterns for writing distributed, concurrent, fault-tolerant code. Each programmer develops her own techniques for writing this type of complex software. The use of a common pattern for fault-tolerant programming has the potential to produce correct code more quickly and increase shared understanding between developers. We describe rules, tasks, and pools, patterns extracted from the development of RAMCloud, a fault-tolerant datacenter storage system. We illustrate their application and discuss their relationship to concurrent programming models. Our goal is to generate discussion that will ultimately lead to common techniques for fault-tolerant programming.