Toward automatic policy refinement in repair services for large distributed systems

  • Authors:
  • Moises Goldszmidt; Mihai Budiu; Yue Zhang; Michael Pechuk

  • Affiliations:
  • Microsoft Research; Microsoft Research; Microsoft Windows Azure; Microsoft Windows Azure

  • Venue:
  • ACM SIGOPS Operating Systems Review
  • Year:
  • 2010

Abstract

In order to be economically feasible and to offer high levels of availability and performance, large-scale distributed systems depend on the automation of repair services. While there has been considerable work on mechanisms for such automated services, a framework for evaluating and optimizing the policies governing such mechanisms has been lacking. In this paper we propose one such framework and report on our initial experience in applying it to analyze and optimize the operation of a geo-distributed cloud storage system at Microsoft.