Optimal inter-object correlation when replicating for availability

  • Authors:
  • Haifeng Yu;Phillip B. Gibbons

  • Affiliations:
  • National University of Singapore;Intel Research Pittsburgh

  • Venue:
  • Proceedings of the twenty-sixth annual ACM symposium on Principles of distributed computing
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

Data replication is a key technique for ensuring data availability. Traditionally, researchers have focused on the availability of individual objects, even though user-level tasks (called operations) typically request multiple objects. Our recent experimental study has shown that the assignment of object replicas to machines results in subtle yet dramatic effects on the availability of these operations, even though the availability of individual objects remains the same. This paper is the first to approach the assignment problem from a theoretical perspective, and obtains a series of results regarding assignments that provide the best and the worst availability for user-level operations. We use a range of techniques to obtain our results, from standard combinatorial techniques and hill climbing methods to Janson's inequality (a strong probabilistic tool). Some of the results demonstrate that even quite simple versions of the assignment problem can have surprising answers.