BitVault: a highly reliable distributed data retention platform

  • Authors: Zheng Zhang; Qiao Lian; Shiding Lin; Wei Chen; Yu Chen; Chao Jin

  • Affiliations: Microsoft Research, Asia (all authors)

  • Venue: ACM SIGOPS Operating Systems Review - Systems work at Microsoft Research
  • Year: 2007

Abstract

This paper summarizes our experience designing and implementing BitVault: a content-addressable retention platform for large volumes of reference data -- seldom-changing information that needs to be retained for a long time. BitVault uses "smart bricks" as its building blocks to lower hardware cost. The challenges are to keep management costs low in a system that scales from one brick to tens of thousands, to ensure reliability, and to deliver a simple design. Our design incorporates peer-to-peer (P2P) technologies for self-management and self-healing, and uses massively parallel repair to reduce the system's vulnerability to data loss. The simplicity of the architecture relies on an eventually reliable membership service provided by a perfect one-hop distributed hash table (DHT). BitVault's object-driven repair model yields a last-replica recall guarantee that is independent of the failure scenario: so long as the last copy of a data object remains in the system, that data can be retrieved and its replication degree can be restored. A prototype has been implemented. Theoretical analysis, simulations, and experiments have been conducted to validate the design of BitVault.
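
To make the ideas of content addressing, one-hop placement, and object-driven repair concrete, here is a minimal toy sketch in Python. It is not BitVault's implementation: the class `Cluster`, the replication degree `REPLICAS = 3`, the use of SHA-256, and the "k successors on a hash ring" placement rule are all assumptions chosen for illustration, and the model covers brick failures only (no joins, no network, no concurrency).

```python
import hashlib
from bisect import bisect_right

REPLICAS = 3  # assumed replication degree; not a figure from the paper


def key_of(data: bytes) -> int:
    """Content address: an object's key is a hash of its bytes."""
    return int.from_bytes(hashlib.sha256(data).digest(), "big")


def node_id(name: str) -> int:
    """Place a brick on the ring by hashing its (hypothetical) name."""
    return key_of(name.encode())


class Cluster:
    """Toy model in which every brick sees the full membership (one hop)."""

    def __init__(self, names):
        self.stores = {name: {} for name in names}  # brick -> {key: data}

    def owners(self, key: int):
        """Return the live bricks that follow the key on the hash ring."""
        ring = sorted(self.stores, key=node_id)
        ids = [node_id(n) for n in ring]
        start = bisect_right(ids, key)
        count = min(REPLICAS, len(ring))
        return [ring[(start + i) % len(ring)] for i in range(count)]

    def put(self, data: bytes) -> int:
        key = key_of(data)
        for name in self.owners(key):
            self.stores[name][key] = data
        return key

    def get(self, key: int):
        """Ask the current owners directly; no multi-hop routing."""
        for name in self.owners(key):
            if key in self.stores[name]:
                return self.stores[name][key]
        return None

    def fail(self, name: str):
        """Simulate a brick failure: its copies are lost."""
        del self.stores[name]

    def copies(self, key: int) -> int:
        return sum(key in store for store in self.stores.values())

    def repair(self):
        """Object-driven repair: each surviving copy restores missing replicas."""
        for store in list(self.stores.values()):
            for key, data in list(store.items()):
                for name in self.owners(key):
                    self.stores[name].setdefault(key, data)


if __name__ == "__main__":
    cluster = Cluster([f"brick{i}" for i in range(10)])
    key = cluster.put(b"reference data")
    first, second, _ = cluster.owners(key)
    cluster.fail(first)
    cluster.fail(second)
    print(cluster.copies(key))  # one surviving copy
    cluster.repair()
    print(cluster.copies(key))  # replication degree restored
```

In this sketch the repair loop is driven by the objects that survive rather than by a record of which bricks failed, which is why a single remaining copy is enough to restore the full replication degree. A real system would run such repairs on many bricks in parallel; the sequential loop here only illustrates the logic.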