BitVault: a highly reliable distributed data retention platform

Authors:
Zheng Zhang;Qiao Lian;Shiding Lin;Wei Chen;Yu Chen;Chao Jin
Affiliations:
Microsoft Research, Asia;Microsoft Research, Asia;Microsoft Research, Asia;Microsoft Research, Asia;Microsoft Research, Asia;Microsoft Research, Asia
Venue:
ACM SIGOPS Operating Systems Review - Systems work at Microsoft Research
Year:
2007

Abstract

This paper summarizes our experience designing and implementing BitVault, a content-addressable retention platform for large volumes of reference data: seldom-changing information that must be retained for a long time. BitVault uses "smart bricks" as building blocks to lower hardware cost. The challenges are to keep management costs low in a system that scales from one brick to tens of thousands, to ensure reliability, and to deliver a simple design. Our design incorporates peer-to-peer (P2P) technologies for self-management and self-healing, and uses massively parallel repair to reduce the system's vulnerability to data loss. The simplicity of the architecture relies on an eventually reliable membership service provided by a perfect one-hop distributed hash table (DHT). Its object-driven repair model yields a last-replica recall guarantee that is independent of the failure scenario: as long as the last copy of a data object remains in the system, that data can be retrieved and its replication degree restored. A prototype has been implemented, and theoretical analysis, simulations, and experiments have been conducted to validate the design.
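
The ideas named in the abstract (content addressing, one-hop lookup over a full membership view, and object-driven repair) can be illustrated with a toy model. The sketch below is not BitVault's protocol, which the abstract does not specify. The class name Cluster, the methods owners and repair, the use of SHA-1 as the content hash, the successor-based placement, and the replication degree of 3 are all assumptions made for illustration.

```python
import hashlib
from bisect import bisect_right


def key_of(data: bytes) -> int:
    """Content address: the SHA-1 digest of the bytes, as an integer (assumed hash)."""
    return int.from_bytes(hashlib.sha1(data).digest(), "big")


class Cluster:
    """Toy model in which every brick sees the full membership list,
    so the owners of a key are found locally, without multi-hop routing."""

    def __init__(self, brick_names, degree=3):
        self.degree = degree  # hypothetical target replication degree
        # Map: brick id on the ring -> {object key: object bytes}
        self.bricks = {key_of(name.encode()): {} for name in brick_names}

    def owners(self, key):
        """Return the live bricks that follow `key` clockwise on the ring."""
        ring = sorted(self.bricks)
        start = bisect_right(ring, key)
        count = min(self.degree, len(ring))
        return [ring[(start + i) % len(ring)] for i in range(count)]

    def put(self, data):
        """Store `data` on each owner brick and return its content address."""
        key = key_of(data)
        for brick in self.owners(key):
            self.bricks[brick][key] = data
        return key

    def get(self, key):
        """Return the object if any live brick still holds a copy, else None."""
        for store in self.bricks.values():
            if key in store:
                return store[key]
        return None

    def fail(self, brick_id):
        """Simulate a brick failure: the brick and its copies disappear."""
        del self.bricks[brick_id]

    def repair(self):
        """Object-driven repair: every surviving object is copied to whichever
        of its current owners lack it, restoring the replication degree."""
        surviving = {}
        for store in self.bricks.values():
            surviving.update(store)
        for key, data in surviving.items():
            for brick in self.owners(key):
                self.bricks[brick].setdefault(key, data)

    def replica_count(self, key):
        """Number of live bricks currently holding `key`."""
        return sum(key in store for store in self.bricks.values())
```

In this model, storing an object on a six-brick cluster, failing two of its three owners, and then calling repair brings the replica count back to three. That mirrors the last-replica recall property described above: one surviving copy is enough to retrieve the object and rebuild its replicas. The repair loop here runs sequentially in one process, whereas the abstract describes repair as massively parallel across bricks.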