R-ADMAD: high reliability provision for large-scale de-duplication archival storage systems

Authors:
Chuanyi Liu; Yu Gu; Linchun Sun; Bin Yan; Dongsheng Wang
Affiliations:
Tsinghua University, Beijing, China (all authors)
Venue:
Proceedings of the 23rd international conference on Supercomputing
Year:
2009

Abstract

Data de-duplication has become a commodity component of data-intensive systems, and these systems are expected to provide reliability comparable to that of other storage systems. Unfortunately, by storing duplicate data chunks only once, a de-duplicated system improves storage utilization at the cost of error resilience, or reliability. This paper proposes R-ADMAD, a high-reliability provision mechanism. It packs variable-length data chunks into fixed-size objects, encodes the objects with error-correcting codes (ECC), and distributes them among the storage nodes of a redundancy group, which is generated dynamically according to current status and actual failure domains. Upon failures, R-ADMAD performs a distributed and dynamic recovery process. Experimental results show that R-ADMAD can provide the same storage utilization as RAID-like schemes while achieving reliability comparable to that of replication-based schemes, which use much more redundancy. The average recovery time of R-ADMAD-based configurations is about 2 to 6 times shorter than that of RAID-like schemes. Moreover, R-ADMAD can provide dynamic load balancing even without the involvement of the overloaded storage nodes.
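
The packing and encoding steps described in the abstract can be illustrated with a minimal sketch. The abstract does not say which error-correcting code R-ADMAD uses, so this example substitutes a single XOR parity object (RAID-5 style), which tolerates the loss of exactly one object. The names `pack_chunks`, `xor_parity`, `recover`, and the 16-byte `OBJECT_SIZE` are hypothetical and chosen only for illustration. Placement across storage nodes and dynamic redundancy-group formation are not modeled.

```python
from functools import reduce

OBJECT_SIZE = 16  # hypothetical fixed object size in bytes (tiny, for illustration)


def pack_chunks(chunks, object_size=OBJECT_SIZE):
    """Concatenate variable-length chunks and cut the stream into fixed-size objects.

    Returns (objects, index), where index[i] is the (offset, length) of
    chunk i in the concatenated stream. The last object is zero-padded.
    """
    stream = b"".join(chunks)
    index, offset = [], 0
    for chunk in chunks:
        index.append((offset, len(chunk)))
        offset += len(chunk)
    padded_len = -(-len(stream) // object_size) * object_size  # round up
    stream = stream.ljust(padded_len, b"\0")
    objects = [stream[i:i + object_size] for i in range(0, padded_len, object_size)]
    return objects, index


def xor_parity(objects):
    """XOR equal-sized objects byte by byte to form one parity object.

    Assumption: single XOR parity stands in for the unspecified ECC.
    """
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*objects))


def recover(surviving_objects, parity):
    """Rebuild the single missing object from the survivors and the parity."""
    return xor_parity(surviving_objects + [parity])


# Usage: pack four chunks, lose one object, and rebuild it from the rest.
chunks = [b"alpha", b"bravo-charlie", b"delta", b"echo-foxtrot-golf"]
objects, index = pack_chunks(chunks)
parity = xor_parity(objects)

lost = 1
survivors = objects[:lost] + objects[lost + 1:]
assert recover(survivors, parity) == objects[lost]
```

Fixed-size objects are what make the parity computation straightforward here: every object in the group has the same length, so the code can be applied across objects regardless of how the original chunk boundaries fall. A code that tolerates multiple failures would replace `xor_parity` and `recover` while leaving the packing step unchanged.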