Data de-duplication has become a commodity component in data-intensive systems, and these systems are expected to provide reliability comparable to that of other storage systems. Unfortunately, by storing each duplicate data chunk only once, a de-duplicated system improves storage utilization at the cost of error resilience and reliability: the loss of a single chunk affects every file that references it. This paper proposes R-ADMAD, a mechanism for providing high reliability in de-duplicated storage. R-ADMAD packs variable-length data chunks into fixed-size objects, encodes the objects with error-correcting codes (ECC), and distributes them among the storage nodes of a redundancy group, which is generated dynamically according to the current system status and the actual failure domains. When failures occur, R-ADMAD uses a distributed and dynamic recovery process. Experimental results show that R-ADMAD provides the same storage utilization as RAID-like schemes, while its reliability is comparable to that of replication-based schemes, which require much more redundancy. The average recovery time of R-ADMAD-based configurations is about 2-6 times shorter than that of RAID-like schemes. Moreover, R-ADMAD can provide dynamic load balancing even without the involvement of the overloaded storage nodes.
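The packing and encoding steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify which ECC is used, so a single XOR parity object (which tolerates the loss of one object) stands in for it here. The names `pack_chunks`, `xor_parity`, `recover`, and `OBJECT_SIZE`, as well as the 16-byte object size, are hypothetical and chosen only for illustration.

```python
from functools import reduce

OBJECT_SIZE = 16  # hypothetical fixed object size in bytes (tiny, for illustration)


def pack_chunks(chunks, object_size=OBJECT_SIZE):
    """Concatenate variable-length chunks and cut the stream into fixed-size objects.

    Returns (objects, index). index[i] is the (offset, length) of chunk i in the
    concatenated stream, so each chunk can be located again after packing.
    """
    stream = bytearray()
    index = []
    for chunk in chunks:
        index.append((len(stream), len(chunk)))
        stream.extend(chunk)
    # Zero-pad the tail so that the last object is also full-sized.
    stream.extend(b"\x00" * ((-len(stream)) % object_size))
    objects = [bytes(stream[i:i + object_size])
               for i in range(0, len(stream), object_size)]
    return objects, index


def xor_parity(objects):
    """Byte-wise XOR of equal-length objects (assumed stand-in for the ECC)."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*objects))


def recover(surviving_objects, parity):
    """Rebuild the single missing object from the survivors and the parity."""
    return xor_parity(list(surviving_objects) + [parity])


if __name__ == "__main__":
    chunks = [b"alpha", b"be", b"gamma-delta", b"epsilon!"]
    objects, index = pack_chunks(chunks)
    parity = xor_parity(objects)
    # Simulate losing the first object, e.g. because its storage node failed.
    rebuilt = recover(objects[1:], parity)
    print(rebuilt == objects[0])
```

In a real deployment each object and the parity would be placed on a different storage node of the redundancy group, so that one node failure costs at most one object; a code that tolerates multiple failures would replace the XOR parity.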