High-performance computing can be well supported by Grid or cloud computing systems. However, these systems must cope with failure risks: data is stored on "unreliable" storage nodes that can leave the system at any moment and whose network bandwidth is limited. In this setting, the basic way to ensure data reliability is to add redundancy, using either replication or erasure codes. Compared to replication, erasure codes are more space efficient. Erasure codes break data into blocks, encode these blocks, and distribute them across different storage nodes. When storage nodes leave the system, permanently or temporarily, new redundant blocks must be created to maintain data reliability; this process is referred to as repair. Later, when nodes that left temporarily rejoin the system, the blocks stored on them can be reintegrated into the data group to improve data reliability. For "classical" erasure codes, generating a new block requires transferring k blocks over the network, which leads to heavy repair traffic, high computational complexity, and a high failure probability for the repair process. Hierarchical Codes, a near-optimal erasure code, have been proposed to significantly reduce repair traffic by reducing the number of nodes that participate in the repair process, referred to as the repair degree d. To overcome the complexity of reintegration and provide adaptive reliability for Hierarchical Codes, we refine two concepts, location and relocation, and then propose an integrated maintenance scheme for the repair process. Our experiments show that Hierarchical Codes are the most robust redundancy scheme for the repair process compared with other well-known coding schemes.
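To make the repair cost of a "classical" erasure code concrete, the following is a minimal sketch, not the scheme studied in the paper. It assumes the simplest possible code, a single XOR parity block over k data blocks, and the helper names (`xor_blocks`, `encode`, `repair`) are hypothetical. It illustrates that rebuilding one lost block requires fetching k surviving blocks, while the storage overhead stays below that of replication.

```python
from functools import reduce


def xor_blocks(blocks):
    """Bytewise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def encode(data: bytes, k: int):
    """Split data into k equal blocks (zero-padded) and append one XOR parity block."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    blocks = [padded[i * size:(i + 1) * size] for i in range(k)]
    return blocks + [xor_blocks(blocks)]


def repair(surviving: dict):
    """Rebuild the single missing block from all k surviving blocks.

    Returns the rebuilt block and the number of blocks that had to be
    transferred to the repairing node.
    """
    blocks = list(surviving.values())
    return xor_blocks(blocks), len(blocks)


# Illustrative parameters (assumptions, not values from the paper).
k = 4
blocks = encode(b"grid storage under churn", k)  # k data blocks + 1 parity block
lost = 2                                         # index of the node that left
surviving = {i: b for i, b in enumerate(blocks) if i != lost}

rebuilt, transferred = repair(surviving)

# Storage overhead to tolerate one failure: (k + 1) / k for this code,
# versus 2.0 for keeping two full replicas.
erasure_overhead = (k + 1) / k
replication_overhead = 2.0
```

In this toy code the repair degree equals k, because every surviving block is read. Schemes such as Hierarchical Codes aim to lower that number, at the price of a more involved code structure that this sketch does not model.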