In this paper we study the problem of designing an adaptive hash table for redundant data storage in a system of storage devices with arbitrary capacities. Ideally, such a hash table should ensure that (a) a storage device with x% of the available capacity gets x% of the data, (b) the copies of each data item are distributed among the storage devices so that no two copies are stored on the same device, and (c) only a near-minimum number of data replacements is necessary to preserve (a) and (b) under any change in the system. Hash tables satisfying (a) and (c) are already known, and it is not difficult to construct hash tables satisfying (a) and (b). However, no hash table known so far satisfies all three properties whenever this is in principle possible. We present a strategy called SPREAD that solves this problem for the first time. As long as (a) and (b) can in principle be satisfied, SPREAD preserves (a) for every storage device within a (1 ± ε) factor, with high probability, where ε > 0 can be made arbitrarily small; it guarantees (b) for every data item; and it needs only a constant factor more data replacements than the minimum possible in order to preserve (a) and (b).
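To make properties (a) to (c) concrete, the following is a minimal Python sketch of a much simpler baseline: weighted rendezvous hashing that keeps the k best-scoring devices for each item. It is not SPREAD, and all names in it (`feasible`, `place`, `_unit_hash`, the device labels) are hypothetical. The sketch guarantees (b) by construction and leaves an item's placement untouched when a device holding none of its copies is removed, but it is capacity-proportional only for the first copy, which is the gap the abstract describes. The `feasible` helper encodes the counting argument behind "as long as (a) and (b) can in principle be satisfied": a device can hold at most one copy per item, so no device may own more than a 1/k share of the total capacity.

```python
import hashlib
import math


def feasible(capacities, k):
    """Return True if (a) and (b) can hold together for k copies per item.

    A device holds at most one copy of each item, so its fair share of all
    copies, k * capacity / total, must not exceed 1.
    """
    total = sum(capacities.values())
    return all(k * c <= total for c in capacities.values())


def _unit_hash(item, device):
    """Deterministic pseudo-random float strictly inside (0, 1)."""
    digest = hashlib.sha256(f"{item}|{device}".encode()).digest()
    bits = int.from_bytes(digest[:8], "big") >> 12  # keep the top 52 bits
    return (bits + 0.5) / 2**52


def place(item, capacities, k):
    """Pick k distinct devices for `item` by weighted rendezvous hashing.

    Illustrative baseline only: the first copy is capacity-proportional,
    the remaining copies are not guaranteed to be.
    """
    if k > len(capacities):
        raise ValueError("need at least k devices to keep copies apart")
    # Score each device with -capacity / ln(u); larger capacity, higher score.
    scores = {
        dev: -cap / math.log(_unit_hash(item, dev))
        for dev, cap in capacities.items()
    }
    return sorted(scores, key=scores.get, reverse=True)[:k]


if __name__ == "__main__":
    caps = {"d1": 4.0, "d2": 2.0, "d3": 1.0, "d4": 1.0}
    print(feasible(caps, 2), feasible(caps, 3))
    print(place("item-42", caps, 2))
```

With the capacities above, two copies per item are feasible because the largest device owns exactly half of the total capacity, while three copies are not. In this baseline, a device that large would still be skipped for some items, so it would end up underloaded relative to (a).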