Reorganization is continually needed to maintain load balance as distributed storage systems scale up and down. To support load balancing and efficient reorganization during system scaling, we propose a new hashing method called Prime Based Hashing (PBH) that can be used for data allocation in large distributed systems. PBH distributes objects among storage units based on the residues (congruence classes) of hash-transformed key values modulo prime numbers. PBH provides nearly perfect load balancing: it distributes objects evenly and rebalances to preserve that even distribution as the system scales. At the same time, it keeps reorganization cost-effective by minimizing data migration during system scaling. Locating an object in PBH is fast: it takes a low-complexity computation that requires knowing only the total number of storage units. We also propose a local data clustering method that pairs with PBH to make reorganization more efficient. Objects are clustered according to their order of migration, so that only the part of the data that needs to be migrated is scanned. In addition, we show that storing a small amount of pre-computed information makes the ordering of objects for clustering very efficient. We demonstrate the effectiveness of our algorithms through analysis and experiments.
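To make the migration problem concrete, the following minimal sketch measures how many objects change storage units when a system grows from 10 to 11 units under plain modulo placement. It does not implement PBH, whose details are not given in this abstract; it only shows the baseline behavior that a migration-minimizing scheme aims to improve on. The helper names (`hash_key`, `naive_unit`, `migration_fraction`, `ideal_fraction`) and the choice of SHA-256 as the hash transform are assumptions made for illustration.

```python
import hashlib


def hash_key(key: str) -> int:
    """Hash-transform a key into a 64-bit integer (SHA-256 is an assumed choice)."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def naive_unit(key: str, n_units: int) -> int:
    """Baseline placement: residue of the hashed key modulo the unit count."""
    return hash_key(key) % n_units


def migration_fraction(keys, n_before: int, n_after: int) -> float:
    """Fraction of keys whose unit changes when the unit count changes."""
    moved = sum(
        1 for k in keys if naive_unit(k, n_before) != naive_unit(k, n_after)
    )
    return moved / len(keys)


def ideal_fraction(n_before: int, n_after: int) -> float:
    """Least fraction that must move to keep the load even when growing:
    only the share owned by the newly added units."""
    return (n_after - n_before) / n_after


keys = [f"obj-{i}" for i in range(20000)]
naive = migration_fraction(keys, 10, 11)
ideal = ideal_fraction(10, 11)
print(f"naive modulo moves {naive:.3f} of objects; balanced minimum is {ideal:.3f}")
```

With a uniform hash, plain modulo placement keeps an object in place only when its hash has the same residue modulo 10 and modulo 11, which happens for about 1 in 11 objects, so roughly 10/11 of the data moves. An evenly balanced scheme needs to move only the 1/11 share that belongs on the new unit; that gap is the cost a migration-minimizing method targets.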