Supporting Load Balancing and Efficient Reorganization During System Scaling

  • Authors:
  • Feng Zhu; Xiaowei Sun; Betty Salzberg; Svein-Olaf Hvasshovd

  • Affiliations:
  • Northeastern University, Boston, MA; Northeastern University, Boston, MA; Northeastern University, Boston, MA; Norwegian Institute of Science and Technology, Trondheim, Norway

  • Venue:
  • IPDPS '05 Proceedings of the 19th IEEE International Parallel and Distributed Processing Symposium (IPDPS'05) - Papers - Volume 01
  • Year:
  • 2005

Abstract

Reorganization is constantly necessary to maintain load balance as distributed storage systems scale up and down. To support load balancing and efficient reorganization during system scaling, we propose a new hashing method called Prime Based Hashing (PBH) that can be used for data allocation in large distributed systems. PBH distributes objects among storage units based on residues (congruence) of hash-transformed key values modulo prime numbers. PBH provides nearly perfect load balancing: it distributes objects evenly and rebalances to preserve the even distribution as the system scales. At the same time, it facilitates cost-effective reorganization by minimizing data migration during system scaling. Locating an object in PBH is fast, using low-complexity computations that require only knowledge of the total number of storage units. We also propose a local data clustering method that is coupled with PBH to make reorganization more efficient. Objects are clustered according to the order of migration, so that only the part of the data that needs to be migrated is scanned. In addition, we show that by storing a small amount of pre-computed information, the ordering of objects for clustering can be made very efficient. We demonstrate the effectiveness of our algorithms through analysis and experiments.