An Efficient Data Location Protocol for Self.organizing Storage Clusters

  • Authors:
  • Hong Tang;Tao Yang

  • Affiliations:
  • University of California, Santa Barbara;University of California, Santa Barbara

  • Venue:
  • Proceedings of the 2003 ACM/IEEE conference on Supercomputing
  • Year:
  • 2003

Quantified Score

Hi-index 0.00

Visualization

Abstract

Component additions and failures are common for large-scale storage clusters in production environments. To improve availability and manageability, we investigate and compare data location schemes for a large self-organizing storage cluster that can quickly adapt to the additions or departures of storage nodes. We further present an efficient location scheme that differentiates between small and large file blocks for reduced management overhead compared to uniform strategies. In our protocol, small blocks, which are typically in large quantities, are placed through consistent hashing. Large blocks, much fewer in practice, are placed through a usage-based policy, and their locations are tracked by Bloom filters. The proposed scheme results in improved storage utilization even with non-uniform cluster nodes. To achieve high scalability and fault resilience, this protocol is fully distributed, relies only on soft states, and supports data replication. We demonstrate the effectiveness and efficiency of this protocol through trace-driven simulation.