Hierarchical Bloom filter arrays (HBA): a novel, scalable metadata management system for large cluster-based storage

  • Authors:
  • Yifeng Zhu; Hong Jiang; J. Wang

  • Affiliations:
  • Dept. of Comput. Sci. & Eng., Nebraska Univ., Lincoln, NE, USA; Dept. of Comput. Sci. & Eng., Nebraska Univ., Lincoln, NE, USA; Dept. of Comput. Sci. & Eng., Nebraska Univ., Lincoln, NE, USA

  • Venue:
  • CLUSTER '04 Proceedings of the 2004 IEEE International Conference on Cluster Computing
  • Year:
  • 2004

Abstract

An efficient, distributed scheme for file mapping (or file lookup) is critical to decentralizing metadata management within a group of metadata servers. This work presents a technique called HBA (hierarchical Bloom filter arrays) to map file names to the servers holding their metadata. Each metadata server uses two levels of probabilistic arrays, i.e., Bloom filter arrays, with different accuracies. One array, with lower accuracy, represents the distribution of the entire metadata and trades accuracy for significantly reduced memory overhead. The other array, with higher accuracy, caches partial distribution information and exploits the temporal locality of file access patterns. Extensive trace-driven simulations have shown our HBA design to be highly effective and efficient in improving the performance and scalability of file systems in clusters with 1,000 to 10,000 nodes (or superclusters).
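
The two-level idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class and function names (`BloomFilter`, `BloomFilterArray`, `lookup`), the SHA-256 double-hashing scheme, the filter sizes, and the lookup order (high-accuracy array first, then the low-accuracy array, then a fallback such as querying all servers) are all assumptions made for illustration. Each array holds one Bloom filter per metadata server, and a lookup is treated as resolved only when exactly one filter reports a hit.

```python
import hashlib


class BloomFilter:
    """A plain Bloom filter over string keys (illustrative only)."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Assumption: derive k bit positions by double hashing a SHA-256 digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1  # force an odd step
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, key):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))


class BloomFilterArray:
    """One Bloom filter per metadata server."""

    def __init__(self, num_servers, num_bits, num_hashes):
        self.filters = [BloomFilter(num_bits, num_hashes) for _ in range(num_servers)]

    def add(self, server_id, filename):
        self.filters[server_id].add(filename)

    def candidates(self, filename):
        # Servers whose filter reports a (possibly false-positive) hit.
        return [sid for sid, f in enumerate(self.filters) if filename in f]


def lookup(filename, hot_array, global_array):
    """Return (server_id, level); server_id is None if neither level resolves it.

    Assumed policy: try the small high-accuracy array of recently used files
    first, then the low-accuracy array covering all metadata. A level resolves
    the lookup only when exactly one server's filter matches.
    """
    for level, array in (("hot", hot_array), ("global", global_array)):
        hits = array.candidates(filename)
        if len(hits) == 1:
            return hits[0], level
    return None, "fallback"  # e.g. ask every metadata server directly


# Hypothetical sizing: the hot array spends more bits and hashes per filter.
NUM_SERVERS = 4
global_array = BloomFilterArray(NUM_SERVERS, num_bits=1024, num_hashes=4)
hot_array = BloomFilterArray(NUM_SERVERS, num_bits=4096, num_hashes=8)

global_array.add(2, "/home/alice/data.bin")
print(lookup("/home/alice/data.bin", hot_array, global_array))
```

In this sketch the hot array would be refreshed as files are accessed, so that repeated lookups for the same files are resolved by the more accurate filters, while the global array stays compact enough to cover every file in the system.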