An efficient and distributed scheme for file mapping, or file lookup, is critical to decentralizing metadata management across a group of metadata servers. This paper presents a novel technique called HBA (Hierarchical Bloom filter Arrays) to map filenames to the metadata servers holding their metadata. Two levels of probabilistic arrays, namely Bloom filter arrays with different levels of accuracy, are used on each metadata server. One array, with lower accuracy, represents the distribution of the entire metadata and trades accuracy for significantly reduced memory overhead, while the other, with higher accuracy, caches partial distribution information and exploits the temporal locality of file access patterns. Both arrays are replicated to all metadata servers to support fast local lookups. We evaluate HBA through extensive trace-driven simulations and an implementation in Linux. Simulation results show the HBA design to be highly effective and efficient in improving the performance and scalability of file systems in clusters with 1,000 to 10,000 nodes (or superclusters) and with data volumes at the petabyte scale or higher. Our implementation indicates that HBA can reduce the metadata operation time of a single-metadata-server architecture by a factor of up to 43.9 when the system is configured with 16 metadata servers.
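The two-level lookup described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the paper's implementation: the `BloomFilter` and `HBALookup` classes, their sizes, and hash choices are assumptions made for clarity. A lookup first consults a small, high-accuracy array that caches recently accessed files, then falls back to a coarse array covering the whole namespace; a miss at both levels would force a broadcast to every metadata server.

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: m bits, k salted SHA-256 hashes (illustrative only)."""
    def __init__(self, m=1024, k=3):
        self.m, self.k = m, k
        self.bits = 0  # bit array packed into a single int

    def _indexes(self, key):
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key):
        for idx in self._indexes(key):
            self.bits |= 1 << idx

    def __contains__(self, key):
        # May report false positives, never false negatives.
        return all((self.bits >> idx) & 1 for idx in self._indexes(key))

class HBALookup:
    """Sketch of a two-level Bloom filter array (assumed structure):
    one high-accuracy array caching hot files, one low-accuracy array
    covering the entire metadata distribution."""
    def __init__(self, num_servers):
        # Level 1: more bits per filter -> higher accuracy, partial coverage.
        self.hot = [BloomFilter(m=8192, k=5) for _ in range(num_servers)]
        # Level 2: fewer bits -> lower accuracy, but covers all metadata.
        self.full = [BloomFilter(m=1024, k=3) for _ in range(num_servers)]

    def record(self, filename, server, recent=False):
        self.full[server].add(filename)
        if recent:
            self.hot[server].add(filename)

    def lookup(self, filename):
        # Check the accurate cached array first, then the coarse array.
        # Any hit is only a candidate server (false positives possible).
        for level in (self.hot, self.full):
            hits = [s for s, bf in enumerate(level) if filename in bf]
            if hits:
                return hits
        # Miss at both levels: fall back to broadcasting to all servers.
        return list(range(len(self.full)))
```

Replicating both arrays to every server, as the abstract notes, makes this lookup purely local: no network round trip is needed unless both levels miss or a false positive sends the request to the wrong server.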