Dynamic Metadata Management for Petabyte-Scale File Systems
Proceedings of the 2004 ACM/IEEE conference on Supercomputing
A comparison of file system workloads
ATEC '00 Proceedings of the annual conference on USENIX Annual Technical Conference
A five-year study of file-system metadata
ACM Transactions on Storage (TOS)
Ceph: a scalable, high-performance distributed file system
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
HBA: Distributed Metadata Management for Large Cluster-Based Storage Systems
IEEE Transactions on Parallel and Distributed Systems
Generating realistic impressions for file-system benchmarking
FAST '09 Proceedings of the 7th conference on File and storage technologies
An Empirical Analysis of Personal Digital Document Structures
Proceedings of the Symposium on Human Interface 2009 on Conference Universal Access in Human-Computer Interaction. Part I: Held as Part of HCI International 2009
MHS: A distributed metadata management strategy
Journal of Systems and Software
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
hashFS: Applying Hashing to Optimize File Systems for Small File Reads
SNAPI '10 Proceedings of the 2010 International Workshop on Storage Network Architecture and Parallel I/Os
Hierarchical file systems are dead
HotOS'09 Proceedings of the 12th conference on Hot topics in operating systems
The Hadoop Distributed File System
MSST '10 Proceedings of the 2010 IEEE 26th Symposium on Mass Storage Systems and Technologies (MSST)
Scale and concurrency of GIGA+: file system directories with millions of files
FAST'11 Proceedings of the 9th USENIX conference on File and storage technologies
Parallel I/O and the metadata wall
Proceedings of the sixth workshop on Parallel Data Storage
Just-in-Time Analytics on Large File Systems
IEEE Transactions on Computers
As file systems grow, metadata operations become one of the major bottlenecks constraining overall I/O performance. Previous analyses of I/O workloads show that file lookup makes up a large proportion of metadata operations. Existing lookup optimizations, such as the MHS method, employ a directory lookup table (DLT) to avoid directory traversal. However, the inefficient design of the DLT incurs high storage cost and rename overhead, which makes it unsuitable for large file systems. In this paper, we present a cost-effective file lookup service (CEFLS) for a distributed metadata file system. Our method uses an efficient partitioning method and efficient structures to increase the cache efficiency of the DLT. Extensive simulations show that, compared with MHS, CEFLS increases the percentage of cached directories by up to 305 and 279 percent when the cache size on each metadata server is configured as 1 GB and 2 GB, respectively. CEFLS also significantly reduces the average latency of both file lookup and directory rename operations.
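To make the trade-off concrete, the following is a minimal sketch of a path-keyed directory lookup table. It is not the MHS or CEFLS design, neither of which is specified in this abstract. The class name, the integer directory ids, and the flat path-to-id dictionary are all assumptions made for illustration. The sketch shows why such a table avoids per-component traversal on lookup, and why a directory rename can be costly: every entry under the renamed prefix has to be rewritten.

```python
class DirectoryLookupTable:
    """Toy DLT: maps a full directory path to a directory id.

    Hypothetical layout for illustration only; not taken from MHS or CEFLS.
    """

    def __init__(self):
        self.table = {"/": 0}   # full directory path -> directory id
        self.next_id = 1

    def mkdir(self, path):
        """Register a directory path and return its id."""
        path = path.rstrip("/") or "/"
        if path not in self.table:
            self.table[path] = self.next_id
            self.next_id += 1
        return self.table[path]

    def lookup(self, file_path):
        """Resolve a file with one table probe instead of walking each component."""
        parent, _, name = file_path.rpartition("/")
        parent = parent or "/"
        return self.table[parent], name

    def rename(self, old, new):
        """Rename a directory; every entry under the old prefix must be rewritten."""
        prefix = old + "/"
        moved = [p for p in self.table if p == old or p.startswith(prefix)]
        for p in moved:
            self.table[new + p[len(old):]] = self.table.pop(p)
        return len(moved)       # number of rewritten entries (the rename cost)


dlt = DirectoryLookupTable()
for d in ("/a", "/a/b", "/a/b/c"):
    dlt.mkdir(d)
dir_id, name = dlt.lookup("/a/b/c/file.txt")   # single probe on "/a/b/c"
rewritten = dlt.rename("/a", "/x")             # touches "/a", "/a/b", "/a/b/c"
```

In this toy layout, lookup cost does not depend on path depth, while rename cost grows with the number of directories under the renamed subtree, and the stored full paths repeat shared prefixes. These are the storage and rename overheads that the abstract attributes to existing DLT designs.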