A growing number of applications require file systems to maintain millions of files or more efficiently. Providing high access performance with such a huge number of files and such large directories is a major challenge for cluster file systems. Limited by static directory structures, existing file systems become prohibitively inefficient for this use. To address this problem, we present a scalable and adaptive metadata management system that aims to maintain a trillion files efficiently. First, our system uses adaptive two-level directory partitioning based on extendible hashing to manage very large directories. Second, it uses fine-grained parallel processing within a directory, which greatly improves the performance of file creation and deletion. Third, it uses multi-layered metadata cache management, which improves memory utilization on the servers. Finally, it uses a dynamic load-balancing mechanism based on consistent hashing, which enables the system to scale up and down easily. Our performance results on 32 metadata servers show that our user-level prototype implementation can create more than 74 thousand files per second and retrieve the attributes of more than 270 thousand files per second in a single directory with 100 million files. Moreover, it delivers a peak throughput of more than 60 thousand file creates per second in a single directory with 1 billion files.
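To illustrate why consistent hashing lets a set of metadata servers grow or shrink cheaply, the following is a minimal sketch and not the system's actual implementation. The abstract does not say which hash function is used, whether virtual nodes are used, or what unit is placed on the ring; the MD5 hash, the 64 virtual nodes per server, the `ConsistentHashRing` class, the `mds0`..`mds4` server names, and the `dir42/partitionN` keys are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    """Map a string to a 64-bit ring position (MD5 is an arbitrary, assumed choice)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class ConsistentHashRing:
    """Toy consistent-hash ring with virtual nodes (hypothetical, for illustration only)."""

    def __init__(self, vnodes: int = 64):
        self.vnodes = vnodes
        self._points = []  # sorted ring positions
        self._owner = {}   # ring position -> server name

    def add_server(self, server: str) -> None:
        # Each server is placed at several pseudo-random positions on the ring.
        for i in range(self.vnodes):
            point = _hash(f"{server}#{i}")
            if point in self._owner:
                continue  # skip the (extremely unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = server

    def remove_server(self, server: str) -> None:
        self._points = [p for p in self._points if self._owner[p] != server]
        self._owner = {p: s for p, s in self._owner.items() if s != server}

    def lookup(self, key: str) -> str:
        # A key belongs to the first server position clockwise from its hash.
        if not self._points:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._owner[self._points[idx]]


ring = ConsistentHashRing()
for name in ("mds0", "mds1", "mds2", "mds3"):
    ring.add_server(name)

keys = [f"dir42/partition{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}

ring.add_server("mds4")  # scale up by one server
after = {k: ring.lookup(k) for k in keys}
moved = [k for k in keys if before[k] != after[k]]
print(f"{len(moved)} of {len(keys)} keys moved, all to the new server")
```

In this sketch, adding a server only reassigns the keys that fall just before the new server's ring positions, and removing it hands exactly those keys back to their previous owners. Every other key keeps its placement, which is the property that makes scaling up and down inexpensive.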