Mapping Functions and Data Redistribution for Parallel Files
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Integrating collective I/O and cooperative caching into the "clusterfile" parallel file system
Proceedings of the 18th annual international conference on Supercomputing
Direct-pNFS: scalable, transparent, and versatile access to parallel file systems
Proceedings of the 16th international symposium on High performance distributed computing
A case study of parallel I/O for biological sequence search on Linux clusters
International Journal of High Performance Computing and Networking
Mapping functions and data redistribution for parallel files
The Journal of Supercomputing
Algorithms for memory hierarchies: advanced lectures
Algorithms for memory hierarchies: advanced lectures
A Scalable Message Passing Interface Implementation of an Ad-Hoc Parallel I/O System
International Journal of High Performance Computing Applications
A cost-intelligent application-specific data layout scheme for parallel file systems
Proceedings of the 20th international symposium on High performance distributed computing
Paradis-Net: a network interface for parallel and distributed
ICN'05 Proceedings of the 4th international conference on Networking - Volume Part II
This paper presents Clusterfile, a parallel file system that provides parallel file access on a cluster of computers. Existing parallel file systems offer little control over matching I/O access patterns to the file data layout. Without this matching, applications may face the following problems: contention at I/O nodes, fragmentation of file data, false sharing, small network messages, and high overhead for scattering and gathering data. Clusterfile addresses some of these inefficiencies. Parallel applications can physically partition a file in arbitrary patterns. They can also set arbitrary views on a file. Views hide the parallel structure of the file and ease the programmer's burden of computing complex access indices. The intersections between views and layouts are computed by a memory redistribution algorithm. Read and write operations are optimized by pre-computing the direct mapping between access patterns and disks. Clusterfile uses the same data representation for file layouts, access patterns, and the mappings between them.
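
The idea of pre-computing a mapping between an access pattern and the disks can be illustrated with a minimal Python sketch. It assumes a far simpler setting than the arbitrary patterns described above: each process's view is a regular strided pattern (every nprocs-th block of a fixed size), and the file is striped round-robin over the disks. All function names and parameters here (view_to_file, file_to_disk, precompute_mapping, rank, nprocs, block, ndisks, stripe) are hypothetical and are not taken from Clusterfile. The byte-by-byte loop is chosen for readability only and is not the redistribution algorithm of the paper.

```python
from collections import defaultdict


def view_to_file(view_off, rank, nprocs, block):
    """Map an offset in a process's strided view to a linear file offset.

    Assumption: process `rank` sees blocks rank, rank + nprocs, ... of the file.
    """
    blk, within = divmod(view_off, block)
    return (blk * nprocs + rank) * block + within


def file_to_disk(file_off, ndisks, stripe):
    """Map a linear file offset to (disk, offset on that disk).

    Assumption: the file is striped round-robin in units of `stripe` bytes.
    """
    unit, within = divmod(file_off, stripe)
    return unit % ndisks, (unit // ndisks) * stripe + within


def precompute_mapping(view_len, rank, nprocs, block, ndisks, stripe):
    """Group the bytes of a view by target disk.

    Returns {disk: [(view_off, disk_off, length), ...]}, where each tuple is a
    run that is contiguous both in the view and on the disk.
    """
    runs = defaultdict(list)
    for v in range(view_len):
        f = view_to_file(v, rank, nprocs, block)
        disk, d_off = file_to_disk(f, ndisks, stripe)
        last = runs[disk][-1] if runs[disk] else None
        if last and last[0] + last[2] == v and last[1] + last[2] == d_off:
            last[2] += 1  # extend the current run by one byte
        else:
            runs[disk].append([v, d_off, 1])  # start a new run
    return {d: [tuple(r) for r in rs] for d, rs in runs.items()}


# View and layout match: block size equals stripe size, one disk per process.
matched = precompute_mapping(8, rank=0, nprocs=2, block=4, ndisks=2, stripe=4)

# View and layout do not match: the stripe is half the view's block size.
mismatched = precompute_mapping(4, rank=0, nprocs=2, block=4, ndisks=2, stripe=2)
```

Under these assumptions, the matched case maps the whole 8-byte view of process 0 to a single contiguous run on disk 0, while the mismatched case splits a 4-byte view into two 2-byte runs on two different disks. This is a small instance of the fragmentation and small-message problems the abstract lists, and it shows why the mapping is worth computing once and reusing for every read and write.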