Using server-to-server communication in parallel file systems to simplify consistency and improve performance

Authors:
Philip H. Carns; Bradley W. Settlemyer; Walter B. Ligon, III
Affiliations:
Argonne National Laboratory, Argonne, IL; Clemson University, Clemson, SC; Clemson University, Clemson, SC
Venue:
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Year:
2008

Abstract

The trend in parallel computing toward clusters running thousands of cooperating processes per application has led to an I/O bottleneck that has grown more severe as the CPU density of clusters has increased. Current parallel file systems provide large amounts of aggregate I/O bandwidth; however, they do not achieve the high degree of metadata scalability required to manage files distributed across hundreds or thousands of storage nodes. In this paper we examine the use of collective communication between the storage servers to improve the scalability of file metadata operations. In particular, we apply server-to-server communication to simplify consistency checking and improve the performance of file creation, file removal, and file stat. Our results indicate that collective communication is an effective scheme for simplifying consistency checks and for significantly improving performance on several real metadata-intensive workloads.
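
To make the idea of server-side collective communication concrete, here is a minimal Python sketch of a file stat over a striped file. It is an illustration under stated assumptions, not the paper's implementation: the function names are hypothetical, the binomial-tree reduction is one common collective pattern chosen for the example, and the file size is simplified to the sum of per-server byte counts.

```python
def client_driven_stat(server_sizes):
    """Baseline: the client queries every server for its local byte count."""
    client_messages = len(server_sizes)  # one request per storage server
    return sum(server_sizes), client_messages


def collective_stat(server_sizes):
    """Assumed alternative: the client contacts one server, and the servers
    combine their local byte counts with a binomial-tree reduction."""
    partial = list(server_sizes)  # partial[i] is the running sum held by server i
    n = len(partial)
    rounds = 0
    step = 1
    while step < n:
        # In each round, server (receiver + step) sends its partial sum
        # to server receiver; all pairs in a round can proceed in parallel.
        for receiver in range(0, n, 2 * step):
            sender = receiver + step
            if sender < n:
                partial[receiver] += partial[sender]
        step *= 2
        rounds += 1
    client_messages = 1  # the client only talks to server 0
    return partial[0], client_messages, rounds


if __name__ == "__main__":
    sizes = [4096] * 8  # hypothetical local byte counts on 8 servers
    print(client_driven_stat(sizes))
    print(collective_stat(sizes))
```

In this toy model the client's message count drops from one per server to a single request, while the servers finish the reduction in a number of rounds that grows logarithmically with the server count.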