Using MPI file caching to improve parallel write performance for large-scale scientific applications

  • Authors:
  • Wei-keng Liao; Avery Ching; Kenin Coloma; Arifa Nisar; Alok Choudhary; Jacqueline Chen; Ramanan Sankaran; Scott Klasky

  • Affiliations:
  • Northwestern University, Evanston, Illinois (Liao, Ching, Coloma, Nisar, Choudhary); Sandia National Laboratories, Livermore, California (Chen); Oak Ridge National Laboratory, Oak Ridge, Tennessee (Sankaran, Klasky)

  • Venue:
  • Proceedings of the 2007 ACM/IEEE conference on Supercomputing
  • Year:
  • 2007


Abstract

Typical large-scale scientific applications periodically write checkpoint files to save the computational state throughout execution. Existing parallel file systems improve such write-only I/O patterns through the use of client-side file caching and write-behind strategies. In distributed environments where files are rarely accessed by more than one client concurrently, file caching has achieved significant success; however, in parallel applications where multiple clients manipulate a shared file, cache coherence control can serialize I/O. We have designed a thread-based caching layer for the MPI I/O library, which adds a portable caching system closer to user applications so that more information about the application's I/O patterns is available for better coherence control. We demonstrate the impact of our caching solution on parallel write performance with a comprehensive evaluation that includes a set of widely used I/O benchmarks and production application I/O kernels.