Cashing in on hints for better prefetching and caching in PVFS and MPI-IO

  • Authors:
  • Christina M. Patrick (Pennsylvania State University); Mahmut Kandemir (Pennsylvania State University); Mustafa Karaköy (Imperial College); Seung Woo Son (Argonne National Laboratory); Alok Choudhary (Northwestern University)

  • Venue:
  • Proceedings of the 19th ACM International Symposium on High Performance Distributed Computing
  • Year:
  • 2010


Abstract

In this work, we propose, implement, and test a novel approach to the management of parallel I/O in high-performance computing. Our proposed approach is built upon three complementary ideas: (i) allowing users to place hints into the application code indicating high-level data access patterns, (ii) enabling an optimizing compiler to process these hints and develop I/O optimization strategies, and (iii) enhancing the I/O stack to accept these optimizations and process them across the different layers in the stack. We describe a general hint processing framework that accommodates this approach and demonstrate its potential by applying it to two sample problems: (i) shared storage cache management and (ii) I/O prefetching. In the former, our approach decides, at each program point of interest, the ideal set of data blocks to keep in the shared storage caches in the I/O stack; in the latter, the high-level data access pattern is propagated from the application layer to the parallel file system layer for prefetching data from the storage subsystem. Our approach is designed to complement and work synergistically with the MPI-IO and PVFS frameworks, and it exploits the characteristics of applications written using this software. We tested our approach using both synthetic data access patterns and disk-I/O-intensive application programs. The results collected indicate that the proposed approach improves over existing storage caching and I/O prefetching schemes by 28% and 66%, respectively.
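To make the hint-passing idea concrete, the sketch below shows how a high-level access-pattern hint could be handed to the I/O stack through MPI-IO's standard MPI_Info mechanism. This is only an illustration of the general mechanism, not the paper's actual interface: the hint key "access_pattern" and its value string are hypothetical, and the paper's framework instead derives and propagates such information via compiler-processed source-level hints.

```c
/* Minimal sketch (assumed, not the paper's interface): passing a
 * high-level access-pattern hint to MPI-IO via MPI_Info. The key
 * "access_pattern" and its value format are hypothetical; a real
 * deployment would use whatever keys the underlying MPI-IO/PVFS
 * layers recognize. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    MPI_Info info;
    double buf[1024];

    MPI_Init(&argc, &argv);

    MPI_Info_create(&info);
    /* Hypothetical hint describing an upcoming strided read pattern,
     * so lower layers could prefetch and cache the right blocks. */
    MPI_Info_set(info, "access_pattern", "strided:stride=4MB,count=256");

    /* The info object is attached at file-open time and is visible to
     * the MPI-IO implementation and, potentially, the file system. */
    MPI_File_open(MPI_COMM_WORLD, "datafile",
                  MPI_MODE_RDONLY, info, &fh);

    /* Collective read; the I/O stack can act on the hint supplied above. */
    MPI_File_read_at_all(fh, 0, buf, 1024, MPI_DOUBLE, MPI_STATUS_IGNORE);

    MPI_File_close(&fh);
    MPI_Info_free(&info);
    MPI_Finalize();
    return 0;
}
```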