Hiding I/O latency with pre-execution prefetching for parallel applications

  • Authors:
  • Yong Chen;Surendra Byna;Xian-He Sun;Rajeev Thakur;William Gropp

  • Affiliations:
  • Illinois Institute of Technology, Chicago, IL;Illinois Institute of Technology, Chicago, IL;Illinois Institute of Technology, Chicago, IL;Argonne National Laboratory, Argonne, IL;University of Illinois Urbana-Champaign, Urbana, IL

  • Venue:
  • Proceedings of the 2008 ACM/IEEE conference on Supercomputing
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

Parallel applications are usually able to achieve high computational performance but suffer from large latency in I/O accesses. I/O prefetching is an effective solution for masking the latency. Most of existing I/O prefetching techniques, however, are conservative and their effectiveness is limited by low accuracy and coverage. As the processor-I/O performance gap has been increasing rapidly, data-access delay has become a dominant performance bottleneck. We argue that it is time to revisit the "I/O wall" problem and trade the excessive computing power with data-access speed. We propose a novel pre-execution approach for masking I/O latency. We describe the pre-execution I/O prefetching framework, the pre-execution thread construction methodology, the underlying library support, and the prototype implementation in the ROMIO MPI-IO implementation in MPICH2. Preliminary experiments show that the pre-execution approach is promising in reducing I/O access latency and has real potential.