To increase the scale and performance of high-performance computing (HPC) applications, computation is commonly distributed across multiple processors. Often, file I/O is parallelized along with the computation, sometimes without the developer realizing it. One implication is that multiple compute tasks are likely to access the I/O nodes of an HPC system concurrently. When a large number of I/O streams access an I/O node concurrently, I/O performance tends to degrade, increasing application execution time. This paper presents experimental results showing that controlling the number of file-I/O streams that concurrently access an I/O node can improve application performance; we call this mechanism file-I/O stream throttling. The paper (1) describes the mechanism and demonstrates how it can be implemented at either the application or system software layer, and (2) presents results of experiments driven by the cosmology application benchmark MADbench, executed on a variety of computing systems, that demonstrate the effectiveness of file-I/O stream throttling. The I/O pattern of MADbench resembles that of a large class of HPC applications.
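The application-layer variant of the throttling idea can be sketched with a counting semaphore that caps how many tasks perform file I/O at once. This is only an illustrative sketch, not the paper's implementation: the names `MAX_STREAMS` and `throttled_write` are hypothetical, and real HPC deployments would apply the cap per I/O node (e.g. via MPI-IO or the parallel file system) rather than per process with threads.

```python
import os
import tempfile
import threading

# Hypothetical cap on concurrent file-I/O streams (an assumption for
# this sketch; the paper tunes this per I/O node and per system).
MAX_STREAMS = 4
_io_gate = threading.BoundedSemaphore(MAX_STREAMS)

def throttled_write(path, data):
    """Write `data` to `path`, but only while holding an I/O slot.

    Tasks beyond MAX_STREAMS block here until a slot frees up, so at
    most MAX_STREAMS streams hit the storage layer concurrently.
    """
    with _io_gate:
        with open(path, "wb") as f:
            f.write(data)

def run_demo(n_tasks=16):
    """Launch n_tasks writer threads; only MAX_STREAMS run I/O at once."""
    tmpdir = tempfile.mkdtemp()
    threads = [
        threading.Thread(
            target=throttled_write,
            args=(os.path.join(tmpdir, f"out{i}.dat"), b"x" * 1024),
        )
        for i in range(n_tasks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(os.listdir(tmpdir))
```

The same gating pattern could instead live in system software (e.g. the I/O forwarding layer), which is the second placement the paper evaluates; the application code then stays unchanged while the node-side scheduler admits a bounded number of streams.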