To increase the scale and performance of high-performance computing (HPC) applications, computation is commonly distributed across multiple processors. Often, file I/O is parallelized along with the computation, sometimes without the developer realizing it. One implication is that multiple compute tasks are likely to access the I/O nodes of an HPC system concurrently. When a large number of I/O streams access an I/O node concurrently, I/O performance tends to degrade, increasing application execution time. This paper presents experimental results showing that controlling the number of file-I/O streams that concurrently access an I/O node can improve application performance; we call this mechanism file-I/O stream throttling. The paper (1) describes the mechanism and demonstrates how it can be implemented at either the application or system software layer, and (2) presents results of experiments driven by the cosmology application benchmark MADbench, executed on a variety of computing systems, that demonstrate the effectiveness of file-I/O stream throttling. The I/O pattern of MADbench resembles that of a large class of HPC applications.
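The application-layer variant of the throttling idea can be sketched with a counting semaphore that caps how many tasks perform file I/O at once. This is only an illustrative sketch, not the paper's implementation: the names `MAX_STREAMS` and `throttled_write` are hypothetical, and real HPC deployments would apply the cap per I/O node (e.g. via MPI-IO or the parallel file system) rather than per process with threads.

```python
import os
import tempfile
import threading

# Hypothetical cap on concurrent file-I/O streams (an assumption for
# this sketch; the paper tunes this per I/O node and per system).
MAX_STREAMS = 4
_io_gate = threading.BoundedSemaphore(MAX_STREAMS)

def throttled_write(path, data):
    """Write `data` to `path`, but only while holding an I/O slot.

    Tasks beyond MAX_STREAMS block here until a slot frees up, so at
    most MAX_STREAMS streams hit the storage layer concurrently.
    """
    with _io_gate:
        with open(path, "wb") as f:
            f.write(data)

def run_demo(n_tasks=16):
    """Launch n_tasks writer threads; only MAX_STREAMS run I/O at once."""
    tmpdir = tempfile.mkdtemp()
    threads = [
        threading.Thread(
            target=throttled_write,
            args=(os.path.join(tmpdir, f"out{i}.dat"), b"x" * 1024),
        )
        for i in range(n_tasks)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(os.listdir(tmpdir))
```

The same gating pattern could instead live in system software (e.g. the I/O forwarding layer), which is the second placement the paper evaluates; the application code then stays unchanged while the node-side scheduler admits a bounded number of streams.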