Source level transformations to improve I/O data partitioning

  • Authors:
  • Yijian Wang; David Kaeli

  • Affiliations:
  • Northeastern University, Boston, MA; Northeastern University, Boston, MA

  • Venue:
  • SNAPI '03 Proceedings of the international workshop on Storage network architecture and parallel I/Os
  • Year:
  • 2003

Abstract

The main goal of parallel I/O is to increase I/O parallelism by providing multiple, independent data channels between processors and disks. To realize this goal, I/O streams need to be parallelized and partitioned at multiple system layers. Contention at any level can dramatically decrease performance and limit scalability. To address contention at the disk level, it is important to carefully study disk access patterns. From our previous work on I/O profiling, we found that the I/O access patterns of parallel scientific applications are usually very regular and highly predictable. It is therefore possible to detect I/O access patterns statically at compile time. Large datasets are logically linearized in file space on disk, and the data-intensive accesses to them follow a linear traversal of that space. In this paper, we present our recent work on compiler-directed I/O partitioning, based on Linear Disk Access Descriptors (LDADs). We use the SUIF compiler infrastructure to perform data-flow analysis and recognize LDADs. We then use these LDADs to guide an I/O data partitioning that utilizes multiple disks to significantly increase I/O throughput.
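
The abstract does not define the LDAD format or the partitioning algorithm, so the following is only a rough sketch of the general idea: a regular, linear access pattern can be summarized by a few integers and then used to spread data over several disks. The `LinearAccess` class, its fields (`start`, `stride`, `length`, `count`), and the round-robin policy in `partition_round_robin` are all hypothetical names and choices made for illustration; they are not taken from the paper or from SUIF.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearAccess:
    """Hypothetical descriptor of a regular access pattern (not the paper's LDAD).

    Describes `count` accesses of `length` bytes each, the first at file
    offset `start`, with consecutive accesses `stride` bytes apart.
    """
    start: int
    stride: int
    length: int
    count: int

    def offsets(self):
        # File offset of every access described by this pattern.
        return [self.start + i * self.stride for i in range(self.count)]


def partition_round_robin(desc, num_disks):
    """Assign the i-th access to disk i % num_disks (an assumed policy)."""
    placement = defaultdict(list)
    for i, offset in enumerate(desc.offsets()):
        placement[i % num_disks].append((offset, desc.length))
    return dict(placement)


# Eight 1 KiB accesses, 4 KiB apart, spread over four disks.
desc = LinearAccess(start=0, stride=4096, length=1024, count=8)
layout = partition_round_robin(desc, num_disks=4)
for disk, chunks in sorted(layout.items()):
    print(disk, chunks)
```

Because the pattern is known before the program runs, consecutive accesses can be placed on different disks ahead of time, which is the kind of static knowledge the abstract says a compiler can extract. Round-robin is only one possible placement; the paper may use a different scheme.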