The main goal of parallel I/O is to increase I/O parallelism by providing multiple, independent data channels between processors and disks. To realize this goal, I/O streams need to be parallelized and partitioned at multiple system layers. Contention at any level can dramatically decrease performance and limit scalability. To address the disk contention bottleneck, it is important to carefully study disk access patterns.

From our previous work on I/O profiling, we found that the I/O access patterns of parallel scientific applications are usually very regular and highly predictable. Thus it is possible to detect I/O access patterns statically at compile time. Large datasets are logically linearized in file space on disk, and these intensive data accesses follow a linear space traversal. In this paper, we present our recent work on compiler-directed I/O partitioning, based on Linear Disk Access Descriptors (LDADs). We use the SUIF compiler infrastructure to perform data-flow analysis and recognize LDADs. We then use these LDADs to guide an I/O data partitioning that utilizes multiple disks to significantly increase I/O throughput.
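To make the idea of a linear access pattern guiding partitioning concrete, the following is a minimal sketch, not the paper's method. The abstract does not define the fields of an LDAD, so the `LinearAccess` descriptor (start offset, stride, access length, count), the `bytes_per_disk` helper, and the round-robin striping scheme are all assumptions introduced for illustration. The sketch shows how a regular strided pattern, once known, can be checked against a candidate stripe size to see whether the load spreads across disks or piles up on a few of them.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class LinearAccess:
    """Hypothetical descriptor (assumed, not the paper's LDAD definition):
    `count` accesses of `length` bytes, beginning at byte `start`,
    with consecutive accesses `stride` bytes apart."""
    start: int
    stride: int
    length: int
    count: int

    def extents(self):
        # Yield the half-open byte range [lo, hi) of each access.
        for i in range(self.count):
            lo = self.start + i * self.stride
            yield lo, lo + self.length


def bytes_per_disk(access, num_disks, stripe_size):
    """Bytes served by each disk when the file is striped round-robin
    (stripe k lives on disk k % num_disks). Assumed placement scheme."""
    load = defaultdict(int)
    for lo, hi in access.extents():
        pos = lo
        while pos < hi:
            stripe = pos // stripe_size
            # Clip the access at the end of the current stripe.
            end = min(hi, (stripe + 1) * stripe_size)
            load[stripe % num_disks] += end - pos
            pos = end
    return [load[d] for d in range(num_disks)]


if __name__ == "__main__":
    pattern = LinearAccess(start=0, stride=4096, length=1024, count=8)
    print(bytes_per_disk(pattern, num_disks=4, stripe_size=4096))
    print(bytes_per_disk(pattern, num_disks=4, stripe_size=16384))
```

With the stripe size equal to the stride, the eight accesses in this example spread evenly over the four disks; with a stripe four times larger, the same pattern touches only two disks. A compile-time descriptor of this kind is what would let a partitioner pick the former layout before the program runs.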