In high-performance computing there is a growing need to process large, complex datasets. Many of these applications are file-intensive workloads that perform a large number of reads from and writes to a small number of files. When these workloads run on cluster-based systems, performance does not scale simply by increasing the number of compute nodes. To exploit parallel resources effectively, file I/O itself must be parallelized. The potential impact of parallel I/O grows as the gap between CPU and disk speeds continues to widen.

While parallel I/O middleware systems (e.g., MPI I/O) provide users with environments where large datasets can be shared among multiple distributed processes, the performance of file-intensive applications depends heavily on how the data is accessed and where the data is physically located on disk. I/O operations need to be parallelized both at the application level (using middleware) and at the disk level (using partitioning).

In this paper, we present a new profile-guided greedy partitioning algorithm to parallelize I/O access for file-intensive applications run on cluster-based systems. We use MPI and MPI I/O to provide parallelization at the application level. We use I/O profiling to capture relevant information about the I/O stream, and then use these profiles to guide file partitioning across multiple disks to significantly improve I/O throughput.
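To make the idea of profile-guided greedy partitioning concrete, the following is a minimal sketch and not the algorithm from the paper, whose details the abstract does not give. It assumes a hypothetical profile format of `(process_rank, block_id)` access records and a simple heuristic: place the most frequently accessed blocks first, each on the disk with the lowest accumulated load. The names `greedy_partition` and `disk_loads` are illustrative only.

```python
import heapq
from collections import Counter


def greedy_partition(profile, num_disks):
    """Assign file blocks to disks from an I/O profile (illustrative heuristic).

    profile: iterable of (process_rank, block_id) records; this format is an
             assumption made for the sketch, not taken from the paper.
    Returns a dict mapping block_id -> disk index.
    """
    # Count how often each block is accessed in the profiled run.
    counts = Counter(block for _rank, block in profile)

    # Min-heap of (accumulated_access_count, disk_index).
    heap = [(0, disk) for disk in range(num_disks)]
    heapq.heapify(heap)

    placement = {}
    # Hottest blocks first; ties broken by block id for determinism.
    for block, n in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0])):
        load, disk = heapq.heappop(heap)  # currently least-loaded disk
        placement[block] = disk
        heapq.heappush(heap, (load + n, disk))
    return placement


def disk_loads(placement, profile, num_disks):
    """Replay the profile against a placement and count accesses per disk."""
    loads = [0] * num_disks
    for _rank, block in profile:
        loads[placement[block]] += 1
    return loads


if __name__ == "__main__":
    # Toy profile: two ranks touching four blocks with skewed frequencies.
    profile = [(0, 0)] * 5 + [(1, 1)] * 3 + [(0, 2)] * 2 + [(1, 3)] * 2
    placement = greedy_partition(profile, num_disks=2)
    print(placement)
    print(disk_loads(placement, profile, num_disks=2))
```

This sketch balances only access counts. A realistic partitioner would likely also weigh request sizes, which process issues each request, and temporal overlap between processes, all of which an I/O profile could capture.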