In large-scale parallel computing, parallel I/O libraries play an important role in handling the very large number of file accesses issued by many clients. ROMIO, an MPI-IO implementation, provides several optimizations, such as Two-Phase I/O, to achieve high I/O throughput. Two-Phase I/O mitigates the performance degradation caused by noncontiguous file accesses by rearranging them into contiguous accesses that are as long as possible. It does so by repeating a data exchange phase and an I/O phase, but these phases run strictly one after the other. Our approach pipelines the two phases so that they overlap, which allows the network and the I/O subsystem to be used concurrently. In this paper, we describe our multithreaded and asynchronous protocols. In a performance evaluation on two different kinds of file systems, our scheme improved performance relative to the original Two-Phase I/O. Furthermore, it can reduce memory usage compared with the original.
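The idea of overlapping the two phases can be illustrated with a minimal sketch. It is not the protocol described in this paper and does not use MPI or ROMIO: the names `sequential_two_phase`, `pipelined_two_phase`, `exchange`, `write`, and `depth` are hypothetical, and a plain Python thread with a bounded queue stands in for the multithreaded variant. The baseline finishes each round's exchange and write before starting the next round, while the pipelined version hands each filled buffer to a writer thread and immediately starts exchanging the next round.

```python
import queue
import threading


def sequential_two_phase(rounds, exchange, write):
    """Baseline: exchange and write alternate, so neither overlaps the other."""
    for r in rounds:
        buf = exchange(r)   # data exchange phase (network)
        write(buf)          # I/O phase (file system)


def pipelined_two_phase(rounds, exchange, write, depth=1):
    """Overlap: a writer thread drains buffers while the caller exchanges the next round."""
    # Bounded queue: limits how many filled buffers may wait for the writer.
    pending = queue.Queue(maxsize=depth)
    errors = []

    def writer():
        while True:
            buf = pending.get()
            if buf is None:          # sentinel: no more rounds
                return
            try:
                write(buf)
            except Exception as exc:  # keep draining so the producer never blocks
                errors.append(exc)

    t = threading.Thread(target=writer)
    t.start()
    try:
        for r in rounds:
            pending.put(exchange(r))  # blocks only when the writer falls behind
    finally:
        pending.put(None)
        t.join()
    if errors:
        raise errors[0]


def demo():
    # Hypothetical stand-ins: exchange merges one round's scattered pieces into
    # a contiguous buffer; write appends that buffer to an in-memory "file".
    pieces = [[b"cc", b"aa", b"bb"], [b"ff", b"dd", b"ee"]]

    def exchange(round_pieces):
        return b"".join(sorted(round_pieces))

    out_seq, out_pipe = [], []
    sequential_two_phase(pieces, exchange, out_seq.append)
    pipelined_two_phase(pieces, exchange, out_pipe.append)
    return out_seq, out_pipe
```

Because a single writer thread consumes a FIFO queue, buffers reach the file in the same order as in the sequential version; only the timing of the two phases changes. The `depth` parameter is an assumption of this sketch that trades extra buffer memory for tolerance of a slow writer.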