As scientists expand their models to describe physical phenomena of increasingly large extent, I/O becomes crucial, and a system with limited I/O capacity can severely constrain the performance of the entire program. We provide experimental results, obtained on Intel Touchstone Delta and nCUBE 2 I/O systems, to show that the performance of existing parallel I/O systems can vary by several orders of magnitude as a function of the data access pattern of the parallel program. We then propose a two-phase access strategy, to be implemented in a runtime system, in which the data distribution on computational nodes is decoupled from storage distribution. Our experimental results show that performance improvements of several orders of magnitude over direct-access-based data distribution methods can be obtained, and that performance for most data access patterns can be improved to within a factor of 2 of the best performance. Further, the cost of redistribution is a very small fraction of the overall access cost.
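The two-phase idea can be illustrated with a small single-process simulation. This is only a sketch under stated assumptions, not the runtime system described above: the "file" is a Python list holding an n x n matrix in row-major order, the target distribution is a column block per process, n is divisible by the process count p, and the function names `direct_read` and `two_phase_read` are hypothetical. It counts contiguous read requests as a rough stand-in for I/O cost.

```python
def direct_read(file, n, p):
    """Direct access: each process fetches its column block straight from the file.

    Assumes a row-major n x n matrix and n divisible by p (illustrative only).
    """
    width = n // p
    blocks, requests = [], 0
    for rank in range(p):
        rows = []
        for r in range(n):
            start = r * n + rank * width
            rows.append(file[start:start + width])  # one small contiguous request
            requests += 1
        blocks.append(rows)
    return blocks, requests


def two_phase_read(file, n, p):
    """Two-phase access: read in the storage-friendly layout, then redistribute."""
    height = width = n // p

    # Phase 1: each process issues one large contiguous read of a block of rows.
    staged, requests = [], 0
    for rank in range(p):
        start = rank * height * n
        chunk = file[start:start + height * n]
        requests += 1
        staged.append([chunk[i * n:(i + 1) * n] for i in range(height)])

    # Phase 2: in-memory redistribution so each process ends up with its columns.
    # In a real system this would be interprocess communication, not list slicing.
    blocks = []
    for rank in range(p):
        rows = []
        for src in range(p):
            for row in staged[src]:
                rows.append(row[rank * width:(rank + 1) * width])
        blocks.append(rows)
    return blocks, requests


if __name__ == "__main__":
    n, p = 8, 4
    data = list(range(n * n))
    direct_blocks, direct_requests = direct_read(data, n, p)
    staged_blocks, staged_requests = two_phase_read(data, n, p)
    print("same result:", direct_blocks == staged_blocks)
    print("requests (direct, two-phase):", direct_requests, staged_requests)
```

Both functions produce the same per-process data; the difference is that the direct version issues n small requests per process, while the two-phase version issues one large request per process and pays for an in-memory exchange instead. The request count is a simplification and does not model disk or network timing.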