Concurrency control and recovery in database systems
Improved parallel I/O via a two-phase run-time access strategy
ACM SIGARCH Computer Architecture News - Special issue on input/output in parallel computer systems
Framework for optimizing parallel I/O
GPFS: A Shared-Disk File System for Large Computing Clusters
FAST '02 Proceedings of the Conference on File and Storage Technologies
An Abstract-Device Interface for Implementing Portable Parallel-I/O Interfaces
FRONTIERS '96 Proceedings of the 6th Symposium on the Frontiers of Massively Parallel Computation
Collective Buffering: Improving Parallel I/O Performance
HPDC '97 Proceedings of the 6th IEEE International Symposium on High Performance Distributed Computing
Implementing MPI-IO atomic mode without file system support
CCGRID '05 Proceedings of the Fifth IEEE International Symposium on Cluster Computing and the Grid (CCGrid'05), Volume 2
Collective caching: application-aware client-side file caching
HPDC '05 Proceedings of the 14th IEEE International Symposium on High Performance Distributed Computing
Using MPI file caching to improve parallel write performance for large-scale scientific applications
Proceedings of the 2007 ACM/IEEE conference on Supercomputing
DataStager: scalable data staging services for petascale applications
Proceedings of the 18th ACM international symposium on High performance distributed computing
DataStager: scalable data staging services for petascale applications
Cluster Computing
Managing Variability in the IO Performance of Petascale Storage Systems
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
PetaShare: A reliable, efficient and transparent distributed storage management system
Scientific Programming
Just in time: adding value to the IO pipelines of high performance applications with JITStaging
Proceedings of the 20th international symposium on High performance distributed computing
Six degrees of scientific data: reading patterns for extreme scale science IO
Proceedings of the 20th international symposium on High performance distributed computing
Topology-aware data movement and staging for I/O acceleration on Blue Gene/P supercomputing systems
Proceedings of 2011 International Conference for High Performance Computing, Networking, Storage and Analysis
Examples of in transit visualization
Proceedings of the 2nd international workshop on Petascale data analytics: challenges and opportunities
Towards scalable I/O architecture for exascale systems
Proceedings of the 2011 ACM international workshop on Many task computing on grids and supercomputers
Extending scalability of collective IO through nessie and staging
Proceedings of the sixth workshop on Parallel Data Storage
Enabling event tracing at leadership-class scale through I/O forwarding middleware
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
ISOBAR hybrid compression-I/O interleaving for large-scale parallel I/O optimization
Proceedings of the 21st international symposium on High-Performance Parallel and Distributed Computing
I/O threads to reduce checkpoint blocking for an electromagnetics solver on Blue Gene/P and Cray XK6
Proceedings of the 2nd International Workshop on Runtime and Operating Systems for Supercomputers
Using MPI in high-performance computing services
Proceedings of the 20th European MPI Users' Group Meeting
Optimizing I/O forwarding techniques for extreme-scale event tracing
Cluster Computing
Increasingly complex scientific applications require massive parallelism to achieve the goals of fidelity and high computational performance. Such applications periodically offload checkpointing data to the file system for post-processing and program resumption. As a side effect of this high degree of parallelism, I/O contention at the servers prevents overall performance from scaling with the number of processors. To bridge the gap between parallel computational and I/O performance, we propose a portable MPI-IO layer in which certain tasks, such as file caching, consistency control, and collective I/O optimization, are delegated to a small set of compute nodes, collectively termed I/O Delegate nodes. A collective cache design is incorporated to maintain cache coherence and thereby alleviate lock contention at the I/O servers. Using popular parallel I/O benchmarks and application I/O kernels, our experimental evaluation indicates considerable performance improvement with only a small percentage of compute resources reserved for I/O.
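The delegation scheme the abstract describes can be pictured as a rank-partitioning policy: a small fraction of ranks is set aside as I/O Delegates, and each compute rank forwards its writes to one delegate, which caches and serializes them before they reach the file system. The sketch below is a minimal illustration of that idea only; the function names, the 10% reservation ratio, and the round-robin assignment are assumptions for illustration, not the paper's actual implementation.

```python
# Illustrative sketch (hypothetical names/ratios, not the paper's code):
# partition MPI-style ranks into compute ranks and I/O delegate ranks,
# then map each compute rank to the delegate that will cache its writes.

def partition_ranks(nprocs, delegate_fraction=0.1):
    """Reserve the last ~10% of ranks (at least one) as I/O delegates."""
    ndelegates = max(1, int(nprocs * delegate_fraction))
    compute = list(range(nprocs - ndelegates))
    delegates = list(range(nprocs - ndelegates, nprocs))
    return compute, delegates

def delegate_for(rank, compute, delegates):
    """Assign each compute rank to a delegate round-robin, so each delegate
    serves a disjoint group of compute ranks and serializes their I/O."""
    return delegates[compute.index(rank) % len(delegates)]

compute, delegates = partition_ranks(64)
# With 64 ranks and a 0.1 fraction, 6 ranks (58..63) act as delegates
# and ranks 0..57 remain compute ranks.
```

Because each compute rank always talks to the same delegate, write-back caching and consistency control can be handled at the delegate rather than through file-system locks, which is the contention the paper aims to avoid.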