Towards scalable I/O architecture for exascale systems
Proceedings of the 2011 ACM international workshop on Many task computing on grids and supercomputers
Can checkpoint/restart mechanisms benefit from hierarchical data staging?
Euro-Par'11 Proceedings of the 2011 international conference on Parallel Processing - Volume 2
A 1 PB/s file system to checkpoint three million MPI tasks
Proceedings of the 22nd international symposium on High-performance parallel and distributed computing
Memory-conscious collective I/O for extreme scale HPC systems
Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
A hierarchical parallel storage system based on distributed memory for large scale systems
Proceedings of the 20th European MPI Users' Group Meeting
Hi-index | 0.00 |
Parallel applications currently suffer from a significant imbalance between computational power and available I/O bandwidth. Additionally, the hierarchical organization of current Petascale systems contributes to an increase of the I/O subsystem latency. In these hierarchies, file access involves pipelining data through several networks with incremental latencies and higher probability of congestion. Future Exascale systems are likely to share this trait. This paper presents a scalable parallel I/O software system designed to transparently hide the latency of file system accesses to applications on these platforms. Our solution takes advantage of the hierarchy of networks involved in file accesses, to maximize the degree of overlap between computation, file I/O-related communication, and file system access. We describe and evaluate a two-level hierarchy for Blue Gene systems consisting of client-side and I/O node-side caching. Our file cache management modules coordinate the data staging between application and storage through the Blue Gene networks. The experimental results demonstrate that our architecture achieves significant performance improvements through a high degree of overlap between computation, communication, and file I/O.