Architectural requirements of parallel scientific applications with explicit communication
ISCA '93 Proceedings of the 20th annual international symposium on computer architecture
VAXcluster: a closely-coupled distributed system
ACM Transactions on Computer Systems (TOCS)
Input/output characteristics of scalable parallel applications
Supercomputing '95 Proceedings of the 1995 ACM/IEEE conference on Supercomputing
File-Access Characteristics of Parallel Scientific Workloads
IEEE Transactions on Parallel and Distributed Systems
A study of I/O behavior of perfect benchmarks on a multiprocessor
ISCA '90 Proceedings of the 17th annual international symposium on Computer Architecture
Grid -Based Parallel Data Streaming implemented for the Gyrokinetic Toroidal Code
Proceedings of the 2003 ACM/IEEE conference on Supercomputing
A large-scale study of failures in high-performance computing systems
DSN '06 Proceedings of the International Conference on Dependable Systems and Networks
Ceph: a scalable, high-performance distributed file system
OSDI '06 Proceedings of the 7th symposium on Operating systems design and implementation
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
Adaptable, metadata rich IO methods for portable high performance IO
IPDPS '09 Proceedings of the 2009 IEEE International Symposium on Parallel&Distributed Processing
I/O performance challenges at leadership scale
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Efficient object storage journaling in a distributed parallel file system
FAST'10 Proceedings of the 8th USENIX conference on File and storage technologies
Scalable Earthquake Simulation on Petascale Supercomputers
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Managing Variability in the IO Performance of Petascale Storage Systems
Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis
Understanding and Improving Computational Science Storage Access through Continuous Characterization
ACM Transactions on Storage (TOS)
EDO: Improving Read Performance for Scientific Applications through Elastic Data Organization
CLUSTER '11 Proceedings of the 2011 IEEE International Conference on Cluster Computing
Enhancing I/O throughput via efficient routing and placement for large-scale parallel file systems
PCCC '11 Proceedings of the 30th IEEE International Performance Computing and Communications Conference
Characterization and modeling of PIDX parallel I/O for performance optimization
SC '13 Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis
Hi-index | 0.00 |
Supercomputer I/O loads are often dominated by writes. HPC (High Performance Computing) file systems are designed to absorb these bursty outputs at high bandwidth through massive parallelism. However, the delivered write bandwidth often falls well below the peak. This paper characterizes the data absorption behavior of a center-wide shared Lustre parallel file system on the Jaguar supercomputer. We use a statistical methodology to address the challenges of accurately measuring a shared machine under production load and to obtain the distribution of bandwidth across samples of compute nodes, storage targets, and time intervals. We observe and quantify limitations from competing traffic, contention on storage servers and I/O routers, concurrency limitations in the client compute node operating systems, and the impact of variance (stragglers) on coupled output such as striping. We then examine the implications of our results for application performance and the design of I/O middleware systems on shared supercomputers.