Optimizing I/O forwarding techniques for extreme-scale event tracing

Authors:
Thomas Ilsche;Joseph Schuchart;Jason Cope;Dries Kimpe;Terry Jones;Andreas Knüpfer;Kamil Iskra;Robert Ross;Wolfgang E. Nagel;Stephen Poole
Affiliations:
Technische Universität Dresden (ZIH), Dresden, Germany 01062;Oak Ridge National Laboratory, Oak Ridge, USA 37831;Argonne National Laboratory, Argonne, USA 60439;Argonne National Laboratory, Argonne, USA 60439;Oak Ridge National Laboratory, Oak Ridge, USA 37831;Technische Universität Dresden (ZIH), Dresden, Germany 01062;Argonne National Laboratory, Argonne, USA 60439;Argonne National Laboratory, Argonne, USA 60439;Technische Universität Dresden (ZIH), Dresden, Germany 01062;Oak Ridge National Laboratory, Oak Ridge, USA 37831
Venue:
Cluster Computing
Year:
2014

Abstract

Programming development tools are a vital component for understanding the behavior of parallel applications. Event tracing is a principal ingredient to these tools, but new and serious challenges place event tracing at risk on extreme-scale machines. As the quantity of captured events increases with concurrency, the additional data can overload the parallel file system and perturb the application being observed. In this work we present a solution for event tracing on extreme-scale machines. We enhance an I/O forwarding software layer to aggregate and reorganize log data prior to writing to the storage system, significantly reducing the burden on the underlying file system. Furthermore, we introduce a sophisticated write buffering capability to limit the impact. To validate the approach, we employ the Vampir tracing toolset using these new capabilities. Our results demonstrate that the approach increases the maximum traced application size by a factor of 5× to more than 200,000 processes.
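The aggregation idea in the abstract can be illustrated with a small sketch. It is not the paper's implementation and does not use the API of any real I/O forwarding layer. The class name `AggregatingWriter`, the `capacity` threshold, and the `(rank, file_offset, length)` index layout are all assumptions made for illustration. The sketch shows many small per-process trace records being collected in a buffer and written to a shared sink as a few large appends, with an index recording where each record landed.

```python
import io
from dataclasses import dataclass, field


@dataclass
class AggregatingWriter:
    """Hypothetical forwarding-side aggregator (illustrative only)."""

    sink: io.BufferedIOBase          # stands in for one shared file on the parallel file system
    capacity: int = 1 << 20          # flush threshold in bytes (assumed value)
    _buffer: bytearray = field(default_factory=bytearray)
    _pending: list = field(default_factory=list)  # (rank, offset_in_buffer, length)
    index: list = field(default_factory=list)     # (rank, file_offset, length)
    _file_offset: int = 0
    flushes: int = 0

    def write(self, rank: int, payload: bytes) -> None:
        # Flush first if this record would push the buffer past the threshold.
        if self._buffer and len(self._buffer) + len(payload) > self.capacity:
            self.flush()
        self._pending.append((rank, len(self._buffer), len(payload)))
        self._buffer += payload

    def flush(self) -> None:
        # Emit everything buffered so far as a single large append.
        if not self._buffer:
            return
        self.sink.write(bytes(self._buffer))
        for rank, offset, length in self._pending:
            self.index.append((rank, self._file_offset + offset, length))
        self._file_offset += len(self._buffer)
        self._buffer.clear()
        self._pending.clear()
        self.flushes += 1


# Usage: four "processes" each submit a 6-byte record; a 16-byte threshold
# turns four small writes into two larger ones.
sink = io.BytesIO()
writer = AggregatingWriter(sink, capacity=16)
for rank in range(4):
    writer.write(rank, bytes([rank]) * 6)
writer.flush()
```

The index is what lets a reader recover each process's records from the shared file later. A real forwarding layer would also have to handle concurrency, failures, and memory limits, which this sketch leaves out.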