Automatic trace analysis is an effective method for identifying complex performance phenomena in parallel applications. To simplify the development of complex trace-analysis algorithms, the EARL library interface offers high-level access to individual events contained in a global trace file. However, as parallel systems grow and individual applications run on ever more processors, the traditional approach of analyzing a single global trace file is increasingly constrained by the large number of events. To enable scalable trace analysis, we present a new design of the EARL interface that accesses multiple local trace files in parallel while offering a convenient means of exchanging events between processes. This article describes the modified view of the trace data and the related programming abstractions provided by the new PEARL library interface, and discusses its application in performance analysis.
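
The idea of analyzing per-process trace files while exchanging events between processes can be illustrated with a small single-process simulation. This is a minimal sketch under stated assumptions, not the PEARL API: the `Event` record and the functions `exchange_sends` and `match_receives` are hypothetical names introduced here, and a plain dictionary of inboxes stands in for real inter-process communication.

```python
from collections import defaultdict, namedtuple

# Hypothetical event record (an assumption for illustration, not PEARL's event model).
# rank: process that recorded the event; peer: the communication partner.
Event = namedtuple("Event", ["rank", "time", "kind", "peer", "tag"])


def exchange_sends(local_traces):
    """Forward every send event to the inbox of its receiver.

    The returned dictionary stands in for inter-process message passing:
    each rank only reads its own local trace and its own inbox.
    """
    inboxes = defaultdict(list)
    for rank in sorted(local_traces):
        for event in local_traces[rank]:
            if event.kind == "send":
                inboxes[event.peer].append(event)
    return inboxes


def match_receives(local_trace, inbox):
    """Pair each local receive with the forwarded send it belongs to.

    Sends are matched in FIFO order per (sender, tag).
    Returns the time between send and receive for every receive event.
    """
    pending = defaultdict(list)
    for send in inbox:
        pending[(send.rank, send.tag)].append(send)

    transfer_times = []
    for event in local_trace:
        if event.kind == "recv":
            send = pending[(event.peer, event.tag)].pop(0)
            transfer_times.append(event.time - send.time)
    return transfer_times


# Two local traces, one per process, instead of one merged global trace.
local_traces = {
    0: [Event(0, 1.0, "send", 1, 7), Event(0, 5.0, "recv", 1, 9)],
    1: [Event(1, 2.5, "recv", 0, 7), Event(1, 4.0, "send", 0, 9)],
}

inboxes = exchange_sends(local_traces)
transfer = {
    rank: match_receives(trace, inboxes[rank])
    for rank, trace in local_traces.items()
}
print(transfer)
```

In this sketch no process ever needs the complete event stream: each one holds only its own events plus the remote events forwarded to it, which is the property that lets the amount of work per process stay bounded as the number of processes grows.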