A parallel trace-data interface for scalable performance analysis

Authors:
Markus Geimer;Felix Wolf;Andreas Knüpfer;Bernd Mohr;Brian J. N. Wylie
Affiliations:
John von Neumann Institute for Computing, Forschungszentrum Jülich, Jülich, Germany;John von Neumann Institute for Computing, Forschungszentrum Jülich, Jülich, Germany and Department of Computer Science, RWTH Aachen University, Aachen, Germany;Center for Information Services and High Performance Computing, Dresden University of Technology, Dresden, Germany;John von Neumann Institute for Computing, Forschungszentrum Jülich, Jülich, Germany;John von Neumann Institute for Computing, Forschungszentrum Jülich, Jülich, Germany
Venue:
PARA'06 Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing
Year:
2006

Abstract

Automatic trace analysis is an effective method of identifying complex performance phenomena in parallel applications. To simplify the development of complex trace-analysis algorithms, the EARL library interface offers high-level access to individual events contained in a global trace file. However, as parallel systems grow and individual applications use ever more processors, the traditional approach of analyzing a single global trace file becomes increasingly constrained by the large number of events. To enable scalable trace analysis, we present a new design of the EARL interface that accesses multiple local trace files in parallel while offering means to conveniently exchange events between processes. This article describes the modified view of the trace data as well as related programming abstractions provided by the new PEARL library interface and discusses its application in performance analysis.
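
The idea of analyzing per-process local traces while exchanging events between processes can be illustrated with a minimal sketch. It is not the library's API: the `Event` record, its field names, and the functions `split_by_rank`, `exchange_sends`, and `analyze_local` are all hypothetical, and the "processes" are simulated sequentially in one Python program. The analysis shown is a simple late-sender check, in which a receive that was entered before its matching send accumulates waiting time.

```python
from collections import defaultdict, namedtuple

# Hypothetical event record (an assumption for this sketch, not a real trace format).
# `time` is when the operation was entered; `peer` is the partner rank.
Event = namedtuple("Event", "rank time kind peer tag")

def split_by_rank(events):
    """Group events into one local trace per process, preserving order."""
    local = defaultdict(list)
    for ev in events:
        local[ev.rank].append(ev)
    return local

def exchange_sends(local_traces):
    """Forward each send event to the process that holds the matching receive.

    In a truly parallel analysis this step would be a message between
    analysis processes; here it is simulated with per-rank inboxes.
    """
    inboxes = defaultdict(lambda: defaultdict(list))
    for trace in local_traces.values():
        for ev in trace:
            if ev.kind == "send":
                inboxes[ev.peer][(ev.rank, ev.tag)].append(ev)
    return inboxes

def analyze_local(trace, inbox):
    """Sum the late-sender waiting time found in one local trace."""
    wait = 0.0
    for ev in trace:
        if ev.kind == "recv":
            send = inbox[(ev.peer, ev.tag)].pop(0)  # matching send, in order
            wait += max(0.0, send.time - ev.time)
    return wait

# Tiny made-up example with two processes exchanging two messages.
events = [
    Event(0, 5.0, "send", 1, 7),
    Event(1, 2.0, "recv", 0, 7),  # entered 3.0 time units before the send
    Event(1, 6.0, "send", 0, 8),
    Event(0, 9.0, "recv", 1, 8),  # send was already posted, no waiting
]
local_traces = split_by_rank(events)
inboxes = exchange_sends(local_traces)
waits = {rank: analyze_local(trace, inboxes[rank])
         for rank, trace in local_traces.items()}
print(waits)
```

Each call to `analyze_local` touches only one local trace plus the events forwarded to it, which is the property that would let the per-process analyses run in parallel.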