Trace profiling: Scalable event tracing on high-end parallel systems

Authors:
Kathryn Mohror;Karen L. Karavanic
Affiliations:
Portland State University, Computer Science Department, P.O. Box 751, Portland, OR 97207-0751, United States;Portland State University, Computer Science Department, P.O. Box 751, Portland, OR 97207-0751, United States
Venue:
Parallel Computing
Year:
2012

Abstract

Accurate performance analysis of high-end systems requires event-based traces to correctly identify the root causes of many of the complex performance problems that arise on these highly parallel systems. These architectures contain tens to hundreds of thousands of processors, pushing application scalability challenges to new heights. Unfortunately, collecting event-based data presents scalability challenges of its own: the large volume of collected data increases tool overhead and produces data files that are difficult to store and analyze. Our solution is a new measurement technique called trace profiling, which collects the information needed to diagnose performance problems that traditionally require traces, but at a greatly reduced data volume. Trace profiling reduces the amount of data stored by capitalizing on the repeated behavior of programs and on the similarity of behavior and performance across the parallel processes in an application run. It is a hybrid of profiling and tracing that collects summary information about the event patterns in an application run. Because the data has already been classified into behavior categories, we can present reduced, partially analyzed performance data to the user, highlighting the performance behaviors that account for most of the execution time.
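The idea of summarizing repeated event patterns instead of storing every event can be illustrated with a small sketch. This is not the authors' implementation: the segment representation, the use of the event-name sequence as the behavior category, and the names `trace_profile` and `top_patterns` are all assumptions made for illustration only.

```python
from collections import defaultdict


def trace_profile(segments):
    """Summarize trace segments by event pattern (illustrative assumption).

    segments: iterable of lists of (event_name, duration) pairs. Each list is
    one hypothetical segment, for example one loop iteration on one process.
    Returns a dict mapping each pattern (tuple of event names) to its
    occurrence count and accumulated time.
    """
    summary = defaultdict(lambda: {"count": 0, "total_time": 0.0})
    for segment in segments:
        # Assumed behavior category: the ordered sequence of event names.
        pattern = tuple(name for name, _ in segment)
        entry = summary[pattern]
        entry["count"] += 1
        entry["total_time"] += sum(duration for _, duration in segment)
    return dict(summary)


def top_patterns(summary, n=3):
    """Return the n patterns that account for the most accumulated time."""
    ranked = sorted(
        summary.items(), key=lambda item: item[1]["total_time"], reverse=True
    )
    return ranked[:n]


# Made-up example data: four segments, three of which share one pattern.
segments = [
    [("compute", 4.0), ("MPI_Send", 1.0)],
    [("compute", 4.0), ("MPI_Send", 1.0)],
    [("compute", 4.0), ("MPI_Recv", 3.0)],
    [("compute", 5.0), ("MPI_Send", 1.0)],
]

summary = trace_profile(segments)
for pattern, stats in top_patterns(summary):
    print(pattern, stats["count"], stats["total_time"])
```

In this toy version, four stored segments collapse into two summary records, and ranking by accumulated time surfaces the dominant behavior first. A real tool would also need a similarity measure for segments whose timings differ and a way to merge summaries across processes, both of which are omitted here.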