Performance analysis of distributed applications using automatic classification of communication inefficiencies

Authors:
Jeffrey Vetter
Affiliations:
Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, Livermore, California
Venue:
Proceedings of the 14th international conference on Supercomputing
Year:
2000

Abstract

We present a technique for performance analysis that helps users understand the communication behavior of their message-passing applications. Our method automatically classifies individual communication operations and reveals the causes of communication inefficiencies in the application. This classification allows the developer to focus quickly on the culprits of truly inefficient behavior, rather than manually foraging through massive amounts of performance data. Specifically, we trace the message operations of MPI applications and then classify each individual communication event using decision tree classification, a supervised learning technique. We train our decision tree using microbenchmarks that demonstrate both efficient and inefficient communication. Because our technique adapts to the target system's configuration through these microbenchmarks, we can simultaneously automate the performance analysis process and improve classification accuracy. Our experiments on four applications demonstrate that our technique can improve the accuracy of performance analysis and dramatically reduce the amount of data that users must examine.
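The workflow described in the abstract (train a decision tree on labeled microbenchmark events, classify each traced event, then show the user only the inefficient ones) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two features (normalized sender and receiver wait times), the class names `normal`, `late_send`, and `late_recv`, the sample values, and the simple Gini-based splitter, which stands in for whatever decision tree algorithm the author actually used.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that most reduces impurity, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted({r[f] for r in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels):
    """Recursively build a tree; leaves are labels, nodes are tuples."""
    split = best_split(rows, labels)
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= t]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [l for _, l in left]),
            build_tree([r for r, _ in right], [l for _, l in right]))

def classify(tree, row):
    """Walk the tree from the root to a leaf label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical microbenchmark events: (sender_wait, receiver_wait), label.
TRAINING = [
    ((0.1, 0.1), "normal"),
    ((0.2, 0.1), "normal"),
    ((0.1, 0.2), "normal"),
    ((0.1, 0.9), "late_send"),   # receiver idles while the send is late
    ((0.2, 0.8), "late_send"),
    ((0.9, 0.1), "late_recv"),   # sender idles while the receive is late
    ((0.8, 0.2), "late_recv"),
]
tree = build_tree([r for r, _ in TRAINING], [l for _, l in TRAINING])

# Hypothetical application trace: classify every event, keep the bad ones.
TRACE = [(0.12, 0.12), (0.15, 0.85), (0.18, 0.15), (0.85, 0.15)]
flagged = [(i, classify(tree, ev)) for i, ev in enumerate(TRACE)
           if classify(tree, ev) != "normal"]
print(flagged)
```

In this toy trace, only the events classified as something other than `normal` are reported, which mirrors the data reduction the abstract describes. Retraining on microbenchmarks run on a different machine would yield different thresholds, which is how such a scheme could adapt to the target system.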