We present a performance analysis technique that helps users understand the communication behavior of their message-passing applications. Our method automatically classifies individual communication operations and reveals the causes of communication inefficiencies in the application. This classification lets the developer focus quickly on the culprits of truly inefficient behavior rather than manually foraging through massive amounts of performance data. Specifically, we trace the message operations of MPI applications and then classify each communication event using decision tree classification, a supervised learning technique. We train the decision tree on microbenchmarks that demonstrate both efficient and inefficient communication. Because the technique adapts to the target system's configuration through these microbenchmarks, we can simultaneously automate the performance analysis process and improve classification accuracy. Our experiments on four applications demonstrate that the technique can improve the accuracy of performance analysis and dramatically reduce the amount of data that users must encounter.
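The idea of training a decision tree on labeled microbenchmark events and then classifying traced events can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the abstract: the two features (fraction of an operation's time the sender and the receiver spend waiting), the class names (`normal`, `late_sender`, `late_receiver`), the training values, and the use of a small Gini-based tree learner in plain Python rather than whichever decision tree algorithm the authors used.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels):
    """Return the (feature, threshold) that lowers weighted Gini most, or None."""
    best, best_score = None, gini(labels)
    n = len(rows)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2.0
            left = [l for r, l in zip(rows, labels) if r[f] <= t]
            right = [l for r, l in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if score < best_score - 1e-12:
                best, best_score = (f, t), score
    return best

def build_tree(rows, labels, depth=0, max_depth=4):
    """Build a tree: a leaf is a label string, a node is (feature, threshold, left, right)."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, l) for r, l in zip(rows, labels) if r[f] <= t]
    right = [(r, l) for r, l in zip(rows, labels) if r[f] > t]
    return (f, t,
            build_tree([r for r, _ in left], [l for _, l in left], depth + 1, max_depth),
            build_tree([r for r, _ in right], [l for _, l in right], depth + 1, max_depth))

def classify(tree, row):
    """Walk from the root to a leaf and return its label."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree

# Hypothetical microbenchmark events: (sender_wait_fraction, receiver_wait_fraction).
TRAIN = [
    ((0.02, 0.03), "normal"), ((0.05, 0.01), "normal"), ((0.04, 0.06), "normal"),
    ((0.03, 0.70), "late_sender"), ((0.06, 0.85), "late_sender"), ((0.01, 0.60), "late_sender"),
    ((0.75, 0.02), "late_receiver"), ((0.90, 0.05), "late_receiver"), ((0.65, 0.04), "late_receiver"),
]
tree = build_tree([r for r, _ in TRAIN], [l for _, l in TRAIN])

# Hypothetical traced events from an application run.
trace = [(0.04, 0.02), (0.03, 0.80), (0.80, 0.03), (0.05, 0.04)]
labels = [classify(tree, event) for event in trace]
summary = Counter(labels)
inefficient = [e for e, l in zip(trace, labels) if l != "normal"]
print(summary, inefficient)
```

In this sketch the developer would inspect only the events in `inefficient`, which mirrors the data-reduction goal described above. Retraining on microbenchmarks run on a different machine would change the learned thresholds without any change to the classification code.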