The performance behavior of parallel simulations often changes considerably as the simulation progresses, and the temporal patterns may differ from process to process. While call-path profiling is an established method of linking a performance problem to the context in which it occurs, call-path profiles reveal little about the temporal evolution of performance phenomena. Generating a separate call-path profile for each of thousands of iterations would capture this evolution, but may exceed the available buffer space, especially when the call tree is large and more than one metric is collected. In this paper, we present a runtime approach for the semantic compression of call-path profiles, based on incremental clustering of a series of single-iteration profiles, that scales with the number of iterations without sacrificing important performance details. Our approach keeps runtime overhead low by using only a condensed version of the profile data when calculating distances, and it accounts for process-dependent variations by making all clustering decisions locally.
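To make the idea of incremental clustering under a fixed memory budget concrete, the following is a minimal sketch and not the implementation described in the paper. It assumes a hypothetical cap `max_clusters` on the number of stored profiles, a hypothetical condensed form (the total metric value of a profile), and a merge rule that averages the two closest clusters. All names (`Cluster`, `condensed`, `merge`, `add_iteration`) are illustrative. Each process would run this independently on its own profiles, mirroring the local clustering decisions mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, List

# A single-iteration profile: call path -> metric value (e.g. seconds).
Profile = Dict[str, float]


@dataclass
class Cluster:
    mean: Profile          # per-call-path mean over the merged iterations
    iterations: List[int]  # iteration numbers this cluster represents


def condensed(profile: Profile) -> float:
    """Assumed condensed form: the total metric value of the profile."""
    return sum(profile.values())


def merge(a: Cluster, b: Cluster) -> Cluster:
    """Merge two clusters into their iteration-weighted mean profile."""
    n_a, n_b = len(a.iterations), len(b.iterations)
    paths = set(a.mean) | set(b.mean)
    mean = {
        p: (a.mean.get(p, 0.0) * n_a + b.mean.get(p, 0.0) * n_b) / (n_a + n_b)
        for p in paths
    }
    return Cluster(mean, a.iterations + b.iterations)


def add_iteration(clusters: List[Cluster], iteration: int,
                  profile: Profile, max_clusters: int) -> None:
    """Store a new iteration profile; merge the closest pair if over budget."""
    clusters.append(Cluster(dict(profile), [iteration]))
    if len(clusters) <= max_clusters:
        return
    best = None  # (distance, i, j) of the closest pair found so far
    for i in range(len(clusters)):
        for j in range(i + 1, len(clusters)):
            # Distances use only the condensed values, not the full profiles.
            d = abs(condensed(clusters[i].mean) - condensed(clusters[j].mean))
            if best is None or d < best[0]:
                best = (d, i, j)
    _, i, j = best
    merged = merge(clusters[i], clusters[j])
    del clusters[j]          # j > i, so index i stays valid
    clusters[i] = merged
```

In this sketch the memory footprint is bounded by `max_clusters` full profiles regardless of how many iterations run, while the distance computation touches only one number per cluster. A real condensed form would likely retain more structure than a single total; that choice is an assumption made here for brevity.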