CATCH is a profiler for parallel applications that collects hardware performance counter information for each function called in the program, attributed according to the call path that led to the function invocation. It automatically instruments the binary of the target application, independently of the programming language. It supports MPI, OpenMP, and hybrid applications, and integrates the performance data collected for different processes and threads. Functions representing the bodies of OpenMP constructs are also monitored and mapped back to the source code. Performance data is generated in XML for visualization with a graphical user interface that displays the data alongside the source code sections it refers to.
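The idea of attributing counter values to call paths rather than to functions alone can be illustrated with a small sketch. This is not CATCH's implementation or file format: the class `CallPathProfile`, the `FakeCounter` stand-in for a hardware counter, the explicit `enter`/`exit` calls (which binary instrumentation would insert automatically), and the XML element and attribute names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


class FakeCounter:
    """Hypothetical stand-in for a hardware counter (e.g. a cycle count)."""

    def __init__(self):
        self.value = 0

    def read(self):
        return self.value


class CallPathProfile:
    """Accumulates inclusive counter totals per call path."""

    def __init__(self, read_counter):
        self.read_counter = read_counter
        self.stack = []                  # current call path, outermost first
        self.starts = []                 # counter value at each function entry
        self.totals = defaultdict(int)   # call path -> inclusive counter total
        self.calls = defaultdict(int)    # call path -> number of invocations

    def enter(self, name):
        self.stack.append(name)
        self.starts.append(self.read_counter())

    def exit(self):
        path = tuple(self.stack)
        self.totals[path] += self.read_counter() - self.starts.pop()
        self.calls[path] += 1
        self.stack.pop()

    def to_xml(self):
        # Illustrative schema only; not the format CATCH emits.
        root = ET.Element("profile")
        for path in sorted(self.totals):
            ET.SubElement(root, "callpath", path="/".join(path),
                          calls=str(self.calls[path]),
                          value=str(self.totals[path]))
        return ET.tostring(root, encoding="unicode")


def run_demo():
    counter = FakeCounter()
    profile = CallPathProfile(counter.read)

    def solve(cost):
        profile.enter("solve")
        counter.value += cost            # simulated work
        profile.exit()

    profile.enter("main")
    profile.enter("init")
    solve(5)                             # path main/init/solve
    profile.exit()
    profile.enter("step")
    solve(7)                             # path main/step/solve, called twice
    solve(7)
    profile.exit()
    profile.exit()
    return profile


if __name__ == "__main__":
    print(run_demo().to_xml())
```

In this sketch the same function `solve` appears under two distinct paths, `main/init/solve` and `main/step/solve`, so its cost in each calling context is reported separately instead of being merged into a single per-function total.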