This paper gives an overview of SCALEA, a new performance analysis tool for OpenMP, MPI, HPF, and mixed parallel/distributed programs. SCALEA instruments, executes, and measures programs, and computes a variety of performance overheads based on a novel overhead classification. Source code profiling and hardware profiling are combined in a single system, which significantly extends the range of overheads that can be measured and examined: from hardware counter data, such as the number of cache misses or floating-point operations, to more complex performance metrics, such as control of parallelism or loss of parallelism. Moreover, SCALEA uses a new representation of code regions, the dynamic code region call graph, which enables detailed overhead analysis for arbitrary code regions. An instrumentation description file relates performance information to code regions of the input program and helps reduce instrumentation overhead. Several experiments with realistic codes covering MPI, OpenMP, HPF, and mixed OpenMP/MPI programs demonstrate the usefulness of SCALEA.
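To make the idea of a dynamic code region call graph more concrete, the following Python sketch records only those code regions that are actually entered at run time and accumulates a measured cost per region. It is an illustration under stated assumptions, not SCALEA's implementation: the class names `RegionNode` and `DynamicRegionGraph`, the `enter`/`leave` interface, and the region names are all hypothetical, and the cost values are supplied by hand where a real tool would use timers or hardware counters.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class RegionNode:
    """One code region as it was reached at run time (hypothetical structure)."""
    name: str
    calls: int = 0
    inclusive: float = 0.0  # accumulated cost, including nested regions
    children: Dict[str, "RegionNode"] = field(default_factory=dict)

    def exclusive(self) -> float:
        # Cost attributed to the region itself, without its nested regions.
        return self.inclusive - sum(c.inclusive for c in self.children.values())


class DynamicRegionGraph:
    """Builds the graph lazily: a node exists only if its region executed."""

    def __init__(self) -> None:
        # The root is only an anchor; no cost is recorded for it here.
        self.root = RegionNode("program")
        self._stack: List[RegionNode] = [self.root]

    def enter(self, name: str) -> None:
        # Create the child node on first entry, then descend into it.
        parent = self._stack[-1]
        node = parent.children.setdefault(name, RegionNode(name))
        node.calls += 1
        self._stack.append(node)

    def leave(self, cost: float) -> None:
        # `cost` is the inclusive cost of this visit (assumed to be measured
        # elsewhere, e.g. by a timer or a hardware counter).
        if len(self._stack) == 1:
            raise RuntimeError("leave() called without a matching enter()")
        node = self._stack.pop()
        node.inclusive += cost


# Usage: a solver region that performs two message exchanges.
graph = DynamicRegionGraph()
graph.enter("solver")
graph.enter("mpi_exchange")
graph.leave(2.0)
graph.enter("mpi_exchange")
graph.leave(1.0)
graph.leave(10.0)

solver = graph.root.children["solver"]
exchange = solver.children["mpi_exchange"]
print(solver.exclusive(), exchange.calls, exchange.inclusive)
```

Because nodes are created on entry, regions that never execute do not appear in the graph, and the per-region inclusive and exclusive costs give a natural place to attach overhead categories such as communication.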