This paper proposes a performance tools interface for OpenMP, similar in spirit to the MPI profiling interface in its intent to define a clear and portable API that makes OpenMP execution events visible to runtime performance tools. We present our design using a source-level instrumentation approach based on OpenMP directive rewriting. Rules to instrument each directive and their combination are applied to generate calls to the interface consistent with directive semantics and to pass context information (e.g., source code locations) in a portable and efficient way. Our proposed OpenMP performance API further allows user functions and arbitrary code regions to be marked and performance measurement to be controlled using new OpenMP directives. To prototype the proposed OpenMP performance interface, we have developed compatible performance libraries for the Expert automatic event trace analyzer [17, 18] and the TAU performance analysis framework [13]. The directive instrumentation transformations we define are implemented in a source-to-source translation tool called OPARI. Application examples are presented for both Expert and TAU to show the OpenMP performance interface and OPARI instrumentation tool in operation. When used together with the MPI profiling interface (as the examples also demonstrate), our proposed approach provides a portable and robust solution to performance analysis of OpenMP and mixed-mode (OpenMP+MPI) applications.
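To make the directive-rewriting idea concrete, the following is a minimal sketch of what an instrumented parallel region might look like after source-to-source translation. It is not OPARI output and does not reproduce the proposed API: the names `region_descr`, `perf_parallel_fork`, `perf_parallel_begin`, `perf_parallel_end`, and `perf_parallel_join`, as well as the file name and line numbers in the descriptor, are hypothetical and chosen only for illustration. The counters stand in for a real measurement library such as the ones the abstract mentions.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical region descriptor: passes source context to the tool. */
typedef struct {
    const char *file;   /* source file containing the directive */
    int begin_line;     /* first line of the construct */
    int end_line;       /* last line of the construct */
} region_descr;

/* Hypothetical event counters standing in for a measurement library. */
int fork_events, join_events, begin_events, end_events;

/* Called by the encountering thread before the team is created. */
void perf_parallel_fork(const region_descr *r) { assert(r != NULL); fork_events++; }

/* Called by the encountering thread after the team has finished. */
void perf_parallel_join(const region_descr *r) { assert(r != NULL); join_events++; }

/* Called by every thread of the team on entry to the region. */
void perf_parallel_begin(const region_descr *r) {
    assert(r != NULL);
    #pragma omp atomic
    begin_events++;
}

/* Called by every thread of the team on exit from the region. */
void perf_parallel_end(const region_descr *r) {
    assert(r != NULL);
    #pragma omp atomic
    end_events++;
}

/*
 * Assumed original user code:
 *
 *     #pragma omp parallel
 *     {
 *         #pragma omp for reduction(+:total)
 *         for (int i = 1; i <= n; i++) total += (long)i * i;
 *     }
 *
 * A rewriting tool would wrap the directive with event calls, as below.
 */
long sum_squares_instrumented(int n) {
    static const region_descr region = {"example.c", 10, 14};
    long total = 0;

    perf_parallel_fork(&region);
    #pragma omp parallel
    {
        perf_parallel_begin(&region);
        #pragma omp for reduction(+:total)
        for (int i = 1; i <= n; i++) {
            total += (long)i * i;
        }
        perf_parallel_end(&region);
    }
    perf_parallel_join(&region);
    return total;
}
```

Calling `sum_squares_instrumented(10)` returns 385 and records one fork event, one join event, and one begin/end pair per thread in the team. With GCC or Clang, compiling with `-fopenmp` enables the threads; without it the pragmas are ignored and the same code runs serially, which shows that the inserted calls do not change the program's result.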