On the instrumentation of OpenMP and OmpSs tasking constructs

Authors:
Harald Servat;Xavier Teruel;Germán Llort;Alejandro Duran;Judit Giménez;Xavier Martorell;Eduard Ayguadé;Jesús Labarta
Affiliations:
Barcelona Supercomputing Center, Spain, Universitat Politècnica de Catalunya, Spain;Barcelona Supercomputing Center, Spain;Barcelona Supercomputing Center, Spain, Universitat Politècnica de Catalunya, Spain;Barcelona Supercomputing Center, Spain, Intel Corporation, Barcelona, Catalunya, Spain;Barcelona Supercomputing Center, Spain, Universitat Politècnica de Catalunya, Spain;Barcelona Supercomputing Center, Spain, Universitat Politècnica de Catalunya, Spain;Barcelona Supercomputing Center, Spain, Universitat Politècnica de Catalunya, Spain;Barcelona Supercomputing Center, Spain, Universitat Politècnica de Catalunya, Spain
Venue:
Euro-Par'12 Proceedings of the 18th international conference on Parallel processing workshops
Year:
2012

Abstract

Parallelism has become commonplace with the advent of multicore processors. Although several parallel programming models have emerged to exploit the computing capabilities of such processors, developing applications that benefit from them is not always easy. Worse, the performance achieved by the parallel version of an application may fall short of the developer's expectations because the resources offered by the processor are used poorly. In this paper we present the integration of a shared-memory parallel compiler and runtime with a performance extraction library. The objective of this work is not only to shorten the performance analysis life cycle when parallelizing an application, but also to enrich the analysis of the parallel application with data that is known only to the compiler and runtime. Additionally, we present performance results obtained by executing instrumented applications and evaluate the overhead of the instrumentation.
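
The idea of a runtime emitting events around task creation and execution, and of quantifying the cost of doing so, can be illustrated with a minimal sketch. This is not the paper's implementation and uses neither OpenMP nor OmpSs: it is a toy serial task runtime in Python, and every name in it (`Tracer`, `TinyRuntime`, `emit`, `relative_overhead`) is hypothetical. It assumes that an event is a timestamped record and that overhead is measured relative to the uninstrumented run time.

```python
import time
from dataclasses import dataclass, field


@dataclass
class Tracer:
    """Hypothetical in-memory event recorder (not a real library API)."""
    events: list = field(default_factory=list)

    def emit(self, kind, task_id, **info):
        # Each record carries a timestamp plus data only the runtime knows.
        self.events.append((time.perf_counter(), kind, task_id, info))


class TinyRuntime:
    """Toy serial task runtime that instruments task creation and execution."""

    def __init__(self, tracer=None):
        self.tracer = tracer  # None means "run uninstrumented"
        self.queue = []
        self.next_id = 0

    def create_task(self, fn, *args, parent=None):
        task_id = self.next_id
        self.next_id += 1
        if self.tracer:
            # Runtime-side knowledge: task identity, parent, and function name.
            self.tracer.emit("create", task_id, parent=parent, name=fn.__name__)
        self.queue.append((task_id, fn, args))
        return task_id

    def run_all(self):
        results = {}
        while self.queue:
            task_id, fn, args = self.queue.pop(0)
            if self.tracer:
                self.tracer.emit("begin", task_id)
            results[task_id] = fn(*args)
            if self.tracer:
                self.tracer.emit("end", task_id)
        return results


def relative_overhead(t_plain, t_instrumented):
    """Instrumentation overhead as a fraction of the uninstrumented time."""
    return (t_instrumented - t_plain) / t_plain


def square(x):
    return x * x


tracer = Tracer()
rt = TinyRuntime(tracer)
ids = [rt.create_task(square, n) for n in range(3)]
results = rt.run_all()
```

After the run, `tracer.events` holds one `create` record per task followed by a `begin`/`end` pair for each executed task. Timing the same workload with `TinyRuntime(None)` and passing both times to `relative_overhead` gives a rough overhead estimate in the spirit of the evaluation the abstract mentions.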