This paper presents a set of proposals for extending the OpenMP shared-memory programming model with thread groups in the context of nested parallelism. It also describes the additional functionality required in the runtime library that supports parallel execution. The extensions have been implemented in the OpenMP NanosCompiler and evaluated on a set of real applications and benchmarks; experimental results are presented for one of these applications.