In this work we present a microbenchmark methodology for assessing the overheads associated with nested parallelism in OpenMP. Our techniques are based on extensions to the well-known EPCC microbenchmark suite that allow measuring the overheads of OpenMP constructs when they are executed in inner levels of parallelism. The methodology is simple yet powerful enough to have given us interesting insight into problems related to implementing and supporting nested parallelism. We measure and compare a number of commercial and freeware compilation systems. Our general conclusion is that while nested parallelism is fortunately supported by many current implementations, the performance of this support is rather problematic. There seem to be issues that have not yet been addressed effectively, as most OpenMP systems do not react gracefully when made to execute inner levels of concurrency.
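To illustrate the kind of measurement the abstract describes, the following is a minimal sketch in the spirit of EPCC-style microbenchmarks. It is not the authors' code. It times a reference loop in which each outer thread only runs a fixed delay, then times the same loop with the delay wrapped in an inner `parallel` region, and reports the difference per repetition as the overhead. The function names (`reference_time`, `nested_test_time`, `nested_parallel_overhead`), the delay routine, and the choice of the `parallel` construct as the one under test are all assumptions made for illustration.

```c
#include <stdio.h>
#include <time.h>
#ifdef _OPENMP
#include <omp.h>
#endif

/* Wall-clock timer; falls back to CPU time when compiled without OpenMP. */
static double now_seconds(void) {
#ifdef _OPENMP
    return omp_get_wtime();
#else
    return (double)clock() / CLOCKS_PER_SEC;
#endif
}

/* Hypothetical fixed workload, standing in for a benchmark delay routine. */
static void delay(int length) {
    volatile double a = 0.0;
    for (int i = 0; i < length; i++) a += i;
}

/* Reference: each outer thread runs the delay with no inner construct.
   Returns the average time per repetition, in seconds. */
double reference_time(int outer_threads, int reps, int delay_len) {
    double start = now_seconds();
    #pragma omp parallel num_threads(outer_threads)
    {
        for (int r = 0; r < reps; r++) delay(delay_len);
    }
    return (now_seconds() - start) / reps;
}

/* Test: each outer thread repeatedly opens an inner parallel region
   around the same delay. Returns the average time per repetition. */
double nested_test_time(int outer_threads, int inner_threads,
                        int reps, int delay_len) {
#ifdef _OPENMP
    omp_set_max_active_levels(2);  /* allow two active levels of parallelism */
#endif
    double start = now_seconds();
    #pragma omp parallel num_threads(outer_threads)
    {
        for (int r = 0; r < reps; r++) {
            #pragma omp parallel num_threads(inner_threads)
            {
                delay(delay_len);
            }
        }
    }
    return (now_seconds() - start) / reps;
}

/* Estimated overhead of the inner parallel construct per repetition.
   Timing noise can make this slightly negative for very short runs. */
double nested_parallel_overhead(int outer_threads, int inner_threads,
                                int reps, int delay_len) {
    return nested_test_time(outer_threads, inner_threads, reps, delay_len)
         - reference_time(outer_threads, reps, delay_len);
}
```

Compiled with OpenMP enabled (for example `-fopenmp` with GCC), a call such as `nested_parallel_overhead(4, 4, 1000, 500)` from `main` would give a rough per-repetition overhead for a 4x4 nested configuration. A real benchmark would repeat the measurement several times and report a mean and standard deviation.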