The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Task Parallelism in a High Performance Fortran Framework
IEEE Parallel & Distributed Technology: Systems & Technology
IPDPS '02 Proceedings of the 16th International Parallel and Distributed Processing Symposium
Task Parallelism and High-Performance Languages
The Data Parallel Programming Model: Foundations, HPF Realization, and Scientific Applications
A SOFTWARE ARCHITECTURE FOR MULTIDISCIPLINARY APPICATIONS: INTEGRATING TASK AND DATA PARALLELISM
A SOFTWARE ARCHITECTURE FOR MULTIDISCIPLINARY APPICATIONS: INTEGRATING TASK AND DATA PARALLELISM
A comparison of task pools for dynamic load balancing of irregular algorithms: Research Articles
Concurrency and Computation: Practice & Experience
Sequoia: programming the memory hierarchy
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
Sequoia: programming the memory hierarchy
Proceedings of the 2006 ACM/IEEE conference on Supercomputing
An adaptive cut-off for task parallelism
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
A Proposal for Task Parallelism in OpenMP
IWOMP '07 Proceedings of the 3rd international workshop on OpenMP: A Practical Programming Model for the Multi-Core Era
Scalability Evaluation of Barrier Algorithms for OpenMP
IWOMP '09 Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism
PFunc: modern task parallelism for modern high performance computing
Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
ICPP '09 Proceedings of the 2009 International Conference on Parallel Processing
Open Source Software Support for the OpenMP Runtime API for Profiling
ICPPW '09 Proceedings of the 2009 International Conference on Parallel Processing Workshops
Evaluation of OpenMP task scheduling strategies
IWOMP'08 Proceedings of the 4th international conference on OpenMP in a new era of parallelism
Performance instrumentation and compiler optimizations for MPI/OpenMP applications
IWOMP'05/IWOMP'06 Proceedings of the 2005 and 2006 international conference on OpenMP shared memory parallel programming
Toward enhancing OpenMP's work-sharing directives
Euro-Par'06 Proceedings of the 12th international conference on Parallel Processing
LIBKOMP, an efficient openMP runtime system for both fork-join and data flow paradigms
IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World
Assessing OpenMP tasking implementations on NUMA architectures
IWOMP'12 Proceedings of the 8th international conference on OpenMP in a Heterogeneous World
libEOMP: a portable OpenMP runtime library based on MCA APIs for embedded systems
Proceedings of the 2013 International Workshop on Programming Models and Applications for Multicores and Manycores
Experiences Developing the OpenUH Compiler and Runtime Infrastructure
International Journal of Parallel Programming
Hi-index | 0.00 |
Many task-based programming models have been developed and refined in recent years to support application development for shared memory platforms. Asynchronous tasks are a powerful programming abstraction that offer flexibility in conjunction with great expressivity. Research involving standardized tasking models like OpenMP and nonstandardized models like Cilk facilitate improvements in many tasking implementations. While the asynchronous task is arguably a fundamental element of parallel programming, it is the implementation, not the concept, that makes all the difference with respect to the performance that is obtained by a program that is parallelized using tasks. There are many approaches to implementing tasking constructs, but few have also given attention to providing the user with some capabilities for fine tuning the execution of their code. This paper provides an overview of one OpenMP implementation, highlights its main features, discusses the implementation, and demonstrates its performance with user controlled runtime variables.