ICA3PP'11 Proceedings of the 11th international conference on Algorithms and architectures for parallel processing - Volume Part II
The task construct is the most significant feature of the new OpenMP specification, providing a way to express unstructured parallelism. This paper presents a runtime library that implements the task model on the Cell heterogeneous multicore processor and is designed to make maximal use of the architecture's advantages. We also propose two optimizations: an original scheduling strategy and an adaptive cut-off technique. The former combines the breadth-first and work-first scheduling strategies; the latter adaptively chooses between two cut-off criteria, maximum number of tasks and maximum task recursion level, according to application characteristics. Performance evaluations indicate that our scheme achieves speedups of 3.4 to 7.2 over serial execution.
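To make the two cut-off criteria concrete, the following is a minimal sketch in Python, not the paper's runtime. It simulates a single breadth-first task pool for a recursive Fibonacci task tree and counts how many tasks each criterion creates. The names `run`, `max_tasks`, and `max_level` are hypothetical, and the heuristic the runtime uses to pick between the two criteria is not given in the abstract, so it is not modelled here.

```python
from collections import deque


def max_tasks(limit):
    """Cut-off criterion (hypothetical helper): stop creating tasks once
    `limit` tasks are already pending in the pool."""
    return lambda pending, depth: pending >= limit


def max_level(limit):
    """Cut-off criterion (hypothetical helper): stop creating tasks below
    recursion level `limit`."""
    return lambda pending, depth: depth > limit


def run(n, cutoff):
    """Simulate a Fibonacci task tree on one FIFO (breadth-first) pool.

    Returns (fib(n), number_of_tasks_created). When `cutoff` fires, the
    child is executed inline by its parent instead of becoming a task.
    """
    pool = deque([(n, 0)])  # the root task, at recursion level 0
    total = 0               # sum of the leaf values, which equals fib(n)
    created = 1             # the root counts as one task

    def execute(k, depth):
        nonlocal total, created
        if k < 2:
            total += k
            return
        for child in (k - 1, k - 2):
            if cutoff(len(pool), depth + 1):
                execute(child, depth + 1)        # cut off: run serially
            else:
                pool.append((child, depth + 1))  # defer as a new task
                created += 1

    while pool:
        execute(*pool.popleft())
    return total, created


if __name__ == "__main__":
    for name, criterion in (("max_tasks(8)", max_tasks(8)),
                            ("max_level(3)", max_level(3))):
        value, tasks = run(15, criterion)
        print(f"{name}: fib(15) = {value}, tasks created = {tasks}")
```

Both criteria return the same result and differ only in how much of the tree is turned into tasks, which is the trade-off an adaptive scheme would tune per application.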