Mul-T: a high-performance parallel Lisp
PLDI '89 Proceedings of the ACM SIGPLAN 1989 Conference on Programming language design and implementation
Lazy task creation: a technique for increasing the granularity of parallel programs
LFP '90 Proceedings of the 1990 ACM conference on LISP and functional programming
Using the run-time sizes of data structures to guide parallel-thread creation
LFP '94 Proceedings of the 1994 ACM conference on LISP and functional programming
The design, implementation and evaluation of Jade: a portable, implicitly parallel programming language
Lazy threads: implementing a fast parallel call
Journal of Parallel and Distributed Computing - Special issue on multithreading for multiprocessors
The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
StackThreads/MP: integrating futures into calling standards
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Automatic parallelization of divide and conquer algorithms
Proceedings of the seventh ACM SIGPLAN symposium on Principles and practice of parallel programming
Scheduling threads for low space requirement and good locality
Proceedings of the eleventh annual ACM symposium on Parallel algorithms and architectures
The data locality of work stealing
Proceedings of the twelfth annual ACM symposium on Parallel algorithms and architectures
Efficient Parallel Execution of Irregular Recursive Programs
IEEE Transactions on Parallel and Distributed Systems
Efficient Procedures for Using Matrix Algorithms
Proceedings of the 2nd Colloquium on Automata, Languages and Programming
Support for OpenMP tasks in Nanos v4
CASCON '07 Proceedings of the 2007 conference of the center for advanced studies on Collaborative research
Evaluation of OpenMP task scheduling strategies
IWOMP'08 Proceedings of the 4th international conference on OpenMP in a new era of parallelism
On the granularity of divide-and-conquer parallelism
FP'95 Proceedings of the 1995 international conference on Functional Programming
Evaluating OpenMP 3.0 Run Time Systems on Unbalanced Task Graphs
IWOMP '09 Proceedings of the 5th International Workshop on OpenMP: Evolving OpenMP in an Age of Extreme Parallelism
Lazy binary-splitting: a run-time adaptive work-stealing scheduler
Proceedings of the 15th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming
OpenMP tasking analysis for programmers
CASCON '09 Proceedings of the 2009 Conference of the Center for Advanced Studies on Collaborative Research
Exploiting fine-grained parallelism on cell processors
Euro-Par'10 Proceedings of the 16th international Euro-Par conference on Parallel processing: Part II
Scheduling task parallelism on multi-socket multicore systems
Proceedings of the 1st International Workshop on Runtime and Operating Systems for Supercomputers
A runtime implementation of OpenMP tasks
IWOMP'11 Proceedings of the 7th international conference on OpenMP in the Petascale era
Support for OpenMP tasks on cell architecture
ICA3PP'10 Proceedings of the 10th international conference on Algorithms and Architectures for Parallel Processing - Volume Part II
OpenMP task scheduling strategies for multicore NUMA systems
International Journal of High Performance Computing Applications
ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part I
Design and implementation of a customizable work stealing scheduler
Proceedings of the 3rd International Workshop on Runtime and Operating Systems for Supercomputers
Adaptive granularity control in task parallel programs using multiversioning
Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
Hi-index | 0.00 |
In task parallel languages, an important factor for achieving a good performance is the use of a cut-off technique to reduce the number of tasks created. Using a cut-off to avoid an excessive number of tasks helps the runtime system to reduce the total overhead associated with task creation, particularlt if the tasks are fine grain. Unfortunately, the best cut-off technique its usually dependent on the application structure or even the input data of the application. We propose a new cut-off technique that, using information from the application collected at runtime, decides which tasks should be pruned to improve the performance of the application. This technique does not rely on the programmer to determine the cut-off technique that is best suited for the application. We have implemented this cut-off in the context of the new OpenMP tasking model. Our evaluation, with a variety of applications, shows that our adaptive cut-off is able to make good decisions and most of the time matches the optimal cut-off that could be set by hand by a programmer.