Task parallelism is a programming technique that has been shown to be applicable in a wide variety of problem domains. A central parameter that must be controlled to ensure efficient execution of task-parallel programs is the granularity of tasks. When tasks are too coarse-grained, scalability and load balance suffer; when they are too fine-grained, per-task execution overheads dominate. We present a combined compiler and runtime approach that enables automatic granularity control. Starting from recursive, task-parallel programs, our compiler generates multiple versions of each task, increasing granularity by task unrolling and subsequent removal of superfluous synchronization primitives. A runtime system then selects among these task versions of varying granularity by tracking task demand. Benchmarking on a set of task-parallel programs using a work-stealing scheduler demonstrates that our approach is generally effective. For fine-grained tasks, we achieve reductions in execution time exceeding a factor of 6 compared to state-of-the-art implementations.