Lazy task creation: a technique for increasing the granularity of parallel programs
LFP '90 Proceedings of the 1990 ACM conference on LISP and functional programming
Cilk: an efficient multithreaded runtime system
PPOPP '95 Proceedings of the fifth ACM SIGPLAN symposium on Principles and practice of parallel programming
The implementation of the Cilk-5 multithreaded language
PLDI '98 Proceedings of the ACM SIGPLAN 1998 conference on Programming language design and implementation
Automatic Subspace Clustering of High Dimensional Data
Data Mining and Knowledge Discovery
Scheduling multithreaded computations by work stealing
SFCS '94 Proceedings of the 35th Annual Symposium on Foundations of Computer Science
An adaptive cut-off for task parallelism
Proceedings of the 2008 ACM/IEEE conference on Supercomputing
An Experimental Evaluation of the New OpenMP Tasking Model
Languages and Compilers for Parallel Computing
OpenMP tasks in IBM XL compilers
CASCON '08 Proceedings of the 2008 conference of the center for advanced studies on collaborative research: meeting of minds
UTS: an unbalanced tree search benchmark
LCPC'06 Proceedings of the 19th international conference on Languages and compilers for parallel computing
Evaluation of OpenMP task scheduling strategies
IWOMP'08 Proceedings of the 4th international conference on OpenMP in a new era of parallelism
Proceedings of the 8th ACM International Conference on Computing Frontiers
ICCSA'12 Proceedings of the 12th international conference on Computational Science and Its Applications - Volume Part I
Hi-index | 0.00 |
The UTS benchmark is used to evaluate task parallelism in OpenMP 3.0 as implemented in a number of recently released compilers and run-time systems. UTS performs parallel search of an irregular and unpredictable search space, as arises e.g. in combinatorial optimization problems. As such UTS presents a highly unbalanced task graph that challenges scheduling, load balancing, termination detection, and task coarsening strategies. Scalability and overheads are compared for OpenMP 3.0, Cilk, and an OpenMP implementation of the benchmark without tasks that performs all scheduling, load balancing, and termination detection explicitly. Current OpenMP 3.0 implementations generally exhibit poor behavior on the UTS benchmark.