Dynamic task parallelism has been identified as a prerequisite for improving productivity and performance on future many-core processors. In dynamic task parallelism, computations are created dynamically and the runtime scheduler is responsible for scheduling the computations across processor cores. The set of task graphs that a dynamic scheduler can support depends on the underlying task primitives in the parallel programming model, with various classes of fork-join structures used most often in practice. However, many researchers have advocated the benefits of more general task graph structures, and have shown that the use of these structures can lead to improved performance. In this paper, we propose an extension to task parallelism called Data-Driven Tasks (DDTs) that can be used to create arbitrary task graph structures. Unlike a normal task, which starts execution upon creation, a DDT specifies its input constraints in an await clause containing a list of Data-Driven Futures (DDFs). A DDF can be viewed as a container with a full/empty state that obeys a dynamic single-assignment rule. The runtime scheduler then ensures that a task is scheduled only when all the DDFs in its await clause become available (full). There is no constraint on which task performs a put() operation on a DDF. We describe five scheduling algorithms (Coarse-Grain Blocking, Fine-Grain Blocking, Delayed Async, Rollback & Replay, Data-Driven) that can be used to implement DDTs, and include performance evaluations of these five algorithms on a variety of benchmark programs and multi-core platforms. Our results show that the Data-Driven scheduler is the best approach for implementing DDTs, in terms of both memory efficiency and scalable parallelism.
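To make the DDF/DDT semantics concrete, the following is a minimal Java sketch of the two ideas in the abstract: a single-assignment full/empty container (put() fails if the container is already full), and data-driven scheduling, where a task with an await clause is not handed to a worker thread until every DDF it awaits is full. The class names `DDF` and `DataDrivenRuntime`, the `asyncAwait` signature, and the use of `java.util.concurrent.CompletableFuture` are illustrative assumptions for this sketch; they are not the Habanero-Java API.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Sketch of a Data-Driven Future: a container with a full/empty state
// that obeys a dynamic single-assignment rule. (Hypothetical class,
// not the Habanero-Java DDF implementation.)
class DDF<T> {
    private final CompletableFuture<T> cell = new CompletableFuture<>();

    // Single assignment: the first put() fills the container;
    // any subsequent put() is an error.
    public void put(T value) {
        if (!cell.complete(value)) {
            throw new IllegalStateException("DDF already full");
        }
    }

    // Intended to be called only once the DDF is full, e.g. from a
    // task whose await clause lists this DDF.
    public T get() {
        return cell.join();
    }

    CompletableFuture<T> future() {
        return cell;
    }
}

// Sketch of the Data-Driven scheduling strategy: the task body is
// registered as a continuation of its input DDFs, so it never blocks
// a worker thread while waiting.
class DataDrivenRuntime {
    private final ExecutorService pool = Executors.newFixedThreadPool(4);

    // Emulates "async await(ddf1, ddf2, ...) { body }": schedule body
    // only after every awaited DDF has become full.
    public void asyncAwait(Runnable body, DDF<?>... awaits) {
        CompletableFuture<?>[] deps = new CompletableFuture[awaits.length];
        for (int i = 0; i < awaits.length; i++) {
            deps[i] = awaits[i].future();
        }
        CompletableFuture.allOf(deps).thenRunAsync(body, pool);
    }

    public void shutdown() {
        pool.shutdown();
    }
}
```

Because any task may perform the put() on a DDF, and a DDT may await any set of DDFs, this mechanism can express arbitrary task graphs (e.g. a diamond dependence) rather than only fork-join structures.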