Compiler algorithms for synchronization
IEEE Transactions on Computers
This paper studies the optimization problem of enforcing a dependence graph with the minimum number of synchronization operations. For a dependence graph with N vertices, it is shown that binary semaphores may require O(N^2) operations, compared to O(N) operations for counting semaphores. Though the optimization problem of using the minimum number of counting semaphore operations is shown to be NP-complete, we present an approximation algorithm that is observed to be very close to optimal (within 0.5%) on small, randomly generated dependence graphs. A surprising property of the problem is that the inclusion (rather than removal) of transitive edges can actually help reduce the number of synchronization operations. We characterize a class of dependence graphs for which the approximation algorithm is optimal. This class includes forests of fan-in trees, fan-out trees and series-parallel graphs. The numbers of synchronization operations needed for binary and counting semaphores are compared for randomly generated dependence graphs, using an implementation of the approximation algorithm in LISP/VM. The experimental results show that the use of counting semaphores significantly reduces the total number of synchronization operations, compared to binary semaphores.
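To make "enforcing a dependence graph with synchronization operations" concrete, the following is a minimal sketch. It is not the paper's approximation algorithm, and it does not achieve the O(N) bound. It assumes a naive baseline scheme: one counting semaphore per vertex, one release (V) per outgoing edge, and one acquire (P) per incoming edge, so it uses 2|E| operations in total. The function name `run_dag` and the diamond-shaped example graph are hypothetical and chosen only for illustration.

```python
import threading


def run_dag(num_vertices, edges):
    """Run one thread per vertex of a DAG and enforce every edge (u, v).

    Naive baseline (an assumption, not the paper's algorithm):
    one counting semaphore per vertex, one acquire per incoming edge,
    one release per outgoing edge. Returns (completion order, op count).
    """
    preds = {v: [] for v in range(num_vertices)}
    succs = {v: [] for v in range(num_vertices)}
    for u, v in edges:
        preds[v].append(u)
        succs[u].append(v)

    # One counting semaphore per vertex, initially 0.
    sems = [threading.Semaphore(0) for _ in range(num_vertices)]
    order = []               # order in which vertices "execute"
    ops = [0]                # total number of P and V operations
    lock = threading.Lock()  # protects the bookkeeping only

    def task(v):
        # P: wait once for each incoming dependence edge.
        for _ in preds[v]:
            sems[v].acquire()
            with lock:
                ops[0] += 1
        # The "work" of vertex v: record that it ran.
        with lock:
            order.append(v)
        # V: signal each successor once.
        for w in succs[v]:
            sems[w].release()
            with lock:
                ops[0] += 1

    threads = [threading.Thread(target=task, args=(v,))
               for v in range(num_vertices)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order, ops[0]


# Hypothetical diamond graph: 0 -> 1, 0 -> 2, 1 -> 3, 2 -> 3.
edges = [(0, 1), (0, 2), (1, 3), (2, 3)]
order, ops = run_dag(4, edges)
print(order, ops)  # the order may vary between runs; ops is 2 * len(edges)
```

Because this baseline spends two operations on every edge, a dense graph with on the order of N^2 edges costs on the order of N^2 operations. That per-edge cost is what the minimization problem studied in the paper aims to reduce.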