Modern hardware and software systems promote a view of parallel systems in which interprocessor communications are uniform and rather expensive. Such systems demand efficient clustering algorithms that aggregate atomic tasks in a way that diminishes the impact of the high communication costs. We develop here a linear-time algorithm that optimally clusters computations that comprise a sequence of disjoint complete up- and/or down-sweeps on a complete binary tree for such parallel environments. Such computations include, for instance, those that implement broadcast, accumulation, and the parallel-prefix operator; such environments include, for instance, networks of workstations or BSP-based programming systems. The schedules produced by our clustering are optimal in the sense of having the exact minimum makespan (not just an approximation thereof), accounting for both computation and communication time. We show by simulation that the makespans of the schedules produced by our algorithm are close to half of those produced by the algorithm that yielded the best schedules previously known.
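For readers unfamiliar with the sweep pattern, the following is a minimal sketch of how the parallel-prefix operator decomposes into one up-sweep followed by one down-sweep on a complete binary tree. It is the standard work-efficient exclusive scan, simulated sequentially; it is not the clustering algorithm described above and it models neither communication cost nor makespan. The function name `prefix_sums_by_sweeps` and the array-based tree layout are assumptions made for illustration.

```python
def prefix_sums_by_sweeps(values):
    """Exclusive prefix sums via an up-sweep and a down-sweep.

    The list is treated as the leaves of an implicit complete binary
    tree, so its length must be a power of two (an assumption of this
    sketch). Returns (exclusive_prefix_sums, total).
    """
    n = len(values)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a positive power of two")
    tree = list(values)

    # Up-sweep (leaves toward root): each internal node accumulates
    # the sum of its two subtrees.
    stride = 1
    while stride < n:
        for i in range(2 * stride - 1, n, 2 * stride):
            tree[i] += tree[i - stride]
        stride *= 2

    total = tree[n - 1]
    tree[n - 1] = 0  # the root starts the down-sweep with the identity

    # Down-sweep (root toward leaves): each node passes its value to
    # its left child and (its value + old left sum) to its right child.
    stride = n // 2
    while stride >= 1:
        for i in range(2 * stride - 1, n, 2 * stride):
            left_sum = tree[i - stride]
            tree[i - stride] = tree[i]
            tree[i] += left_sum
        stride //= 2

    return tree, total
```

Within each sweep level, the inner-loop iterations are independent of one another; these are the atomic tasks that a clustering algorithm would group onto processors.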