This paper addresses the problem of scheduling parallel programs represented as directed acyclic task graphs for execution on distributed memory parallel architectures. Because of the high communication overhead in existing parallel machines, a crucial step in scheduling is task clustering, the process of coalescing fine grain tasks into single coarser ones so that the overall execution time is minimized. The task clustering problem is NP-hard, even when the number of processors is unbounded and task duplication is allowed. A simple greedy algorithm is presented for this problem which, for a task graph with arbitrary granularity, produces a schedule whose makespan is at most twice optimal. Indeed, the quality of the schedule improves as the granularity of the task graph becomes larger. For example, if the granularity is at least 1/2, the makespan of the schedule is at most 5/3 times optimal. For a task graph with $n$ tasks and $e$ inter-task communication constraints, the algorithm runs in $O(n(n \lg n + e))$ time, which is $n$ times faster than the currently best known algorithm for this problem. Similar algorithms are developed that produce: (1) optimal schedules for coarse grain graphs; (2) 2-optimal schedules for trees with no task duplication; and (3) optimal schedules for coarse grain trees with no task duplication.
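To make the clustering trade-off concrete, the following minimal sketch evaluates the makespan of a given clustering on a small task graph. It is not the greedy algorithm of the paper, and it does not use task duplication. It assumes one processor per cluster (unbounded processors), zero communication cost between tasks in the same cluster, and sequential execution within a cluster in a topological order. The function name `makespan`, the example graph, and all cost values are hypothetical and chosen only for illustration.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def makespan(compute, comm, cluster):
    """Makespan of a clustering under the assumptions stated above.

    compute: dict task -> computation time
    comm:    dict (u, v) -> communication time for edge u -> v
    cluster: dict task -> cluster id (each cluster runs on its own processor)
    """
    # Build the predecessor lists of the DAG.
    preds = {t: [] for t in compute}
    for (u, v) in comm:
        preds[v].append(u)

    finish = {}      # finish time of each task
    proc_free = {}   # time at which each cluster's processor becomes free

    # Visit tasks in a topological order so predecessors are finished first.
    for t in TopologicalSorter(preds).static_order():
        ready = proc_free.get(cluster[t], 0)
        for u in preds[t]:
            # Communication is free inside a cluster (assumption).
            delay = 0 if cluster[u] == cluster[t] else comm[(u, t)]
            ready = max(ready, finish[u] + delay)
        finish[t] = ready + compute[t]
        proc_free[cluster[t]] = finish[t]

    return max(finish.values())


# Hypothetical fork-join graph: a -> b, a -> c, b -> d, c -> d.
compute = {"a": 1, "b": 1, "c": 1, "d": 1}
comm = {("a", "b"): 3, ("a", "c"): 3, ("b", "d"): 3, ("c", "d"): 3}

separate = {"a": 0, "b": 1, "c": 2, "d": 3}   # one task per cluster
single = {"a": 0, "b": 0, "c": 0, "d": 0}     # everything in one cluster
mixed = {"a": 0, "b": 0, "c": 1, "d": 0}      # only c on its own processor

print(makespan(compute, comm, separate))  # 9
print(makespan(compute, comm, single))    # 4
print(makespan(compute, comm, mixed))     # 9
```

In this example each communication costs three times as much as a task, so the graph is fine grain, and coalescing all four tasks into one cluster gives the shortest schedule even though it gives up all parallelism.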