We investigate particularly simple algorithms for optimizing the trade-off between load imbalance and assignment overhead in dynamic multiprocessor scheduling, when only vague information about a task's processing time is available before the task completes. We describe a simple and elegant generic algorithm that, in a very general model, always comes surprisingly close to the theoretical optimum, and whose performance we can analyze exactly with respect to constant factors. In contrast, we prove that algorithms that assign tasks in equal-sized portions are, in general, far from optimal. In fact, we give evidence that the performance of our generic scheme cannot be improved by any constant factor without sacrificing the simplicity of the algorithm. We also give lower bounds on the performance of the various decreasing-size heuristics that have typically been used so far in concrete applications.
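To make the trade-off concrete, the following minimal sketch contrasts equal-sized chunks with one well-known decreasing-size heuristic, guided self-scheduling, in which each request receives ceil(remaining / p) tasks. It is not the generic algorithm of this paper, which the abstract does not specify. The function names `fixed_chunks` and `guided_chunks` and the parameters `n_tasks`, `chunk_size`, and `n_procs` are hypothetical, chosen for illustration only.

```python
import math


def fixed_chunks(n_tasks, chunk_size):
    """Equal-sized chunks; only the last chunk may be smaller."""
    sizes = []
    remaining = n_tasks
    while remaining > 0:
        size = min(chunk_size, remaining)
        sizes.append(size)
        remaining -= size
    return sizes


def guided_chunks(n_tasks, n_procs):
    """Decreasing chunks: each request gets ceil(remaining / n_procs) tasks.

    This is the guided self-scheduling rule, used here purely as an
    illustration of a decreasing-size heuristic (an assumption of this
    sketch, not the paper's scheme).
    """
    sizes = []
    remaining = n_tasks
    while remaining > 0:
        size = math.ceil(remaining / n_procs)
        sizes.append(size)
        remaining -= size
    return sizes


if __name__ == "__main__":
    fixed = fixed_chunks(100, 25)
    guided = guided_chunks(100, 4)
    # The number of chunks is a proxy for assignment overhead; the size of
    # the final chunks bounds how uneven the finishing times can become.
    print("fixed :", len(fixed), "chunks, last =", fixed[-1])
    print("guided:", len(guided), "chunks, last =", guided[-1])
```

In this sketch, the equal-sized split issues few assignments but leaves large chunks for the end, where a slow chunk can cause a large imbalance. The decreasing-size split issues more assignments but finishes with small chunks that even out the load.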