The objective of this research is to propose a low-complexity static scheduling and allocation algorithm for message-passing architectures that accounts for communication delays, link contention, message routing, and network topology. In contrast to the conventional list-scheduling approach, our technique first serializes the task graph and "injects" all tasks into one processor. The parallel tasks are then "bubbled up" to other processors and inserted at appropriate time slots. The edges among the tasks are also scheduled by treating the communication links between processors as resources. The proposed approach takes into account link contention and the underlying communication routing strategy, and can self-adjust on regular as well as arbitrary network topologies. To reduce its complexity, the scheduling algorithm is itself parallelized. To our knowledge, this is the first attempt at designing a parallel algorithm for scheduling. Implemented on an iPSC/860 hypercube, the proposed approach yields a high speedup in its own execution and, in comparisons with two other approaches, performs considerably better under a wide range of parameters, including task graph size, communication-to-computation ratio, and target system topology.
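The serialize, inject, and bubble-up idea can be illustrated with a heavily simplified sketch. It is not the authors' algorithm: it assumes a fully connected network with a fixed per-edge communication cost, ignores link contention, message routing, and time-slot insertion, and runs sequentially. The names `serialize` and `bubble_schedule` and the input layout are hypothetical and chosen only for this illustration.

```python
from collections import deque


def serialize(tasks, edges):
    """Return one topological order of the task graph (Kahn's algorithm)."""
    indegree = {t: 0 for t in tasks}
    successors = {t: [] for t in tasks}
    for (u, v) in edges:
        indegree[v] += 1
        successors[u].append(v)
    ready = deque(t for t in tasks if indegree[t] == 0)
    order = []
    while ready:
        t = ready.popleft()
        order.append(t)
        for v in successors[t]:
            indegree[v] -= 1
            if indegree[v] == 0:
                ready.append(v)
    return order


def bubble_schedule(tasks, edges, num_procs):
    """Simplified illustration of serialize / inject / bubble up.

    tasks: {name: computation_cost}
    edges: {(pred, succ): communication_cost}, paid only across processors
    Returns {name: (processor, start, finish)}.
    """
    order = serialize(tasks, edges)
    preds = {t: [] for t in tasks}
    for (u, v), cost in edges.items():
        preds[v].append((u, cost))

    # Inject: every task initially sits on processor 0 in serial order.
    assignment = {t: 0 for t in tasks}
    proc_free = [0] * num_procs  # time at which each processor becomes idle
    schedule = {}

    def start_on(task, proc):
        # Earliest start: all inputs have arrived and the processor is idle.
        data_ready = 0
        for u, cost in preds[task]:
            u_proc, _, u_finish = schedule[u]
            data_ready = max(data_ready, u_finish + (0 if u_proc == proc else cost))
        return max(data_ready, proc_free[proc])

    # Bubble up: visit tasks in serial order and migrate a task only if
    # another processor lets it start strictly earlier.
    for t in order:
        best_proc = assignment[t]
        best_start = start_on(t, best_proc)
        for p in range(num_procs):
            s = start_on(t, p)
            if s < best_start:
                best_proc, best_start = p, s
        assignment[t] = best_proc
        finish = best_start + tasks[t]
        schedule[t] = (best_proc, best_start, finish)
        proc_free[best_proc] = finish
    return schedule


if __name__ == "__main__":
    # Hypothetical diamond-shaped task graph: a -> {b, c} -> d.
    tasks = {"a": 2, "b": 3, "c": 3, "d": 1}
    edges = {("a", "b"): 1, ("a", "c"): 1, ("b", "d"): 1, ("c", "d"): 1}
    for name, slot in bubble_schedule(tasks, edges, 2).items():
        print(name, slot)
```

On this small example, tasks `c` and `d` migrate to the second processor, giving a makespan of 7 against 9 for the fully serial schedule. A version closer to the abstract would also reserve time slots on each communication link along the routed path, so that messages contend for links as tasks contend for processors.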