A Robust Compile Time Method for SchedulingTask Parallelism on Distributed Memory Machines

  • Authors:
  • Sekhar Darbha;Santosh Pande

  • Affiliations:
  • ECE Department, Rutgers University, Piscataway, NJ 08855-0909, Email: darbha@ece.rutgers.edu;ECECS Department, University of Cincinnati, Cincinnati, OH 45221-0030, Email: santosh.pande@uc.edu

  • Venue:
  • The Journal of Supercomputing
  • Year:
  • 1998

Quantified Score

Hi-index 0.00

Visualization

Abstract

The problem of compile time scheduling of tasks of a program represented as a directed acyclic graph (DAG) is NP-hard in its general form. A number of approaches have been proposed which attempt to solve the problem either sub-optimally for general cases or optimally for restrictive special cases. But all the compile timeapproaches suffer due to the inability to accurately model the computation and communication costs of the target architecture. A desirable property of a compile time scheduling algorithm is robustness against the variations in the computation and communication costs so that the run time performance is close to the compile time estimates; this aspect of scheduling has been left open in the literature.This paper first introduces a compile time scheduling algorithm for a variable number of available processors and then examines the impact of change of computation and communication costs on the generated schedule. The cost variations for all the nodes and all the edges are assumed to be uniform (in other words, all the node costs change by the same ratio and the edge costs change by the same ratio). This sort of variations could occur due to the inaccuracies in estimating the instruction execution times or the message passing delays. The ratio of theschedule length obtained by the new schedule based on the modified costs to the schedule length obtained by using the modified costs on the original schedule (obtained by initial costs) is used as a measure of the robustness of thealgorithm. The essential conditions for robustness of the proposed algorithm are discussed and are demonstrated through an experimental study.