Batch-Scheduling Dags for Internet-Based Computing

  • Authors:
  • Grzegorz Malewicz; Arnold L. Rosenberg

  • Affiliations:
  • Dept. of Computer Science, Univ. of Alabama, Tuscaloosa, AL; Dept. of Computer Science, Univ. of Massachusetts, Amherst, MA

  • Venue:
  • Euro-Par'05: Proceedings of the 11th International Euro-Par Conference on Parallel Processing
  • Year:
  • 2005

Abstract

Scheduling computations for Internet-based computing presents challenges not encountered with more traditional computing platforms. The looser coupling among participating computers makes it harder to utilize remote clients well, and raises the specter of a kind of “gridlock” that ensues when a computation stalls because no new tasks are eligible for execution. This paper studies the problem of scheduling computation-dags in a manner that renders tasks eligible for execution at the maximum possible rate. Earlier work developed a framework for such scheduling in which a new task is allocated to a remote client as soon as the client returns the results of an earlier task. That work also proved that many dags cannot be scheduled optimally within this paradigm, which signaled the need for a companion theory that addresses the scheduling problem for all computation-dags. This work develops a new, batched scheduling paradigm for Internet-based computing. Although optimal batched schedules always exist, computing such a schedule is NP-hard, even for bipartite dags. In response, a polynomial-time algorithm is developed for producing optimal batched schedules for a rich family of dags obtained by “composing” tree-structured building-block dags. Finally, a fast heuristic schedule is developed for “expansive” dags.
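
To make the notion of eligibility concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a dag is given as a mapping from each task to the set of its parents, that a task becomes eligible once all of its parents have been executed, and that the quality of one batch is measured by how many tasks are eligible after the batch is executed. The names `eligible` and `best_batch`, the fixed batch size, and the example dag are all hypothetical. The search is exhaustive, so it is practical only for very small dags.

```python
from itertools import combinations


def eligible(dag, executed):
    """Return the unexecuted tasks whose parents have all been executed.

    `dag` maps each task to the set of its parents (an assumed encoding).
    """
    return {
        task
        for task, parents in dag.items()
        if task not in executed and parents <= executed
    }


def best_batch(dag, executed, batch_size):
    """Pick, by brute force, one batch of currently eligible tasks.

    The batch chosen is the one that leaves the most tasks eligible
    after it is executed (an assumed, single-step quality measure).
    Ties are broken in favor of the first batch in sorted order.
    """
    ready = sorted(eligible(dag, executed))
    size = min(batch_size, len(ready))
    best, best_count = (), -1
    for batch in combinations(ready, size):
        count = len(eligible(dag, executed | set(batch)))
        if count > best_count:
            best, best_count = batch, count
    return set(best), best_count


# Hypothetical bipartite dag: sources a, b, c and sinks x, y, z.
example_dag = {
    "a": set(),
    "b": set(),
    "c": set(),
    "x": {"a", "b"},
    "y": {"b", "c"},
    "z": {"c"},
}

batch, count = best_batch(example_dag, executed=set(), batch_size=2)
print(batch, count)
```

In this example, executing `b` and `c` together leaves three tasks eligible (`a`, `y`, `z`), whereas either other pair of sources leaves only two. The sketch optimizes a single step in isolation; a full batched schedule would have to account for all steps, which is where the hardness result mentioned in the abstract applies.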