Effective distributed scheduling of parallel workloads

  • Authors:
  • Andrea C. Dusseau;Remzi H. Arpaci;David E. Culler

  • Affiliations:
  • Computer Science Division, University of California, Berkeley;Computer Science Division, University of California, Berkeley;Computer Science Division, University of California, Berkeley

  • Venue:
  • Proceedings of the 1996 ACM SIGMETRICS international conference on Measurement and modeling of computer systems
  • Year:
  • 1996

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present a distributed algorithm for time-sharing parallel workloads that is competitive with coscheduling. Implicit scheduling allows each local scheduler in the system to make independent decisions that dynamically coordinate the scheduling of cooperating processes across processors. Of particular importance is the blocking algorithm which decides the action of a process waiting for a communication or synchronization event to complete. Through simulation of bulk-synchronous parallel applications, we find that a simple two-phase fixed-spin blocking algorithm performs well; a two-phase adaptive algorithm that gathers run-time data on barrier wait-times performs slightly better. Our results hold for a range of machine parameters and parallel program characteristics. These findings are in direct contrast to the literature that states explicit coscheduling is necessary for fine-grained programs. We show that the choice of the local scheduler is crucial, with a priority-based scheduler performing two to three times better than a round-robin scheduler. Overall, we find that the performance of implicit scheduling is near that of coscheduling (+/- 35%), without the requirement of explicit, global coordination.