Attacking the bottlenecks of backfilling schedulers

  • Authors: Peter J. Keleher, Dmitry Zotkin, Dejan Perkovic

  • Venue: Cluster Computing
  • Year: 2000

Abstract

Backfilling is a simple and effective way of improving the utilization of space-sharing schedulers. Simple first-come-first-served approaches are ineffective because large jobs can fragment the available resources. Backfilling schedulers address this problem by allowing jobs to move ahead in the queue, provided that they will not delay subsequent jobs. Previous research has shown that inaccurate estimates of execution times can lead to better backfilling schedules. In the first part of this study, we characterize this effect on several workloads and show that average slowdowns can be effectively reduced by systematically lengthening estimated execution times. Further, we show that the average job slowdown metric can be addressed directly by sorting jobs by increasing execution time. Finally, we modify our sorting scheduler to ensure that incoming jobs can be given hard guarantees. The resulting scheduler is guaranteed to avoid starvation and performs significantly better than previous backfilling schedulers.

In the second part of this study, we show how queue randomization, and even more so a combination of queue randomization and sorting by job length, can improve performance. In simulations using actual estimates of job running times, these improvements exceed those obtained with queue sorting by job length alone. We investigate the real characteristics of these estimates and show the wide range of overestimation. To exploit randomization and queue sorting further, we eliminate guarantees from the backfilling algorithm and show significant improvements. Finally, we show that these guarantees are of limited usefulness, and that the queue sorting criteria can be modified to prevent starvation in the modified backfilling algorithm.
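
To make the mechanism concrete, the following is a minimal Python sketch of reservation-based (conservative) backfilling, followed by the same queue sorted by increasing estimated execution time. It is an illustration only, not the schedulers evaluated in the paper. All names (`Job`, `earliest_start`, `conservative_backfill`, `average_slowdown`) and the three-job workload are hypothetical, and the sketch assumes that every job arrives at time 0 and runs for exactly its estimate.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int        # processors requested
    estimate: float   # estimated execution time


def used_at(reservations, t):
    """Nodes held at time t by reservations stored as (start, end, nodes)."""
    return sum(n for (s, e, n) in reservations if s <= t < e)


def earliest_start(reservations, job, total_nodes):
    """Earliest time at which the job fits for its whole estimated duration."""
    # Free capacity only grows when a reservation ends, so it is enough
    # to try time 0 and every reservation end time.
    candidates = sorted({0.0} | {e for (_, e, _) in reservations})
    for t in candidates:
        end = t + job.estimate
        # Usage only rises at reservation starts inside the window.
        checkpoints = [t] + [s for (s, _, _) in reservations if t < s < end]
        if all(used_at(reservations, c) + job.nodes <= total_nodes
               for c in checkpoints):
            return t
    raise ValueError("job requests more nodes than the machine has")


def conservative_backfill(queue, total_nodes):
    """Reserve a slot for each job in queue order.

    A later job may start before an earlier one, but only in a hole that
    leaves every existing reservation untouched.
    """
    reservations, starts = [], {}
    for job in queue:
        t = earliest_start(reservations, job, total_nodes)
        reservations.append((t, t + job.estimate, job.nodes))
        starts[job.name] = t
    return starts


def average_slowdown(queue, starts):
    """Mean of (wait + run) / run, assuming arrival at 0 and exact estimates."""
    return sum((starts[j.name] + j.estimate) / j.estimate
               for j in queue) / len(queue)


queue = [Job("A", 3, 10), Job("B", 4, 5), Job("C", 1, 8)]

fcfs_order = conservative_backfill(queue, total_nodes=4)
# C backfills at time 0 next to A; B keeps its reserved start at time 10.
print(fcfs_order)  # {'A': 0.0, 'B': 10.0, 'C': 0.0}

by_length = sorted(queue, key=lambda j: j.estimate)
sorted_order = conservative_backfill(by_length, total_nodes=4)
print(average_slowdown(queue, fcfs_order),
      average_slowdown(by_length, sorted_order))
```

On this toy queue, sorting by estimated length gives a lower average slowdown than arrival order. A single hand-built example says nothing about real workloads; that comparison is what the study's simulations address.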