Backfilling with lookahead to optimize the packing of parallel jobs

  • Authors:
  • Edi Shmueli; Dror G. Feitelson

  • Affiliations:
  • Department of Computer Science, Haifa University, Haifa, Israel and IBM Haifa Research Laboratory, Israel; School of Computer Science and Engineering, Hebrew University, Jerusalem, Israel

  • Venue:
  • Journal of Parallel and Distributed Computing
  • Year:
  • 2005

Abstract

The utilization of parallel computers depends on how jobs are packed together: if the jobs are not packed tightly, resources are lost due to fragmentation. The problem is that the goal of high utilization may conflict with goals of fairness or even progress for all jobs. The common solution is to use backfilling, which combines a reservation for the first job in the interest of progress with packing of later jobs to fill in holes and increase utilization. However, backfilling considers the queued jobs one at a time, and thus might miss better packing opportunities. We propose the use of dynamic programming to find the best packing possible given the current composition of the queue, thus maximizing the utilization on every scheduling step. Simulations of this algorithm, called lookahead optimizing scheduler (LOS), using trace files from several IBM SP parallel systems, show that LOS indeed improves utilization, and thereby reduces the mean response time and mean slowdown of all jobs. Moreover, it is actually possible to limit the lookahead depth to about 50 jobs and still achieve essentially the same results. Finally, we experimented with selecting among alternative sets of jobs that achieve the same utilization. Surprising results indicate that choosing the set at the head of the queue does not necessarily guarantee best performance. Instead, repeatedly selecting the set with the maximal overall expected slowdown boosts performance when compared to all other alternatives checked.
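
The contrast between one-at-a-time backfilling and a dynamic-programming packing can be illustrated with a small sketch. This is not the LOS algorithm from the paper: it is a simplified subset-sum formulation that only maximizes the number of processors in use at the current scheduling step, and it ignores reservations, job runtimes, and the slowdown-based selection among equally good sets. The function names `greedy_packing` and `best_packing` and the `lookahead` parameter are hypothetical, introduced here for illustration only.

```python
def greedy_packing(job_sizes, free_procs):
    """Consider queued jobs one at a time, starting each job that still fits.

    Simplified stand-in for one-at-a-time backfilling (assumption: no
    reservations or runtimes are modeled). Returns (chosen indices, processors used).
    """
    chosen, used = [], 0
    for i, size in enumerate(job_sizes):
        if used + size <= free_procs:
            chosen.append(i)
            used += size
    return chosen, used


def best_packing(job_sizes, free_procs, lookahead=None):
    """Pick the set of queued jobs that uses the most processors.

    job_sizes  -- positive processor counts of the queued jobs, in queue order
    free_procs -- number of currently idle processors
    lookahead  -- hypothetical cap on how many jobs from the head of the
                  queue are examined (None means the whole queue)
    Returns (chosen indices, processors used).
    """
    sizes = job_sizes if lookahead is None else job_sizes[:lookahead]
    n = len(sizes)

    # reachable[i][c] is True if some subset of the first i jobs
    # uses exactly c processors.
    reachable = [[False] * (free_procs + 1) for _ in range(n + 1)]
    reachable[0][0] = True
    for i, size in enumerate(sizes, start=1):
        for c in range(free_procs + 1):
            reachable[i][c] = reachable[i - 1][c] or (
                size <= c and reachable[i - 1][c - size]
            )

    # The best achievable utilization is the largest reachable processor count.
    best = max(c for c in range(free_procs + 1) if reachable[n][c])

    # Walk the table backwards to recover one set that achieves `best`.
    # A job is included only when the remaining count cannot be reached
    # without it, so jobs later in the queue are skipped whenever possible.
    chosen, c = [], best
    for i in range(n, 0, -1):
        if not reachable[i - 1][c]:
            chosen.append(i - 1)
            c -= sizes[i - 1]
    chosen.reverse()
    return chosen, best


if __name__ == "__main__":
    queue = [7, 4, 6]   # processors requested by each queued job
    idle = 10           # idle processors at this scheduling step
    print(greedy_packing(queue, idle))            # ([0], 7)
    print(best_packing(queue, idle))              # ([1, 2], 10)
    print(best_packing(queue, idle, lookahead=2)) # ([0], 7)
```

With 10 idle processors and queued jobs of sizes 7, 4, and 6, the one-at-a-time pass starts the 7-processor job and leaves 3 processors idle, while the dynamic-programming pass finds the 4 + 6 combination that fills the machine. Restricting the search to the first two jobs shows how a lookahead cap trades packing quality for a smaller table; the table has (number of jobs + 1) × (idle processors + 1) entries, so the cap bounds the cost of each scheduling step.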