Scheduling HPC workflows for responsiveness and fairness with networking delays and inaccurate estimates of execution times

  • Authors:
  • Andrew Burkimsher; Iain Bate; Leandro Soares Indrusiak

  • Affiliations:
  • Department of Computer Science, University of York, York, UK; Department of Computer Science, University of York, York, UK; Department of Computer Science, University of York, York, UK

  • Venue:
  • Euro-Par'13 Proceedings of the 19th international conference on Parallel Processing
  • Year:
  • 2013

Abstract

High-Performance Computing systems (HPCs) have grown in popularity in recent years, especially in the form of Grid and Cloud platforms. These platforms may be subject to periods of overload. In our previous research, we found that the Projected-SLR (P-SLR) list scheduling policy provides responsiveness and a starvation-free scheduling guarantee in a realistic HPC scenario. This paper extends the previous work to consider networking delays in the platform model and inaccurate estimates of execution times in the application model. P-SLR is shown to be competitive with the best alternative scheduling policies in the presence of network costs (up to 400% of computation time) and where execution time estimate inaccuracies are within generous error bounds (