Adaptive Metric-Aware Job Scheduling for Production Supercomputers

  • Authors:
  • Wei Tang; Dongxu Ren; Zhiling Lan; Narayan Desai

  • Venue:
  • ICPPW '12 Proceedings of the 2012 41st International Conference on Parallel Processing Workshops
  • Year:
  • 2012

Abstract

Job scheduling is a critical and complex task on large-scale supercomputers, where a scheduling policy is expected to fulfill amorphous and sometimes conflicting goals from both users and system owners. Moreover, the effectiveness of a scheduling policy depends on workload characteristics, which vary over time. Thus, it is challenging to design a versatile scheduling policy that is effective in all circumstances. To address this issue, we propose an adaptive metric-aware job scheduling strategy. First, we propose metric-aware scheduling, which enables the scheduler to balance competing scheduling goals represented by different metrics such as job waiting time, fairness, and system utilization. Second, we enhance the scheduler to adaptively adjust scheduling policies based on feedback from metrics monitored at runtime. We evaluate our design using real workloads from supercomputer centers and demonstrate that our scheduling mechanism can significantly improve system performance in a balanced, sustainable fashion.
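
The abstract does not give the scheduling formula, so the following is only an illustrative sketch of the two ideas it names: ordering the queue by a weighted blend of metrics, and nudging the weight from runtime feedback. Every name here (`Job`, `score`, `order_queue`, `adapt_weight`, `target_wait`, `step`) is hypothetical, and the weighted-sum score and threshold rule are assumptions rather than the authors' method. Requested node count stands in as a rough proxy for system utilization.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    wait: float  # seconds the job has spent in the queue so far
    nodes: int   # number of nodes the job requests


def score(job, w_wait, max_wait, max_nodes):
    """Blend two normalized metrics; a higher score is scheduled earlier.

    w_wait in [0, 1] is the weight on waiting time; the remainder
    goes to job size (an assumed proxy for utilization).
    """
    wait_term = job.wait / max_wait if max_wait else 0.0
    size_term = job.nodes / max_nodes if max_nodes else 0.0
    return w_wait * wait_term + (1.0 - w_wait) * size_term


def order_queue(queue, w_wait):
    """Return the queue sorted by descending blended score."""
    max_wait = max((j.wait for j in queue), default=0.0)
    max_nodes = max((j.nodes for j in queue), default=0)
    return sorted(
        queue,
        key=lambda j: score(j, w_wait, max_wait, max_nodes),
        reverse=True,
    )


def adapt_weight(w_wait, avg_wait, target_wait, step=0.1):
    """Feedback rule (assumed): favor waiting time more when the
    monitored average wait exceeds the target, less otherwise."""
    if avg_wait > target_wait:
        return min(1.0, w_wait + step)
    return max(0.0, w_wait - step)


queue = [Job("a", 100.0, 10), Job("b", 1000.0, 1), Job("c", 500.0, 5)]
by_wait = [j.name for j in order_queue(queue, w_wait=1.0)]
by_size = [j.name for j in order_queue(queue, w_wait=0.0)]
```

With the weight fully on waiting time the longest-waiting job runs first, and with the weight fully on size the largest job runs first; a scheduler loop would call `adapt_weight` periodically with the monitored average wait to move between those extremes.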