Adaptive Metric-Aware Job Scheduling for Production Supercomputers
ICPPW '12 Proceedings of the 2012 41st International Conference on Parallel Processing Workshops
Job scheduling on production supercomputers is complicated by the diverse demands of system administrators and the amorphous characteristics of workloads. Specifically, scheduling goals such as queuing efficiency and system utilization usually conflict and therefore need to be balanced. In addition, changing workload characteristics often reduce the effectiveness of the deployed scheduling policies. It is thus challenging to design a versatile scheduling policy that is effective in all circumstances. In this paper, we propose a novel job scheduling strategy that balances diverse scheduling goals and mitigates the impact of workload characteristics. First, we introduce metric-aware scheduling, which enables the scheduler to balance competing scheduling goals represented by different metrics such as job waiting time, fairness, and system utilization. Second, we design a scheme that dynamically adjusts scheduling policies at runtime based on feedback from monitored metrics. We evaluate our design using real workloads from supercomputer centers. The results demonstrate that our scheduling mechanism can significantly improve system performance in a balanced, sustainable fashion.
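As an illustration only, the two ideas above (ordering jobs by a blend of competing metrics, and adjusting that blend from runtime feedback) can be sketched as follows. This is a minimal sketch under stated assumptions, not the mechanism evaluated in the paper: the `Job` fields, the `balance` parameter, the scoring formula, the target waiting time, and the step size are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    wait_time: float    # seconds already spent in the queue (hypothetical field)
    est_runtime: float  # estimated runtime in seconds (hypothetical field)


def score(job, jobs, balance):
    """Blend a waiting-time term with a short-job term.

    balance = 1.0 ranks purely by waiting time (longest-waiting first);
    balance = 0.0 ranks purely by estimated runtime (shortest first).
    Both terms are normalized to [0, 1] over the current queue.
    """
    max_wait = max(j.wait_time for j in jobs) or 1.0
    max_run = max(j.est_runtime for j in jobs) or 1.0
    wait_term = job.wait_time / max_wait
    short_term = 1.0 - job.est_runtime / max_run
    return balance * wait_term + (1.0 - balance) * short_term


def order_queue(jobs, balance):
    """Return the jobs sorted from highest to lowest blended score."""
    return sorted(jobs, key=lambda j: score(j, jobs, balance), reverse=True)


def adjust_balance(balance, avg_wait, target_wait, step=0.1):
    """One feedback step on the blend (assumed rule, for illustration).

    If the monitored average wait exceeds the target, shift toward
    short jobs to drain the queue; otherwise shift back toward
    waiting-time order. The result is clamped to [0, 1].
    """
    if avg_wait > target_wait:
        balance -= step
    else:
        balance += step
    return min(1.0, max(0.0, balance))
```

A scheduler loop built on this sketch would call `adjust_balance` at each monitoring interval and then reorder the queue with `order_queue` using the updated value. A single scalar blend is the simplest way to show the trade-off; balancing more than two metrics would need one weight per metric.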