We have developed an efficient single-queue scheduling system that uses a greedy knapsack algorithm with dynamic job priorities. Our scheduler satisfies high-level objectives while maintaining high utilization of an HPC system or a collection of distributed resources such as a computational grid. We provide a simulation analysis that compares our approach with three scheduling strategies: shortest job first, longest-waiting job first, and largest job first. Further, we examine the effect of system size on total workload response time and find that, for real workloads, the relationship between response time and system size follows an inverse power law. Our approach does not require system administrators or users to identify a specific priority queue for each of their jobs. The proposed scheduler performs an exhaustive parameter search for the per-job priority calculation in order to balance high-level objectives and provide guaranteed performance for jobs in a workload. The system administrator needs only to tune the prioritization parameters (knobs), and the scheduler behaves accordingly, for example by reducing wait time for jobs that are above average in size and have short runtimes. We demonstrate that our approach works very well on workloads that have many independent tasks. We evaluate our scheduler on a realistic mixed scientific data processing workload and on a realistic HPC workload trace from the Parallel Workloads Archive.
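As a rough illustration of the idea described above, the following minimal sketch fills the free nodes of a system from a single queue using a greedy knapsack pass over dynamically computed priorities. It is not the authors' implementation: the `Job` fields, the form of the `priority` function, and the knob names `w_wait`, `w_size`, and `w_short` are all assumptions made for this example, since the abstract does not specify the priority formula.

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    nodes: int      # nodes requested by the job
    runtime: float  # estimated runtime in hours (must be > 0)
    wait: float     # hours the job has already waited in the queue


def priority(job, w_wait=1.0, w_size=1.0, w_short=1.0):
    """Hypothetical dynamic priority: grows with wait time and job size,
    and favours short runtimes. The weights stand in for the tunable knobs."""
    return w_wait * job.wait + w_size * job.nodes + w_short / job.runtime


def greedy_knapsack(queue, free_nodes, **knobs):
    """Pick jobs for the free nodes, highest priority per requested node first.

    This is the classic greedy knapsack heuristic (sort by value density,
    then take every item that still fits); it is not guaranteed optimal.
    """
    ranked = sorted(
        queue,
        key=lambda j: priority(j, **knobs) / j.nodes,
        reverse=True,
    )
    selected = []
    for job in ranked:
        if job.nodes <= free_nodes:
            selected.append(job)
            free_nodes -= job.nodes
    return selected


if __name__ == "__main__":
    queue = [
        Job("a", nodes=8, runtime=1.0, wait=0.0),
        Job("b", nodes=4, runtime=2.0, wait=5.0),
        Job("c", nodes=6, runtime=0.5, wait=1.0),
    ]
    for job in greedy_knapsack(queue, free_nodes=10):
        print(job.name, job.nodes)
```

Because priorities are recomputed from the current wait time on every scheduling pass, a job that is skipped now rises in the ranking later, which is one plausible way a single queue can replace several fixed priority queues. Changing the weights shifts the balance between objectives; for instance, raising `w_short` favours short jobs.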