Evaluation of Task Assignment Policies for Supercomputing Servers: The Case for Load Unbalancing and Fairness

Authors:
Bianca Schroeder; Mor Harchol-Balter
Affiliations:
School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA; School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
Venue:
Cluster Computing
Year:
2004



Abstract

While the MPP is still the most common architecture in supercomputer centers today, a simpler and cheaper machine configuration is appearing at many supercomputing sites. This alternative setup may be described simply as a collection of multiprocessors, or a distributed server system. The collection is fed by a single common stream of jobs, where each job is dispatched to exactly one of the multiprocessor machines for processing.

The biggest question that arises in such distributed server systems is what rule to use for assigning jobs to host machines, i.e., what is a good task assignment policy. Many task assignment policies have been proposed, but they have not been systematically evaluated under supercomputing workloads.

In this paper we start by comparing existing task assignment policies using trace-driven simulation under supercomputing workloads. We validate our experiments by providing analytical proofs of the performance of each of these policies. These proofs also provide intuition. We find that while the performance of supercomputing servers varies widely with the task assignment policy, none of these policies performs as well as we would like.

We observe that all policies proposed thus far aim to balance load among the hosts. We propose a policy that purposely unbalances load among the hosts yet, counterintuitively, is also fair in that it achieves the same expected slowdown for all jobs, so no jobs are biased against. We evaluate this policy, again using both trace-driven simulation and analysis. We find that the performance of the load unbalancing policy is significantly better than that of the best of the policies that balance load.
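
To make the setting concrete, the following is a minimal Python sketch of a two-host distributed server fed by a single job stream, with two illustrative dispatch rules: one that sends each job to the host with the least remaining work (a load-balancing rule), and one that sends jobs to a host according to a size cutoff (the kind of rule that can be tuned to unbalance load). The abstract does not give the details of the proposed policy, so nothing here should be read as the authors' method. The function names, the synthetic Pareto workload, the arrival rate, and the cutoff values are all assumptions chosen for illustration, and hosts are modeled as simple first-come-first-served queues. Slowdown is taken to be response time divided by job size, which is the common definition.

```python
import random


def simulate(jobs, assign, n_hosts=2):
    """Dispatch each (arrival, size) job to one FCFS host; return per-job slowdowns."""
    free_at = [0.0] * n_hosts  # time at which each host finishes its queued work
    slowdowns = []
    for arrival, size in jobs:
        # Remaining work at each host as seen by the arriving job.
        backlog = [max(t - arrival, 0.0) for t in free_at]
        host = assign(size, backlog)
        wait = backlog[host]
        free_at[host] = arrival + wait + size
        # Slowdown = response time / size (assumed definition).
        slowdowns.append((wait + size) / size)
    return slowdowns


def least_work_left(size, backlog):
    """Load-balancing rule: pick the host with the least remaining work."""
    return min(range(len(backlog)), key=backlog.__getitem__)


def by_size_cutoff(cutoff):
    """Size-based rule for two hosts: short jobs to host 0, long jobs to host 1."""
    def assign(size, backlog):
        return 0 if size <= cutoff else 1
    return assign


def make_jobs(n, rate, seed=0):
    """Synthetic workload (an assumption, not a supercomputing trace)."""
    rng = random.Random(seed)
    t, jobs = 0.0, []
    for _ in range(n):
        t += rng.expovariate(rate)        # Poisson arrivals
        size = rng.paretovariate(1.5)     # heavy-tailed sizes, minimum 1
        jobs.append((t, size))
    return jobs


def mean(xs):
    return sum(xs) / len(xs)


if __name__ == "__main__":
    jobs = make_jobs(n=20000, rate=0.4)
    print("least work left:", mean(simulate(jobs, least_work_left)))
    for cutoff in (1.5, 2.0, 4.0, 8.0):   # hypothetical cutoffs
        s = simulate(jobs, by_size_cutoff(cutoff))
        print(f"size cutoff {cutoff}:", mean(s))
```

Varying the cutoff changes how much load each host receives, which is one way to experiment with deliberately unbalanced load. Results on this synthetic workload say nothing about the trace-driven results reported in the paper.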