While the MPP is still the most common architecture in supercomputer centers today, a simpler and cheaper machine configuration is appearing at many supercomputing sites. This alternative setup can be described as a collection of multiprocessors, or a distributed server system. The collection is fed by a single common stream of jobs, and each job is dispatched to exactly one of the multiprocessor machines for processing.

The central question in such distributed server systems is what rule to use for assigning jobs to host machines, i.e., what makes a good task assignment policy. Many task assignment policies have been proposed, but they have not been systematically evaluated under supercomputing workloads.

In this paper we start by comparing existing task assignment policies using trace-driven simulation under supercomputing workloads. We validate our experiments with analytical proofs of the performance of each policy; these proofs also provide intuition. We find that while the performance of supercomputing servers varies widely with the task assignment policy, none of the existing policies performs as well as we would like.

We observe that all policies proposed thus far aim to balance load among the hosts. We propose a policy that purposely unbalances load among the hosts yet, counterintuitively, is also fair: it achieves the same expected slowdown for all jobs, so no jobs are biased against. We evaluate this policy using both trace-driven simulation and analysis, and find that it performs significantly better than the best of the load-balancing policies.
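The abstract does not spell out how the policies work, so the following is only an illustrative sketch of what a trace-driven comparison of task assignment policies might look like, not the paper's simulator or its proposed policy. It assumes first-come-first-served hosts, defines slowdown as response time divided by job size, and uses a fixed size cutoff as a stand-in for a policy that unbalances load. All names (`mean_slowdown`, `round_robin`, `least_work_left`, `size_cutoff`, `TOY_TRACE`) and the toy trace itself are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Job = Tuple[float, float]  # (arrival time, job size in seconds of service)
Policy = Callable[[int, float, List[float]], int]


def mean_slowdown(jobs: Sequence[Job], num_hosts: int, assign: Policy) -> float:
    """Replay a trace (sorted by arrival time) on FCFS hosts.

    Assumption: slowdown = (finish time - arrival time) / job size.
    """
    free_at = [0.0] * num_hosts  # time at which each host finishes its queue
    total = 0.0
    for index, (arrival, size) in enumerate(jobs):
        # Unfinished work queued at each host when this job arrives.
        backlog = [max(t - arrival, 0.0) for t in free_at]
        host = assign(index, size, backlog)
        finish = max(arrival, free_at[host]) + size
        free_at[host] = finish
        total += (finish - arrival) / size
    return total / len(jobs)


def round_robin(index: int, size: float, backlog: List[float]) -> int:
    """Cycle through hosts regardless of job size or load."""
    return index % len(backlog)


def least_work_left(index: int, size: float, backlog: List[float]) -> int:
    """Load-balancing rule: send the job to the host with the least backlog."""
    return backlog.index(min(backlog))


def size_cutoff(cutoff: float) -> Policy:
    """Two-host rule: short jobs go to host 0, long jobs to host 1.

    The cutoff is a free parameter here; no tuning method is implied.
    """
    def assign(index: int, size: float, backlog: List[float]) -> int:
        return 0 if size <= cutoff else 1
    return assign


# Hypothetical toy trace: two long jobs mixed with three short ones.
TOY_TRACE: List[Job] = [(0.0, 10.0), (0.0, 1.0), (1.0, 1.0), (2.0, 10.0), (3.0, 1.0)]

if __name__ == "__main__":
    for name, policy in [
        ("round robin", round_robin),
        ("least work left", least_work_left),
        ("size cutoff", size_cutoff(5.0)),
    ]:
        print(name, round(mean_slowdown(TOY_TRACE, 2, policy), 2))
```

On this toy trace the mean slowdown is 4.4 for round robin, 2.4 for least work left, and 1.16 for the size cutoff, because the cutoff keeps short jobs from queueing behind long ones. These numbers describe only the made-up trace and say nothing about the paper's measured results.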