While the MPP is still the most common architecture in supercomputer centers today, a simpler and cheaper machine configuration is appearing at many supercomputing sites. This alternative setup may be described simply as a collection of multiprocessors, or a distributed server system. The collection is fed by a single common stream of jobs, and each job is dispatched to exactly one of the multiprocessor machines for processing. The central question in such distributed server systems is what rule to use for assigning jobs to host machines, i.e., what is a good task assignment policy. Many task assignment policies have been proposed, but they have not been systematically evaluated under supercomputing workloads.

In this paper, we start by comparing existing task assignment policies using trace-driven simulation under supercomputing workloads. We validate our experiments by providing analytical proofs of the performance of each of these policies; these proofs also provide intuition. We find that while the performance of supercomputing servers varies widely with the task assignment policy, none of these existing policies performs as well as we would like.

We observe that all policies proposed thus far aim to balance load among the hosts. We propose a policy that purposely unbalances load among the hosts yet, counterintuitively, is also fair, in that it achieves the same expected slowdown for all jobs, so no jobs are biased against. We again evaluate this policy using both trace-driven simulation and analysis. We find that the load unbalancing policy performs significantly better than the best of the policies that balance load.
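The setting in the abstract can be made concrete with a toy example. The sketch below is an illustration only, not the paper's policy or its simulator. It assumes each host serves its jobs one at a time in first-come-first-served order, and it takes slowdown to be response time divided by job size. The names `simulate`, `round_robin`, and `size_cutoff`, the four-job workload, and the cutoff value are all hypothetical. The size-cutoff rule stands in for the general idea of sending short and long jobs to different hosts even though that leaves the hosts with unequal load.

```python
from statistics import mean


def simulate(jobs, num_hosts, assign):
    """Dispatch each job to one host and return the per-job slowdowns.

    jobs      -- list of (arrival_time, size), sorted by arrival time
    num_hosts -- number of host machines
    assign    -- function (job_index, size) -> host index

    Assumption: every host runs its jobs one at a time, in FCFS order.
    """
    free_at = [0.0] * num_hosts  # time at which each host becomes idle
    slowdowns = []
    for index, (arrival, size) in enumerate(jobs):
        host = assign(index, size)
        start = max(arrival, free_at[host])
        finish = start + size
        free_at[host] = finish
        # Slowdown = response time / job size.
        slowdowns.append((finish - arrival) / size)
    return slowdowns


def round_robin(num_hosts):
    """Load-balancing rule: job i goes to host i mod num_hosts."""
    return lambda index, size: index % num_hosts


def size_cutoff(cutoff):
    """Hypothetical two-host rule: short jobs to host 0, long jobs to host 1."""
    return lambda index, size: 0 if size <= cutoff else 1


if __name__ == "__main__":
    # Hypothetical workload: two long jobs, then two short ones, all at time 0.
    jobs = [(0.0, 10.0), (0.0, 10.0), (0.0, 1.0), (0.0, 1.0)]

    balanced = simulate(jobs, 2, round_robin(2))
    unbalanced = simulate(jobs, 2, size_cutoff(5.0))

    print("round robin mean slowdown:", mean(balanced))    # 6.0
    print("size cutoff mean slowdown:", mean(unbalanced))  # 1.5
```

In this toy workload, round robin gives each host 11 units of work, but both short jobs wait behind a long one and see a slowdown of 11. The cutoff rule gives host 0 only 2 units of work and host 1 all 20, yet no job's slowdown exceeds 2. This shows only that balanced load and low slowdown can diverge; it says nothing about how the paper chooses its policy or about real supercomputing traces.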