Load balancing in large distributed server systems is a complex optimization problem of critical importance in cloud systems and data centers. However, any full (i.e., optimal) solution incurs significant, often prohibitive, overhead because it must collect state-dependent information. We propose a novel scheme that incurs no communication overhead between the users and the servers upon job arrivals, thus removing any scheduling overhead from the critical path of job execution. Furthermore, our scheme is oblivious; that is, it does not use any state information. Our approach creates, in addition to the regular job requests that are assigned to randomly chosen servers, replicas that are sent to different servers; these replicas are served at low priority, so that they add no real burden on the servers. Through analysis and simulations we show that the expected system performance improves by up to a factor of 2 (even under high load) if job lengths are exponentially distributed, and by more than a factor of 5 when job lengths follow heavy-tailed distributions. We implemented a load balancing system based on our approach and deployed it on the Amazon Elastic Compute Cloud (EC2). Realistic load tests on that system indicate that the actual performance is as predicted.
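The dispatch idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Server` class, the `dispatch` function, the two-queue layout, the non-preemptive service order, and the shared `done` set used to discard stale copies are all hypothetical choices made for illustration. The abstract does not specify how replicas are cancelled or whether service is preemptive.

```python
import random
from collections import deque

HIGH, LOW = 0, 1  # priority classes: regular requests vs. replicas


class Server:
    """Hypothetical server with one FIFO queue per priority class."""

    def __init__(self):
        self.queues = {HIGH: deque(), LOW: deque()}

    def enqueue(self, job_id, priority):
        self.queues[priority].append(job_id)

    def next_job(self, done):
        """Pop the next job to serve, skipping jobs already finished elsewhere.

        Replicas (LOW) are considered only when no regular request (HIGH)
        is waiting. `done` is an assumed shared set of completed job ids.
        """
        for priority in (HIGH, LOW):
            queue = self.queues[priority]
            while queue:
                job_id = queue.popleft()
                if job_id not in done:
                    return job_id
        return None


def dispatch(job_id, servers, n_replicas, rng):
    """Assign a job without inspecting any server state (oblivious).

    One randomly chosen server receives the regular request; n_replicas
    other, distinct servers receive low-priority replicas.
    Returns the chosen server indices, primary first.
    """
    chosen = rng.sample(range(len(servers)), n_replicas + 1)
    primary, replicas = chosen[0], chosen[1:]
    servers[primary].enqueue(job_id, HIGH)
    for idx in replicas:
        servers[idx].enqueue(job_id, LOW)
    return chosen
```

In this sketch `dispatch` never reads queue lengths or any other server state, which is what makes the assignment oblivious. A replica only reaches service when its host has no regular request waiting, which is the sense in which replicas add little burden here.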