Consider a system with K parallel servers, each with its own waiting room. Upon arrival, a job is routed to the queue of one of the servers. Finding a routing policy that minimizes the total workload in the system is known to be a difficult problem in general. Even when an optimal policy can be identified, it would require full queue-length information at the arrival of each job; for example, the join-the-shortest-queue policy (which is known to be optimal for identical servers with exponentially distributed service times) requires comparing the queue lengths of all the servers. In this paper, we consider a balanced routing policy that examines only a subset of c servers, with 1 ≤ c ≤ K: specifically, upon the arrival of a job, the policy chooses a subset of c servers with probability proportional to their service rates and then routes the job to the server with the shortest queue among the c chosen. Under such a balanced policy, we derive the diffusion limits of the queue length processes and the workload processes. We note that these diffusion limits are the same regardless of the choice of c, as long as c ≥ 2. We further show that the proposed balanced routing policy, for any fixed c ≥ 2, is asymptotically optimal in the sense that it minimizes the workload over all time in the diffusion limit. In addition, the policy helps to distribute work evenly among all the servers.
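
The routing step described above can be illustrated with a minimal Python sketch. It is not the paper's implementation. The abstract does not specify how the subset of c servers is drawn, so the sketch assumes sequential sampling without replacement, with each draw weighted by service rate; it also assumes ties in queue length are broken by the lowest server index. The names `choose_subset` and `route` are hypothetical.

```python
import random


def choose_subset(rates, c, rng):
    """Draw c distinct server indices.

    Assumption: servers are drawn one at a time without replacement,
    each draw weighted by the service rates of the remaining servers.
    """
    candidates = list(range(len(rates)))
    chosen = []
    for _ in range(c):
        weights = [rates[i] for i in candidates]
        pick = rng.choices(candidates, weights=weights, k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)
    return chosen


def route(queue_lengths, rates, c, rng):
    """Return the index of the server that receives the arriving job."""
    if not 1 <= c <= len(rates):
        raise ValueError("c must satisfy 1 <= c <= K")
    subset = choose_subset(rates, c, rng)
    # Shortest queue among the sampled servers; ties go to the lowest
    # index (an assumption, the abstract does not state a tie rule).
    return min(subset, key=lambda i: (queue_lengths[i], i))


if __name__ == "__main__":
    rng = random.Random(0)
    queues = [3, 1, 2, 5]
    rates = [1.0, 2.0, 3.0, 4.0]
    server = route(queues, rates, c=2, rng=rng)
    queues[server] += 1  # the job joins the chosen queue
    print(server, queues)
```

Under these assumptions, c = K examines every server and the rule reduces to join-the-shortest-queue, while c = 1 routes each job at random with probability proportional to the service rates and ignores queue lengths entirely.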