Heavy traffic resource pooling in parallel-server systems

Authors:
J. Michael Harrison; Marcel J. López
Affiliations:
Graduate School of Business, Stanford University, Stanford, CA 94305, USA. E-mail: harrison_michael@gsb.stanford.edu; Graduate School of International Relations and Pacific Studies, University of California, San Diego, La Jolla, CA 92093-0519, USA. E-mail: martylopez@ucsd.edu
Venue:
Queueing Systems: Theory and Applications
Year:
1999

Abstract

We consider a queueing system with r non-identical servers working in parallel, exogenous arrivals into m different job classes, and linear holding costs for each class. Each arrival requires a single service, which may be provided by any of several different servers in our general formulation; the service time distribution depends on both the job class being processed and the server selected. The system manager seeks to minimize holding costs by dynamically scheduling waiting jobs onto available servers. A linear program involving only first-moment data (average arrival rates and mean service times) is used to define heavy traffic for a system of this form, and also to articulate a condition of overlapping server capabilities which leads to resource pooling in the heavy traffic limit. Assuming that the latter condition holds, we rescale time and state space in standard fashion, then identify a Brownian control problem that is the formal heavy traffic limit of our rescaled scheduling problem. Because of the assumed overlap in server capabilities, the limiting Brownian control problem is effectively one-dimensional, and it admits a pathwise optimal solution. That is, in the limiting Brownian control problem the multiple servers of our original model merge to form a single pool of service capacity, and there exists a dynamic control policy which minimizes cumulative cost incurred up to any time t with probability one. Interpreted in our original problem context, the Brownian solution suggests the following: virtually all backlogged work should be held in one particular job class, and all servers can and should be productively employed except when the total backlog is small. It is conjectured that such ideal system behavior can be approached using a family of relatively simple scheduling policies related to the cμ rule.
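
For readers unfamiliar with the cμ rule mentioned at the end of the abstract, the following is a minimal sketch of the classical index rule only, not the family of policies the authors conjecture. It assumes that when a server becomes free it picks, among the waiting classes it is able to serve, the one with the largest product of holding cost rate and service rate. The function name `c_mu_choice`, the data layout, and the numbers are hypothetical and chosen purely for illustration.

```python
from typing import Dict, List, Optional


def c_mu_choice(
    server: int,
    queue_lengths: List[int],
    holding_cost: List[float],
    service_rate: List[Dict[int, float]],
) -> Optional[int]:
    """Pick the job class a free server should serve next under a plain c-mu index.

    service_rate[server] maps each class the server can handle to its service
    rate (1 / mean service time); classes missing from the map cannot be served.
    Returns None when no servable class has waiting jobs.
    """
    best_class: Optional[int] = None
    best_index = 0.0
    for job_class, waiting in enumerate(queue_lengths):
        rate = service_rate[server].get(job_class, 0.0)
        if waiting == 0 or rate <= 0.0:
            continue  # nothing waiting, or this server cannot serve the class
        index = holding_cost[job_class] * rate  # the c * mu index
        if index > best_index:
            best_class, best_index = job_class, index
    return best_class


# Hypothetical data: two servers, two job classes.
holding_cost = [3.0, 1.0]                        # cost per job per unit time
service_rate = [{0: 1.0, 1: 2.0}, {1: 1.5}]      # server 1 handles only class 1

choice_server0 = c_mu_choice(0, [2, 4], holding_cost, service_rate)
choice_server1 = c_mu_choice(1, [2, 4], holding_cost, service_rate)
```

With both classes backlogged, server 0 compares the indices 3.0 × 1.0 and 1.0 × 2.0 and takes class 0, while server 1 can only take class 1. A purely myopic rule like this ignores the abstract's point that backlog should be concentrated in one particular class and that servers should stay busy unless the total backlog is small, which is why the authors speak of policies related to the cμ rule and not of the rule itself.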