Heavy traffic resource pooling in parallel-server systems

  • Authors:
  • J. Michael Harrison; Marcel J. López

  • Affiliations:
  • Graduate School of Business, Stanford University, Stanford, CA 94305, USA. E-mail: harrison_michael@gsb.stanford.edu
  • Graduate School of International Relations and Pacific Studies, University of California, San Diego, La Jolla, CA 92093-0519, USA. E-mail: martylopez@ucsd.edu

  • Venue:
  • Queueing Systems: Theory and Applications
  • Year:
  • 1999

Abstract

We consider a queueing system with r non-identical servers working in parallel, exogenous arrivals into m different job classes, and linear holding costs for each class. Each arrival requires a single service, which may be provided by any of several different servers in our general formulation; the service time distribution depends on both the job class being processed and the server selected. The system manager seeks to minimize holding costs by dynamically scheduling waiting jobs onto available servers. A linear program involving only first-moment data (average arrival rates and mean service times) is used to define heavy traffic for a system of this form, and also to articulate a condition of overlapping server capabilities which leads to resource pooling in the heavy traffic limit. Assuming that the latter condition holds, we rescale time and state space in standard fashion, then identify a Brownian control problem that is the formal heavy traffic limit of our rescaled scheduling problem. Because of the assumed overlap in server capabilities, the limiting Brownian control problem is effectively one-dimensional, and it admits a pathwise optimal solution. That is, in the limiting Brownian control problem the multiple servers of our original model merge to form a single pool of service capacity, and there exists a dynamic control policy which minimizes cumulative cost incurred up to any time t with probability one. Interpreted in our original problem context, the Brownian solution suggests the following: virtually all backlogged work should be held in one particular job class, and all servers can and should be productively employed except when the total backlog is small. It is conjectured that such ideal system behavior can be approached using a family of relatively simple scheduling policies related to the cμ rule.
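
For readers unfamiliar with the cμ rule mentioned in the abstract, the following is a minimal sketch of the classical rule applied at a single server: when the server becomes free, it serves the waiting class with the largest product of holding cost rate c and service rate μ. This is not the family of policies conjectured in the paper, which the abstract does not specify. The function name `c_mu_choice`, its parameters, and the use of `None` to mark a class the server cannot process are assumptions made for illustration only.

```python
from typing import Optional, Sequence


def c_mu_choice(
    queue_lengths: Sequence[int],
    holding_costs: Sequence[float],
    service_rates: Sequence[Optional[float]],
) -> Optional[int]:
    """Pick the job class a newly free server should serve next.

    Hypothetical helper, not taken from the paper.
    queue_lengths[i]  number of class-i jobs currently waiting
    holding_costs[i]  holding cost per class-i job per unit time (c_i)
    service_rates[i]  this server's service rate for class i (mu_i),
                      or None if the server cannot process class i
    Returns the chosen class index, or None if nothing eligible is waiting.
    """
    best_class: Optional[int] = None
    best_index = 0.0
    for i, (q, c, mu) in enumerate(
        zip(queue_lengths, holding_costs, service_rates)
    ):
        if q <= 0 or mu is None:
            continue  # nothing waiting, or server cannot serve this class
        index = c * mu  # the c-mu priority index for class i
        # Strict comparison: on ties the lowest class index is kept.
        if best_class is None or index > best_index:
            best_class, best_index = i, index
    return best_class


# Example: class 1 has the highest index (4.0) but no waiting jobs,
# so the server chooses between class 0 (index 2.0) and class 2 (index 1.0).
choice = c_mu_choice([3, 0, 5], [1.0, 4.0, 2.0], [2.0, 1.0, 0.5])
```

In a parallel-server setting, each server would call such a function with its own vector of service rates, so that overlapping capabilities show up as several servers having non-`None` rates for the same class.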