A large-scale service system with packing constraints: minimizing the number of occupied servers

Authors:
Alexander L. Stolyar;Yuan Zhong
Affiliations:
Bell Labs, Alcatel-Lucent, Murray Hill, NJ, USA;University of California, Berkeley, Berkeley, CA, USA
Venue:
Proceedings of the ACM SIGMETRICS/international conference on Measurement and modeling of computer systems
Year:
2013

Citing 5
Cited 0

Stochastic Bandwidth Packing Process: Stability Conditions via Lyapunov Function Technique
Queueing Systems: Theory and Applications

On the Sum-of-Squares algorithm for bin packing
Journal of the ACM (JACM)

A New Approximation Method for Set Covering Problems, with Applications to Multidimensional Bin Packing
SIAM Journal on Computing

Shadow-Routing Based Control of Flexible Multiserver Pools in Overload
Operations Research

Multiclass multiserver queueing system in the Halfin-Whitt heavy traffic regime: asymptotics of the stationary distribution
Queueing Systems: Theory and Applications

Abstract

We consider a large-scale service system model proposed in [14], which is motivated by the problem of efficiently placing virtual machines on physical host machines in a network cloud, so that the total number of occupied hosts is minimized. Customers of different types arrive at a system with an infinite number of servers. A server packing configuration is the vector $k = \{k_i\}$, where $k_i$ is the number of type-$i$ customers that the server "contains". Packing constraints are described by a fixed finite set of allowed configurations. Upon arrival, each customer is placed into a server immediately, subject to the packing constraints; the server can be idle or already serving other customers. After service completion, each customer leaves its server and the system. It was shown in [14] that a simple real-time algorithm, called Greedy, is asymptotically optimal in the sense of minimizing $\sum_k X_k^{1+\alpha}$ in the stationary regime, as the customer arrival rates grow to infinity. (Here $\alpha > 0$, and $X_k$ denotes the number of servers with configuration $k$.) In particular, when the parameter $\alpha$ is small, and in the asymptotic regime where customer arrival rates grow to infinity, Greedy solves a problem approximating that of minimizing $\sum_k X_k$, the number of occupied hosts. In this paper we introduce an algorithm called Greedy with sublinear Safety Stocks (GSS), and show that it asymptotically solves the exact problem of minimizing $\sum_k X_k$. An important feature of the algorithm is that sublinear safety stocks of $X_k$ are created automatically, when and where necessary, without having to determine a priori where they are required. Moreover, we provide a tight characterization of the rate of convergence to optimality under GSS. The GSS algorithm is as simple as Greedy, and uses no more system state information than Greedy does.
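To make the model concrete, the following minimal Python sketch represents the system state as the counts $X_k$, evaluates the two objectives named above, and places an arriving customer by choosing the feasible move with the smallest resulting value of $\sum_k X_k^{1+\alpha}$. The placement rule, the two-type configuration set `ALLOWED`, and all function names are illustrative assumptions; the abstract does not specify how Greedy or GSS select a server, and departures are not modeled here.

```python
from collections import Counter

# Hypothetical packing constraints for two customer types: a server may
# hold any configuration k = (k_1, k_2) from this fixed finite set.
ALLOWED = {(1, 0), (0, 1), (2, 0), (1, 1), (0, 2)}
EMPTY = (0, 0)  # an idle server; infinitely many are available


def objective(X, alpha):
    """Return sum_k X_k^(1+alpha) over occupied configurations."""
    return sum(n ** (1 + alpha) for n in X.values() if n > 0)


def occupied(X):
    """Return sum_k X_k, the number of occupied servers."""
    return sum(X.values())


def add_customer(k, i):
    """Configuration obtained by adding one type-i customer to k."""
    return k[:i] + (k[i] + 1,) + k[i + 1:]


def place(X, i, alpha):
    """Place a type-i arrival (assumed rule, not taken from the paper):
    among all feasible moves, pick the one giving the smallest objective."""
    best = None
    for k in [EMPTY] + [c for c, n in X.items() if n > 0]:
        new_k = add_customer(k, i)
        if new_k not in ALLOWED:
            continue  # this move would violate the packing constraints
        Y = X.copy()
        if k != EMPTY:
            Y[k] -= 1  # one server leaves configuration k
        Y[new_k] += 1  # and enters configuration new_k
        Y = +Y  # drop configurations whose count fell to zero
        cost = objective(Y, alpha)
        if best is None or cost < best[0]:
            best = (cost, Y)
    if best is None:
        raise ValueError("no allowed configuration accepts this customer")
    return best[1]
```

For example, starting from an empty system, a type-0 arrival followed by a type-1 arrival ends with both customers sharing one server in configuration `(1, 1)`; a further type-0 arrival must open a second server, because `(2, 1)` is not in the assumed set of allowed configurations.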