A Strongly Polynomial Algorithm for Controlled Queues

Authors:
Alexander Zadorojniy; Guy Even; Adam Shwartz
Affiliations:
School of Electrical Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel; School of Electrical Engineering, Tel-Aviv University, Tel-Aviv 69978, Israel; Department of Electrical Engineering, Technion, Haifa 32000, Israel
Venue:
Mathematics of Operations Research
Year:
2009

Abstract

We consider the problem of computing optimal policies of finite-state, finite-action Markov decision processes (MDPs). A reduction to a continuum of constrained MDPs (CMDPs) is presented such that the optimal policies for these CMDPs constitute a path in a graph defined over the deterministic policies. This path contains, in particular, an optimal policy of the original MDP. We present an algorithm based on this new approach that finds this path, and thus an optimal policy. In the general case, this path might be exponentially long in the number of states and actions. We prove that the length of this path is polynomial if the MDP satisfies a coupling property. Thus we obtain a strongly polynomial algorithm for MDPs that satisfy the coupling property. We prove that discrete-time versions of controlled M/M/1 queues induce MDPs that satisfy the coupling property. The only previously known polynomial algorithm for controlled M/M/1 queues in the expected average cost model is based on linear programming (and is not known to be strongly polynomial). Our algorithm works for both the discounted and the expected average cost models, and its running time does not depend on the discount factor.
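
To make the setting concrete, the following is a minimal sketch of the kind of MDP a discrete-time controlled M/M/1 queue induces, solved for the discounted cost model. It is not the path-following algorithm of the paper: it uses standard value iteration, whose running time does depend on the discount factor. The finite buffer, the arrival probability, the two service rates, the cost structure, and the names `build_queue_mdp` and `value_iteration` are all assumptions made for illustration.

```python
import numpy as np


def build_queue_mdp(buffer_size=5, arrival=0.3, service_rates=(0.2, 0.5),
                    holding_cost=1.0, service_cost=(0.0, 0.6)):
    """Build a discrete-time controlled queue MDP (all parameters hypothetical).

    State s is the queue length (0..buffer_size); action a selects a service
    rate. In each time slot at most one event occurs: an arrival with
    probability `arrival`, or a departure with probability service_rates[a].
    Requires arrival + max(service_rates) <= 1.
    """
    n_states = buffer_size + 1
    n_actions = len(service_rates)
    P = np.zeros((n_actions, n_states, n_states))  # P[a, s, s']
    c = np.zeros((n_actions, n_states))            # one-step cost c[a, s]
    for a, mu in enumerate(service_rates):
        for s in range(n_states):
            up = arrival if s < buffer_size else 0.0  # arrivals blocked when full
            down = mu if s > 0 else 0.0               # no departure when empty
            if s < buffer_size:
                P[a, s, s + 1] = up
            if s > 0:
                P[a, s, s - 1] = down
            P[a, s, s] = 1.0 - up - down              # no event in this slot
            c[a, s] = holding_cost * s + service_cost[a]
    return P, c


def value_iteration(P, c, discount=0.9, tol=1e-10, max_iter=100_000):
    """Standard value iteration for the discounted cost criterion."""
    v = np.zeros(P.shape[1])
    for _ in range(max_iter):
        q = c + discount * (P @ v)   # q[a, s]: cost of action a in state s
        v_new = q.min(axis=0)
        converged = np.max(np.abs(v_new - v)) < tol
        v = v_new
        if converged:
            break
    q = c + discount * (P @ v)
    return v, q.argmin(axis=0)       # value function and a greedy policy


P, c = build_queue_mdp()
v, policy = value_iteration(P, c)
print("policy (service-rate index per queue length):", policy)
```

The returned policy is deterministic, one action per queue length, which is the class of policies over which the abstract's graph is defined.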