A New Complexity Result on Solving the Markov Decision Problem

Authors:
Yinyu Ye
Affiliations:
Department of Management Science and Engineering, Terman 316, Stanford University, Stanford, California 94305
Venue:
Mathematics of Operations Research
Year:
2005

References (12)

Combinatorial optimization: algorithms and complexity
A polynomial-time algorithm for a class of linear complementary problems. Mathematical Programming: Series A and B
Linear programming, complexity theory and elementary functional analysis. Mathematical Programming: Series A and B
A primal-dual interior point method whose running time depends only on the constraint matrix. Mathematical Programming: Series A and B
Complexity and real computation
Interior point algorithms: theory and analysis
A modified layered-step interior-point algorithm for linear programming. Mathematical Programming: Series A and B
A Variant of the Vavasis-Ye Layered-Step Interior-Point Algorithm for Linear Programming. SIAM Journal on Optimization
Learning and value function approximation in complex decision processes
A New Iteration-Complexity Bound for the MTY Predictor-Corrector Algorithm. SIAM Journal on Optimization
On the complexity of policy iteration. UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
On the complexity of solving Markov decision problems. UAI'95 Proceedings of the Eleventh conference on Uncertainty in artificial intelligence

Cited by (6)

Discounted deterministic Markov decision processes and discounted all-pairs shortest paths. SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
A Strongly Polynomial Algorithm for Controlled Queues. Mathematics of Operations Research
Discounted deterministic Markov decision processes and discounted all-pairs shortest paths. ACM Transactions on Algorithms (TALG)
The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate. Mathematics of Operations Research
Strategy Iteration Is Strongly Polynomial for 2-Player Turn-Based Stochastic Games with a Constant Discount Factor. Journal of the ACM (JACM)
Robust Markov Decision Processes. Mathematics of Operations Research

Abstract

We present a new complexity result on solving the Markov decision problem (MDP) with n states and a number of actions for each state, a special class of real-number linear programs with the Leontief matrix structure. We prove that when the discount factor θ is strictly less than 1, the problem can be solved in at most O(n^1.5 (log(1/(1 - θ)) + log n)) classical interior-point method iterations and O(n^4 (log(1/(1 - θ)) + log n)) arithmetic operations. Our method is a combinatorial interior-point method related to the work of Ye (1990. A "build-down" scheme for linear programming. Math. Programming 46 61-72) and Vavasis and Ye (1996. A primal-dual interior-point method whose running time depends only on the constraint matrix. Math. Programming 74 79-120). To our knowledge, this is the first strongly polynomial-time algorithm for solving the MDP when the discount factor is a constant less than 1.
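
To make the "MDP as a linear program" view concrete, here is a minimal Python sketch. It assumes a cost-minimizing discounted MDP, for which the standard LP has one constraint per state-action pair: y_i <= c(i, a) + (discount factor) * sum_j p(j | i, a) * y_j. The 2-state instance, the names `theta`, `cost`, `prob`, `bellman_rhs`, and `value_iteration`, and all numbers are hypothetical. The solver used is plain value iteration, not the combinatorial interior-point method of the paper; it serves only to produce an optimal y and check it against the LP constraints.

```python
# Hypothetical 2-state, 2-action discounted MDP (all numbers are illustrative).
theta = 0.9  # discount factor, strictly less than 1

# cost[i][a]: immediate cost of taking action a in state i
cost = [[1.0, 2.0],
        [0.0, 3.0]]

# prob[i][a][j]: probability of moving from state i to state j under action a
prob = [[[1.0, 0.0], [0.0, 1.0]],
        [[0.0, 1.0], [1.0, 0.0]]]


def bellman_rhs(y, i, a):
    """Right-hand side of the LP constraint for the state-action pair (i, a)."""
    return cost[i][a] + theta * sum(p * yj for p, yj in zip(prob[i][a], y))


def value_iteration(tol=1e-12, max_iter=100_000):
    """Plain value iteration (a stand-in solver, not the paper's method)."""
    n = len(cost)
    y = [0.0] * n
    for _ in range(max_iter):
        new_y = [min(bellman_rhs(y, i, a) for a in range(len(cost[i])))
                 for i in range(n)]
        if max(abs(u - v) for u, v in zip(new_y, y)) < tol:
            return new_y
        y = new_y
    return y


y_opt = value_iteration()

# Slack of each LP constraint at y_opt: nonnegative everywhere (feasibility),
# and zero for at least one action per state (that action is optimal there).
slacks = [[bellman_rhs(y_opt, i, a) - y_opt[i] for a in range(len(cost[i]))]
          for i in range(len(cost))]

print("optimal cost-to-go:", y_opt)
print("constraint slacks:", slacks)
```

In this sketch the zero-slack constraints identify an optimal action in each state, which is the combinatorial information an LP-based method ultimately has to recover.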