Exponential lower bounds for policy iteration

Authors:
John Fearnley
Affiliations:
Department of Computer Science, University of Warwick, UK
Venue:
ICALP'10: Proceedings of the 37th International Colloquium on Automata, Languages and Programming: Part II
Year:
2010

Abstract

We study policy iteration for infinite-horizon Markov decision processes. It has recently been shown that policy-iteration-style algorithms have exponential lower bounds in a two-player game setting. We extend these lower bounds to Markov decision processes with the total-reward and average-reward optimality criteria.