Exponential lower bounds for policy iteration

  • Authors: John Fearnley
  • Affiliations: Department of Computer Science, University of Warwick, UK
  • Venue: ICALP'10 Proceedings of the 37th international colloquium conference on Automata, languages and programming: Part II
  • Year: 2010

Abstract

We study policy iteration for infinite-horizon Markov decision processes. It has recently been shown that policy-iteration-style algorithms have exponential lower bounds in a two-player game setting. We extend these lower bounds to Markov decision processes with the total-reward and average-reward optimality criteria.
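
For readers unfamiliar with the algorithm the abstract refers to, the following is a minimal sketch of policy iteration on a tiny MDP. It is illustrative only and is not the paper's construction: it assumes a discounted criterion (rather than the total-reward or average-reward criteria studied in the paper), a greedy rule that switches every improvable state at once, and a hypothetical two-state example. The names `P`, `R`, `gamma`, `evaluate`, and `policy_iteration` are all assumptions introduced here.

```python
# Minimal policy iteration sketch (assumed setting: discounted rewards).
# P[s][a] is a list of transition probabilities over states,
# R[s][a] is the immediate reward for taking action a in state s.

def evaluate(policy, P, R, gamma, tol=1e-12):
    """Approximate the value of a fixed policy by repeated in-place sweeps."""
    v = [0.0] * len(P)
    while True:
        delta = 0.0
        for s in range(len(P)):
            a = policy[s]
            new = R[s][a] + gamma * sum(p * v[t] for t, p in enumerate(P[s][a]))
            delta = max(delta, abs(new - v[s]))
            v[s] = new
        if delta < tol:
            return v


def policy_iteration(P, R, gamma=0.9):
    """Alternate evaluation and greedy improvement until no state switches."""
    policy = [0] * len(P)  # arbitrary initial policy: action 0 everywhere
    iterations = 0
    while True:
        v = evaluate(policy, P, R, gamma)
        changed = False
        for s in range(len(P)):
            def q(a):
                # One-step lookahead value of action a in state s.
                return R[s][a] + gamma * sum(p * v[t] for t, p in enumerate(P[s][a]))
            best = max(range(len(P[s])), key=q)
            # Switch only on a strict improvement (small tolerance for floats).
            if q(best) > q(policy[s]) + 1e-9:
                policy[s] = best
                changed = True
        iterations += 1
        if not changed:
            return policy, v, iterations


# Hypothetical two-state example: state 1 pays 2 per step for staying put,
# and state 0 can either stay (reward 0) or move to state 1 (reward 1).
P = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
R = [[0.0, 1.0],
     [2.0, 0.0]]

policy, values, iterations = policy_iteration(P, R, gamma=0.9)
```

The lower bounds in the abstract concern how many improvement rounds such a loop can need in the worst case; the count returned as `iterations` is the quantity of interest, although this toy instance terminates almost immediately.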