Fast convergence to state-action frequency polytopes for MDPs

  • Authors: Mathieu Tracol
  • Affiliations: LRI, University Paris-Sud, France
  • Venue: Operations Research Letters
  • Year: 2009

Abstract

In the context of finite weakly communicating Markov Decision Processes, we tackle the problem of fast convergence of state-action frequency vectors to the polytope of stationary distributions on state-action frequencies. Using unichain policies, we derive bounds on the speed of convergence which are independent of the limit points.
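To make the objects in the abstract concrete, the following is a minimal sketch and not the paper's method or its bounds. It computes the expected state-action frequency vector of a small, hypothetical two-state, two-action MDP under a fixed stationary policy, and measures how far that vector is from satisfying the flow-balance constraints that define the polytope of stationary state-action frequencies. The transition tensor `P`, the policy, the initial distribution `mu0`, and the function names are all assumptions made for illustration. The balance residual is used here as a simple proxy for distance to the polytope; for a fixed policy it equals the L1 distance between the initial and time-T state distributions divided by T, so it is at most 2/T.

```python
import numpy as np

def expected_frequencies(P, policy, mu0, T):
    """Expected state-action frequencies over the first T steps.

    P[a, s, t]   : probability of moving from state s to state t under action a
    policy[s, a] : probability of playing action a in state s (stationary policy)
    mu0[s]       : initial state distribution
    Returns x with x[s, a] = (1/T) * sum_{k<T} Prob(S_k = s, A_k = a).
    """
    mu = np.asarray(mu0, dtype=float)
    x = np.zeros(policy.shape, dtype=float)
    for _ in range(T):
        # Joint probability of (state, action) at the current step.
        x += mu[:, None] * policy
        # Propagate the state distribution one step forward.
        mu = np.einsum("s,sa,ast->t", mu, policy, P)
    return x / T

def balance_residual(P, x):
    """L1 violation of the flow-balance constraints
    sum_a x[t, a] = sum_{s, a} x[s, a] * P[a, s, t] for every state t."""
    inflow = np.einsum("sa,ast->t", x, P)
    outflow = x.sum(axis=1)
    return float(np.abs(outflow - inflow).sum())

# Hypothetical 2-state, 2-action MDP (illustrative numbers only).
P = np.array([
    [[0.9, 0.1], [0.2, 0.8]],   # action 0
    [[0.5, 0.5], [0.6, 0.4]],   # action 1
])
policy = np.array([[0.5, 0.5], [1.0, 0.0]])
mu0 = np.array([1.0, 0.0])

for T in (10, 100, 1000):
    x = expected_frequencies(P, policy, mu0, T)
    print(T, balance_residual(P, x))
```

The printed residuals shrink at least as fast as 2/T regardless of the policy's stationary distribution, which is the elementary fixed-policy analogue of a convergence rate that does not depend on the limit point. The paper's setting is more general than this sketch.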