Fast convergence to state-action frequency polytopes for MDPs

Authors: Mathieu Tracol
Affiliations: LRI, University Paris-Sud, France
Venue: Operations Research Letters
Year: 2009

Abstract

In the context of finite weakly communicating Markov Decision Processes, we tackle the problem of fast convergence of state-action frequency vectors to the polytope of stationary distributions on state-action frequencies. Using unichain policies, we derive bounds on the speed of convergence which are independent of the limit points.
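To make the objects in the abstract concrete, the following is a minimal illustrative sketch, not the paper's construction. It assumes the standard characterization of the polytope as the set of nonnegative vectors x summing to 1 that satisfy the balance equations sum_a x(s', a) = sum_{s, a} x(s, a) P(s' | s, a). The two-state transition table `P`, the uniformly random policy, and the function names are all hypothetical choices made for illustration. The sketch simulates the chain, builds the empirical state-action frequency vector, and reports how far it is from satisfying the balance equations.

```python
import random

# Hypothetical 2-state, 2-action MDP (an assumption for illustration only).
# P[s][a][s2] is the probability of moving from state s to s2 under action a.
P = [
    [[0.9, 0.1], [0.2, 0.8]],
    [[0.5, 0.5], [0.1, 0.9]],
]


def empirical_frequencies(policy, horizon, seed=0):
    """Return x[s][a], the fraction of the first `horizon` steps spent in (s, a)."""
    rng = random.Random(seed)
    n_states, n_actions = len(P), len(P[0])
    counts = [[0] * n_actions for _ in range(n_states)]
    state = 0
    for _ in range(horizon):
        action = policy(state, rng)
        counts[state][action] += 1
        # Sample the next state from the transition row for (state, action).
        state = rng.choices(range(n_states), weights=P[state][action])[0]
    return [[c / horizon for c in row] for row in counts]


def balance_residual(x):
    """Largest violation of the balance equations over all states."""
    n_states, n_actions = len(P), len(P[0])
    worst = 0.0
    for s2 in range(n_states):
        mass_at_s2 = sum(x[s2])
        mass_into_s2 = sum(
            x[s][a] * P[s][a][s2]
            for s in range(n_states)
            for a in range(n_actions)
        )
        worst = max(worst, abs(mass_at_s2 - mass_into_s2))
    return worst


def uniform_policy(state, rng):
    """Hypothetical stationary policy: pick one of the two actions uniformly."""
    return rng.randrange(2)


if __name__ == "__main__":
    for horizon in (100, 1_000, 10_000):
        x = empirical_frequencies(uniform_policy, horizon)
        print(horizon, balance_residual(x))
```

The residual printed here is only a proxy for the distance to the polytope, and a single simulated run says nothing about the rate bounds the paper derives; it merely shows what "convergence of state-action frequency vectors" refers to.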