Approximate Linear Programming for Average Cost MDPs

  • Authors:
  • Michael H. Veatch

  • Affiliations:
  • Department of Mathematics, Gordon College, Wenham, Massachusetts 01984

  • Venue:
  • Mathematics of Operations Research
  • Year:
  • 2013

Abstract

We consider the linear programming approach to approximate dynamic programming with an average cost objective and a finite state space. Using a Lagrangian form of the linear program (LP), the average cost error is shown to be a multiple of the best-fit differential cost error. This result is analogous to previous error bounds for a discounted cost objective. Second, bounds are derived for the average cost error and for the performance of the policy generated from the LP; these bounds involve the mixing time of the Markov decision process (MDP) under this policy or under the optimal policy. These results improve on a previous performance bound involving mixing times.
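
For context, the following is a minimal LaTeX sketch of the standard average-cost approximate LP that the abstract refers to; the notation (average cost λ, differential cost h, basis matrix Φ, weight vector r) is assumed here for illustration and is not taken verbatim from the paper.

% Standard average-cost LP for a finite MDP with cost c(x,a) and
% transition probabilities P(y|x,a); the ALP restricts the differential
% cost to the span of the basis, h = \Phi r.
\begin{align*}
\max_{\lambda,\, r}\quad & \lambda \\
\text{s.t.}\quad & \lambda + (\Phi r)(x) \;\le\; c(x,a) + \sum_{y} P(y \mid x,a)\,(\Phi r)(y)
  \qquad \forall\, x,\ a .
\end{align*}
% With h unrestricted, the optimal value equals the optimal average cost;
% restricting h = \Phi r shrinks the feasible set, so the ALP value is a
% lower bound, and the paper's error bounds relate this gap to the best
% fit of the differential cost within the span of \Phi.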