An analysis of reinforcement learning with function approximation

  • Authors:
  • Francisco S. Melo, Sean P. Meyn, M. Isabel Ribeiro

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA; Coordinated Science Lab, Urbana, IL; Institute for Systems and Robotics, Lisboa, Portugal

  • Venue:
  • Proceedings of the 25th International Conference on Machine Learning (ICML)
  • Year:
  • 2008

Abstract

We address the problem of computing the optimal Q-function in Markov decision problems with infinite state-space. We analyze the convergence properties of several variations of Q-learning when combined with function approximation, extending the analysis of TD-learning in (Tsitsiklis & Van Roy, 1996a) to stochastic control settings. We identify conditions under which such approximate methods converge with probability 1. We conclude with a brief discussion on the general applicability of our results and compare them with several related works.
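To make the setting concrete, here is a minimal sketch of Q-learning with linear function approximation, the class of methods whose convergence the paper analyzes. The toy MDP, feature map, and step-size choice below are illustrative assumptions, not taken from the paper; the update rule is the standard one, θ ← θ + α δ φ(s,a) with TD error δ = r + γ max_b θᵀφ(s',b) − θᵀφ(s,a).

```python
import numpy as np

# Hypothetical 2-state, 2-action deterministic MDP, invented purely
# for illustration (the paper treats infinite state-spaces).
n_states, n_actions = 2, 2
P = np.array([[0, 1],   # P[s, a] = next state
              [0, 1]])
R = np.array([[0.0, 1.0],  # R[s, a] = reward
              [0.0, 1.0]])

def features(s, a):
    """One-hot features over (state, action) pairs.

    This makes linear Q-learning coincide with the tabular case;
    the paper's interest is in genuinely compressed feature maps.
    """
    phi = np.zeros(n_states * n_actions)
    phi[s * n_actions + a] = 1.0
    return phi

def q_learning_linear(gamma=0.9, alpha=0.1, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    theta = np.zeros(n_states * n_actions)
    s = 0
    for _ in range(steps):
        a = int(rng.integers(n_actions))       # fixed uniform exploration policy
        s_next, r = int(P[s, a]), R[s, a]
        q_next = max(theta @ features(s_next, b) for b in range(n_actions))
        delta = r + gamma * q_next - theta @ features(s, a)   # TD error
        theta = theta + alpha * delta * features(s, a)        # gradient-style update
        s = s_next
    return theta

theta = q_learning_linear()
```

With one-hot features and this deterministic MDP the iterates approach the optimal Q-function (here Q*(s, 1) = 10 and Q*(s, 0) = 9 for γ = 0.9). With general features no such guarantee holds, which is precisely why the conditions identified in the paper matter.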