Asynchronous Stochastic Approximation and Q-Learning

  • Authors: John N. Tsitsiklis
  • Affiliations: Laboratory for Information and Decision Systems, Massachusetts Institute of Technology, Cambridge, MA 02139. jnt@athena.mit.edu
  • Venue: Machine Learning
  • Year: 1994

Abstract

We provide some general results on the convergence of a class of stochastic approximation algorithms and their parallel and asynchronous variants. We then use these results to study the Q-learning algorithm, a reinforcement learning method for solving Markov decision problems, and establish its convergence under conditions more general than previously available.
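For concreteness, here is a minimal sketch of the tabular, asynchronous Q-learning iteration the paper analyzes: at each step a single state-action pair is updated toward a sampled one-step target. The toy MDP, the uniform choice of which pair to update, the 1/n step-size schedule, and all identifiers below are illustrative assumptions, not details taken from the paper.

```python
import random

# Hypothetical 2-state, 2-action MDP (assumed for illustration):
# TRANSITIONS[s][a] = list of (probability, next_state, reward) tuples.
TRANSITIONS = {
    0: {0: [(0.9, 0, 0.0), (0.1, 1, 1.0)],
        1: [(0.5, 0, 0.0), (0.5, 1, 1.0)]},
    1: {0: [(1.0, 0, 2.0)],
        1: [(1.0, 1, 0.0)]},
}
GAMMA = 0.9  # discount factor (assumed)

def sample(s, a):
    """Sample a (next_state, reward) pair from the toy MDP."""
    u, acc = random.random(), 0.0
    for p, s2, rew in TRANSITIONS[s][a]:
        acc += p
        if u <= acc:
            return s2, rew
    return TRANSITIONS[s][a][-1][1:]

def q_learning(num_steps=200_000):
    Q = {(s, a): 0.0 for s in TRANSITIONS for a in TRANSITIONS[s]}
    counts = {key: 0 for key in Q}  # per-pair update counts
    for _ in range(num_steps):
        # Asynchronous update: only one (s, a) component of Q is revised
        # per step, chosen here uniformly at random for simplicity.
        s, a = random.choice(list(Q))
        s2, rew = sample(s, a)
        counts[(s, a)] += 1
        # Step size 1/n for the n-th update of this pair satisfies the
        # usual Robbins-Monro conditions (sum = inf, sum of squares < inf).
        alpha = 1.0 / counts[(s, a)]
        target = rew + GAMMA * max(Q[(s2, b)] for b in TRANSITIONS[s2])
        Q[(s, a)] += alpha * (target - Q[(s, a)])
    return Q

if __name__ == "__main__":
    random.seed(0)
    for key, val in sorted(q_learning().items()):
        print(key, round(val, 3))
```

The 1/n schedule is one choice satisfying the standard stochastic approximation step-size conditions, which is the kind of assumption under which convergence results of this type apply.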