A LEARNING ALGORITHM FOR DISCRETE-TIME STOCHASTIC CONTROL

  • Authors:
  • V. S. Borkar

  • Affiliations:
  • School of Technology and Computer Science, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400005, India, E-mail: borkar@tifr.res.in

  • Venue:
  • Probability in the Engineering and Informational Sciences
  • Year:
  • 2000

Abstract

A simulation-based algorithm for learning good policies for a discrete-time stochastic control process with unknown transition law is analyzed when the state and action spaces are compact subsets of Euclidean spaces. This extends the Q-learning scheme for discrete state/action problems along the lines of Baker [4]. Almost sure convergence is proved under suitable conditions.
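
For orientation, the sketch below shows the classical tabular Q-learning update for discrete state/action problems, i.e., the baseline scheme the paper extends to compact Euclidean spaces; it is not the paper's algorithm. The small random MDP, the epsilon-greedy exploration rule, and all parameter values are illustrative assumptions.

```python
# Minimal sketch of tabular Q-learning on a hypothetical discrete MDP.
# The transition kernel P, reward r, and step-size/exploration choices are
# assumptions for illustration only, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 5, 2
# Hypothetical row-stochastic transition kernel P[s, a] and reward r[s, a].
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
r = rng.random((n_states, n_actions))

gamma = 0.9                                   # discount factor
Q = np.zeros((n_states, n_actions))
visits = np.zeros((n_states, n_actions))

s = 0
for t in range(50_000):
    # Epsilon-greedy exploration (a common choice; the paper's scheme may differ).
    if rng.random() < 0.1:
        a = int(rng.integers(n_actions))
    else:
        a = int(np.argmax(Q[s]))

    s_next = rng.choice(n_states, p=P[s, a])  # simulate one transition
    visits[s, a] += 1
    alpha = 1.0 / visits[s, a]                # diminishing step size

    # Stochastic-approximation step toward the Bellman optimality target.
    Q[s, a] += alpha * (r[s, a] + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print("Learned Q-values:\n", Q)
```

In the compact state/action setting treated in the paper, a lookup table of this kind is no longer available and the Q-function must be represented and updated over continuous spaces, which is where the extension along the lines of Baker [4] and the convergence analysis come in.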