A LEARNING ALGORITHM FOR DISCRETE-TIME STOCHASTIC CONTROL

  • Authors:
  • V. S. Borkar

  • Affiliations:
  • School of Technology and Computer Science, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai 400005, India, E-mail: borkar@tifr.res.in

  • Venue:
  • Probability in the Engineering and Informational Sciences
  • Year:
  • 2000

Abstract

A simulation-based algorithm for learning good policies for a discrete-time stochastic control process with unknown transition law is analyzed when the state and action spaces are compact subsets of Euclidean spaces. This extends the Q-learning scheme for discrete state/action problems along the lines of Baker [4]. Almost sure convergence is proved under suitable conditions.
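
For orientation, the sketch below shows the classical tabular Q-learning update for discrete state/action problems, i.e., the baseline scheme the paper extends to compact Euclidean spaces; it is not the paper's algorithm. The small random MDP, the epsilon-greedy exploration rule, and all parameter values are illustrative assumptions.

```python
# Minimal sketch of tabular Q-learning on a hypothetical discrete MDP.
# The transition kernel P, reward r, and step-size/exploration choices are
# assumptions for illustration only, not taken from the paper.
import numpy as np

rng = np.random.default_rng(0)

n_states, n_actions = 5, 2
# Hypothetical row-stochastic transition kernel P[s, a] and reward r[s, a].
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
r = rng.random((n_states, n_actions))

gamma = 0.9                                   # discount factor
Q = np.zeros((n_states, n_actions))
visits = np.zeros((n_states, n_actions))

s = 0
for t in range(50_000):
    # Epsilon-greedy exploration (a common choice; the paper's scheme may differ).
    if rng.random() < 0.1:
        a = int(rng.integers(n_actions))
    else:
        a = int(np.argmax(Q[s]))

    s_next = rng.choice(n_states, p=P[s, a])  # simulate one transition
    visits[s, a] += 1
    alpha = 1.0 / visits[s, a]                # diminishing step size

    # Stochastic-approximation step toward the Bellman optimality target.
    Q[s, a] += alpha * (r[s, a] + gamma * Q[s_next].max() - Q[s, a])
    s = s_next

print("Learned Q-values:\n", Q)
```

In the compact state/action setting treated in the paper, a lookup table of this kind is no longer available and the Q-function must be represented and updated over continuous spaces, which is where the extension along the lines of Baker [4] and the convergence analysis come in.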