Interpolation-based Q-learning

Authors:
Csaba Szepesvári; William D. Smart
Affiliations:
Computer and Automation Research Institute of the Hungarian Academy of Sciences, Budapest XI, Hungary; Washington University in St. Louis, St. Louis, MO
Venue:
ICML '04: Proceedings of the Twenty-First International Conference on Machine Learning
Year:
2004


Abstract

We consider a variant of Q-learning in continuous state spaces under the total expected discounted cost criterion combined with local function approximation methods. Provided that the function approximator satisfies certain interpolation properties, the resulting algorithm is shown to converge with probability one. The limit function is shown to satisfy a fixed point equation of the Bellman type, where the fixed point operator depends on the stationary distribution of the exploration policy and the function approximation method. The basic algorithm is extended in several ways. In particular, a variant of the algorithm is obtained that is shown to converge in probability to the optimal Q function. Preliminary computer simulations are presented that confirm the validity of the approach.
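To make the idea in the abstract concrete, the following is a minimal sketch of Q-learning with an interpolating function approximator on a one-dimensional state space. It is an illustration under stated assumptions, not the authors' algorithm as published: the piecewise-linear "hat" weights, the per-node update rule, the constant step size, the toy dynamics, and every name (`hat_weights`, `q_value`, `update`, `run_demo`) are hypothetical choices made here. Costs are minimized, matching the discounted cost criterion, and actions are drawn from a fixed uniform exploration policy.

```python
import random


def hat_weights(x, grid):
    """Linear-interpolation weights of x over a sorted grid.

    The weights are nonnegative and sum to 1 (an assumed, simple
    instance of an 'interpolation property').
    """
    x = min(max(x, grid[0]), grid[-1])  # clip x into the grid range
    weights = [0.0] * len(grid)
    for i in range(len(grid) - 1):
        lo, hi = grid[i], grid[i + 1]
        if lo <= x <= hi:
            t = (x - lo) / (hi - lo)
            weights[i] = 1.0 - t
            weights[i + 1] = t
            break
    return weights


def q_value(theta, weights, action):
    """Interpolated Q(x, a) = sum_i w_i(x) * theta[i][a]."""
    return sum(w * row[action] for w, row in zip(weights, theta))


def update(theta, grid, x, action, cost, y, gamma, alpha):
    """One sample update from the transition (x, action, cost, y).

    Assumed rule: each node i moves toward the one-step target in
    proportion to its interpolation weight at x.
    """
    w_x = hat_weights(x, grid)
    w_y = hat_weights(y, grid)
    n_actions = len(theta[0])
    # Costs are minimized, so the backup uses min over next actions.
    target = cost + gamma * min(q_value(theta, w_y, b) for b in range(n_actions))
    for i, w_i in enumerate(w_x):
        if w_i > 0.0:
            theta[i][action] += alpha * w_i * (target - theta[i][action])


def run_demo(steps=5000, seed=0):
    """Toy problem (hypothetical): state in [0, 1], action 0 moves left,
    action 1 moves right, and the per-step cost equals the current state."""
    rng = random.Random(seed)
    grid = [i / 10 for i in range(11)]
    theta = [[0.0, 0.0] for _ in grid]
    x = rng.random()
    for _ in range(steps):
        action = rng.randrange(2)  # uniform exploration policy
        step = -0.1 if action == 0 else 0.1
        y = min(max(x + step + rng.uniform(-0.02, 0.02), 0.0), 1.0)
        update(theta, grid, x, action, cost=x, y=y, gamma=0.9, alpha=0.1)
        x = y
    return grid, theta


if __name__ == "__main__":
    grid, theta = run_demo()
    for node, row in zip(grid, theta):
        print(f"x={node:.1f}  Q(left)={row[0]:.3f}  Q(right)={row[1]:.3f}")
```

Because only the nodes adjacent to the visited state have nonzero weight, each sample changes at most two rows of `theta`, which is what makes the approximator local. A decaying step size would be the more usual choice for a convergence argument; the constant `alpha` here is only for brevity.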