The convergence properties of reinforcement learning have been extensively investigated in the field of machine learning; however, its application to real-world problems remains constrained by computational complexity. This paper presents a novel algorithm that improves the applicability and efficiency of reinforcement learning through adaptive state space partitioning. The proposed temporal difference learning with adaptive vector quantization (TD-AVQ) is an online algorithm and assumes no a priori knowledge of the learning task or environment. It partitions the state space using only the information already generated by the reinforcement learning process, so no additional computation is required to decide how a particular state space should be partitioned. A series of simulations demonstrates the practical value and performance of the proposed algorithm in solving robot motion planning problems.
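To make the idea concrete, the following is a minimal sketch of TD(0) learning over an adaptively grown codebook of prototype vectors, in the spirit of vector-quantization-based state partitioning. The class name, parameters, and the distance-threshold growth rule are illustrative assumptions, not the published TD-AVQ algorithm.

```python
import numpy as np

class TDAVQSketch:
    """Illustrative sketch (not the published TD-AVQ): TD(0) value
    learning over a codebook of prototype vectors that adaptively
    partitions a continuous state space."""

    def __init__(self, alpha=0.1, gamma=0.95, grow_threshold=0.5):
        self.alpha = alpha                    # TD learning rate (assumed value)
        self.gamma = gamma                    # discount factor (assumed value)
        self.grow_threshold = grow_threshold  # distance that triggers a new cell
        self.prototypes = []                  # codebook vectors (region centers)
        self.values = []                      # one value estimate per region

    def _nearest(self, state):
        # Index and distance of the closest prototype to the given state.
        dists = [np.linalg.norm(state - p) for p in self.prototypes]
        return int(np.argmin(dists)), min(dists)

    def region(self, state):
        """Quantize a state; grow the partition if no prototype is close.

        The growth decision reuses the same nearest-neighbor distance
        already computed for quantization, so partitioning adds no
        separate pass over the data.
        """
        state = np.asarray(state, dtype=float)
        if not self.prototypes:
            self.prototypes.append(state.copy())
            self.values.append(0.0)
            return 0
        idx, dist = self._nearest(state)
        if dist > self.grow_threshold:        # state poorly represented: add a cell
            self.prototypes.append(state.copy())
            self.values.append(self.values[idx])  # warm-start from the neighbor
            return len(self.prototypes) - 1
        return idx

    def td_update(self, state, reward, next_state):
        """Standard TD(0) update applied on the quantized regions."""
        i, j = self.region(state), self.region(next_state)
        td_error = reward + self.gamma * self.values[j] - self.values[i]
        self.values[i] += self.alpha * td_error
        return td_error
```

In use, each environment transition `(state, reward, next_state)` is fed to `td_update`; regions appear on demand wherever visited states are far from all existing prototypes, so frequently visited parts of the state space end up more finely partitioned.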