Reinforcement learning using a grid based function approximator

  • Authors:
  • Alexander Sung; Artur Merke; Martin Riedmiller

  • Affiliations:
  • Institute of Computer Science, University of Tübingen, Tübingen, Germany; Institute of Computer Science, University of Dortmund, Dortmund, Germany; Institute of Computer Science, University of Osnabrück, Osnabrück, Germany

  • Venue:
  • Biomimetic Neural Learning for Intelligent Robots
  • Year:
  • 2005

Abstract

Function approximators are commonly used in reinforcement learning systems to generalize to untrained situations when dealing with large problems. However, theoretical proofs of convergence criteria are still lacking, and practical research has produced both positive and negative results. In recent work with neural networks [3], the authors reported that the final results did not reach the quality of a Q-table, which involves no approximation. In this paper, we continue this research with grid-based function approximators. In addition, we consider the required number of state transitions and apply ideas from the field of active learning to reduce this number. We expect the learning process for a similar problem in a real-world system to be significantly shorter, because state transitions, which represent an object's actual movements, require much more time than basic computational processes.
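
The following is a minimal sketch, not the authors' implementation, of the general technique the abstract describes: Q-learning on a continuous state space where Q-values are stored at the nodes of a uniform grid and evaluated by linear interpolation between neighboring nodes. The toy task (a point on [-1, 1] that must reach a goal region around the origin), the grid resolution, and all learning parameters are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy task: a point on [-1, 1] must reach the region |s| < 0.05.
ACTIONS = np.array([-0.1, 0.1])        # illustrative discrete action set
N_NODES = 21                           # grid resolution (assumption)
ALPHA, GAMMA, EPS = 0.2, 0.95, 0.1     # illustrative learning parameters

nodes = np.linspace(-1.0, 1.0, N_NODES)
step_w = nodes[1] - nodes[0]
q = np.zeros((N_NODES, len(ACTIONS)))  # Q-values stored at the grid nodes


def cell(s):
    """Index of the left grid node enclosing s, and the interpolation weight."""
    s = float(np.clip(s, nodes[0], nodes[-1]))
    i = min(int((s - nodes[0]) / step_w), N_NODES - 2)
    return i, (s - nodes[i]) / step_w


def q_value(s, a):
    """Q(s, a) by linear interpolation between the two enclosing grid nodes."""
    i, w = cell(s)
    return (1.0 - w) * q[i, a] + w * q[i + 1, a]


def env_step(s, a):
    """One state transition of the toy task: move, then check the goal."""
    s_next = float(np.clip(s + ACTIONS[a], -1.0, 1.0))
    done = abs(s_next) < 0.05
    return s_next, (1.0 if done else -0.01), done


for episode in range(500):
    s = rng.uniform(-1.0, 1.0)
    for _ in range(100):
        if rng.random() < EPS:         # epsilon-greedy exploration
            a = int(rng.integers(len(ACTIONS)))
        else:
            a = int(np.argmax([q_value(s, b) for b in range(len(ACTIONS))]))
        s_next, r, done = env_step(s, a)
        target = r if done else r + GAMMA * max(
            q_value(s_next, b) for b in range(len(ACTIONS)))
        td = target - q_value(s, a)
        i, w = cell(s)
        # Distribute the TD update over the two nodes defining Q(s, a),
        # weighted by their interpolation coefficients.
        q[i, a] += ALPHA * (1.0 - w) * td
        q[i + 1, a] += ALPHA * w * td
        s = s_next
        if done:
            break
```

Note that with one grid node per distinct state and no interpolation, the same update degenerates to an exact Q-table, which is the no-approximation baseline the abstract compares against.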