Reward function and initial values: better choices for accelerated goal-directed reinforcement learning

Authors:
Laëtitia Matignon;Guillaume J. Laurent;Nadine Le Fort-Piat
Affiliations:
Laboratoire d’Automatique de Besançon UMR CNRS 6596, Besançon, France;Laboratoire d’Automatique de Besançon UMR CNRS 6596, Besançon, France;Laboratoire d’Automatique de Besançon UMR CNRS 6596, Besançon, France
Venue:
ICANN'06 Proceedings of the 16th international conference on Artificial Neural Networks - Volume Part I
Year:
2006

Abstract

An important issue in Reinforcement Learning (RL) is how to accelerate or improve the learning process. In this paper, we study the influence of some RL parameters on learning speed. Although the convergence properties of RL have been widely studied, no precise rules exist for choosing the reward function and the initial Q-values. Our method guides the choice of these RL parameters in the context of reaching a goal in minimal time. We develop a theoretical study and provide experimental justification for choosing, on the one hand, the reward function and, on the other hand, particular initial Q-values based on a goal bias function.
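
To make the two parameters discussed in the abstract concrete, the following is a minimal sketch of tabular Q-learning in which the initial Q-values and the reward are explicit choices. It is not the authors' method: the 1-D corridor environment, the 0/1 reward, and the helper names (`init_q`, `q_update`, `run_episode`, `distance_bias`) are all assumptions made for illustration, and `distance_bias` is only one plausible shape for a goal bias function.

```python
import random


def init_q(n_states, n_actions, goal, bias=None, default=0.0):
    """Build a tabular Q-function as a list of per-state action values.

    `bias(state, goal)` is a hypothetical goal-bias function (an assumption,
    not the paper's definition); without it every entry starts at `default`.
    """
    return [
        [bias(s, goal) if bias else default for _ in range(n_actions)]
        for s in range(n_states)
    ]


def q_update(q, s, a, r, s_next, alpha=0.1, gamma=0.9, terminal=False):
    """Standard one-step Q-learning update, applied in place."""
    target = r if terminal else r + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])


def run_episode(q, goal, rng, epsilon=0.1, max_steps=1000):
    """Run one epsilon-greedy episode in a 1-D corridor; return steps taken.

    Actions: 0 = left, 1 = right. Assumed reward: 1 on reaching the goal,
    0 otherwise. The agent always starts in state 0.
    """
    n_states = len(q)
    s = 0
    for step in range(1, max_steps + 1):
        if rng.random() < epsilon:
            a = rng.randrange(2)  # explore
        else:
            best = max(q[s])  # exploit, breaking ties at random
            a = rng.choice([i for i, v in enumerate(q[s]) if v == best])
        s_next = min(max(s + (1 if a == 1 else -1), 0), n_states - 1)
        done = s_next == goal
        q_update(q, s, a, 1.0 if done else 0.0, s_next, terminal=done)
        s = s_next
        if done:
            return step
    return max_steps


def distance_bias(s, goal, gamma=0.9):
    """Illustrative bias: larger initial values for states closer to the goal."""
    return gamma ** abs(goal - s)


if __name__ == "__main__":
    rng = random.Random(0)
    for name, bias in [("uniform zero", None), ("goal-biased", distance_bias)]:
        q = init_q(10, 2, goal=9, bias=bias)
        steps = [run_episode(q, 9, rng) for _ in range(20)]
        print(name, steps)
```

Running the script prints the per-episode step counts for a zero-initialized table and for a goal-biased one, which gives a rough feel for how initialization alone can change early exploration in this toy setting.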