Knowledge-Based Exploration for Reinforcement Learning in Self-Organizing Neural Networks
WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02
Reinforcement learning is based on exploring the environment and receiving rewards that indicate which of the agent's actions are good and which are bad. In many applications, even the first reward may require long exploration, during which the agent has no information about its progress. This paper presents an approach that uses pre-existing knowledge about the task to guide exploration through the state space. Concepts of short- and long-term memory combine guidance by pre-existing knowledge with reinforcement learning methods for value-function estimation, making learning faster while still allowing the agent to converge towards a good policy.