Knowledge-Based Exploration for Reinforcement Learning in Self-Organizing Neural Networks
WI-IAT '12 Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02
Reinforcement learning is based on exploring the environment and receiving rewards that indicate which of the agent's actions are good and which are bad. In many applications, even the first reward may require long exploration, during which the agent has no information about its progress. This paper presents an approach that uses pre-existing knowledge about the task to guide exploration through the state space. Concepts of short- and long-term memory combine guidance by pre-existing knowledge with reinforcement learning methods for value-function estimation, making learning faster while still allowing the agent to converge towards a good policy.