How should a reinforcement learning agent act if its sole purpose is to efficiently learn an optimal policy for later use? In other words, how should it explore now so that it can exploit later? We formulate this problem as a Markov decision process by explicitly modeling the internal state of the agent, and we propose a principled heuristic for its solution. We present experimental results in a number of domains and also explore the algorithm's use for learning a policy for a skill given its reward function, an important but neglected component of skill discovery.