The exploration-exploitation dilemma remains an intriguing open problem in reinforcement learning. "Optimism in the face of uncertainty" and model building play central roles in advanced exploration methods. Here, we integrate several of these concepts into a fast and simple algorithm. We show that the proposed algorithm finds a near-optimal policy in polynomial time, and give experimental evidence that it is robust and efficient compared to its predecessors.
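To make the two ingredients named above concrete, the following is a minimal sketch of how "optimism in the face of uncertainty" can be combined with a learned model in a small tabular setting. It is not the algorithm proposed in this work, whose details the abstract does not give. It is a generic illustration in the spirit of R-MAX-style methods, and the function name `optimistic_q_values`, the visit threshold `m`, and the array layout are all assumptions made for the example.

```python
import numpy as np

def optimistic_q_values(counts, reward_sums, r_max, gamma=0.95, m=1, sweeps=500):
    """Value iteration on an empirical model, optimistic for under-visited pairs.

    counts[s, a, s2]  : number of observed transitions s --a--> s2
    reward_sums[s, a] : total reward collected after taking a in s
    Pairs (s, a) tried fewer than m times (hypothetical threshold) are
    assigned the largest possible discounted return.
    """
    n_sa = counts.sum(axis=2)                 # visits per (state, action)
    known = n_sa >= m
    safe_n = np.maximum(n_sa, 1)              # avoid division by zero
    p_hat = counts / safe_n[:, :, None]       # empirical transition probabilities
    r_hat = reward_sums / safe_n              # empirical mean rewards
    v_max = r_max / (1.0 - gamma)             # upper bound on any discounted return

    q = np.full(n_sa.shape, v_max)            # optimistic initialization
    for _ in range(sweeps):
        v = q.max(axis=1)                     # greedy state values
        backup = r_hat + gamma * (p_hat @ v)  # Bellman backup under the learned model
        q = np.where(known, backup, v_max)    # unknown pairs stay optimistic
    return q

# Toy usage: 2 states, 2 actions, one observed transition with zero reward.
counts = np.zeros((2, 2, 2))
reward_sums = np.zeros((2, 2))
counts[0, 0, 0] = 1                           # tried action 0 in state 0, stayed in state 0
q = optimistic_q_values(counts, reward_sums, r_max=1.0)
greedy_action_in_state_0 = int(q[0].argmax())
```

In this toy run the greedy policy in state 0 prefers the action that has not been tried yet, because its value is still pinned at the optimistic bound while the tried action has been backed up through the learned model. Acting greedily on optimistic estimates is what drives exploration in this kind of scheme.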