- A theoretical analysis of Model-Based Interval Estimation. ICML '05: Proceedings of the 22nd International Conference on Machine Learning.
- An analysis of model-based Interval Estimation for Markov Decision Processes. Journal of Computer and System Sciences.
- Optimism in the Face of Uncertainty Should be Refutable. Minds and Machines.
- IJCAI '05: Proceedings of the 19th International Joint Conference on Artificial Intelligence.
- Efficient Uncertainty Propagation for Reinforcement Learning with Limited Data. ICANN '09: Proceedings of the 19th International Conference on Artificial Neural Networks, Part I.
- V-MAX: Tempered Optimism for Better PAC Reinforcement Learning. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, Volume 1.
This paper takes an empirical approach to evaluating three model-based reinforcement-learning methods. All three methods aim to speed learning by combining exploitation of learned knowledge with exploration of potentially promising alternatives. We consider ε-greedy exploration, which is computationally cheap and popular but unfocused in its exploration effort; R-Max exploration, a simplification of an exploration scheme that comes with a theoretical guarantee of efficiency; and model-based interval estimation, a well-grounded approach that more tightly integrates exploration and exploitation. Our experiments indicate that effective exploration can yield dramatic improvements in the observed rate of learning.
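To make the contrast between unfocused and directed exploration concrete, here is a minimal sketch of the two action-selection styles the abstract mentions. The function names, the tabular `Q` dictionary, the visit-count table, and the simple `beta / sqrt(n + 1)` optimism bonus are all illustrative assumptions for this sketch, not the exact algorithms or bounds evaluated in the paper:

```python
import random

def epsilon_greedy_action(Q, state, actions, epsilon=0.1):
    """Unfocused exploration: with probability epsilon take a uniformly
    random action; otherwise act greedily on the current value estimates."""
    if random.random() < epsilon:
        return random.choice(actions)
    return max(actions, key=lambda a: Q[(state, a)])

def interval_estimation_action(Q, counts, state, actions, beta=1.0):
    """Directed exploration in the spirit of interval estimation: act
    greedily with respect to an optimistic upper bound, so rarely tried
    actions look attractive. The bonus term here is a hypothetical
    stand-in, not the actual MBIE confidence interval."""
    def upper_bound(a):
        n = counts.get((state, a), 0)  # visits to (state, action) so far
        return Q[(state, a)] + beta / ((n + 1) ** 0.5)
    return max(actions, key=upper_bound)
```

With `epsilon = 0` the first function is purely greedy, while the second still prefers an under-explored action whose optimism bonus outweighs a small gap in estimated value; this is the sense in which interval estimation "better integrates exploration and exploitation."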