Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents a theoretical analysis of MBIE and a new variation called MBIE-EB, proving their efficiency even under worst-case conditions. The paper also introduces a new performance metric, average loss, and relates it to its less "online" cousins from the literature.