An analysis of model-based Interval Estimation for Markov Decision Processes
Journal of Computer and System Sciences
Several algorithms for learning near-optimal policies in Markov Decision Processes have been analyzed and proven efficient. Empirical results have suggested that Model-based Interval Estimation (MBIE) learns efficiently in practice, effectively balancing exploration and exploitation. This paper presents the first theoretical analysis of MBIE, proving its efficiency even under worst-case conditions. The paper also introduces a new performance metric, average loss, and relates it to its less "online" cousins from the literature.
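To make the exploration and exploitation trade-off concrete, the following is a minimal sketch of optimistic planning on an empirical model, in the spirit of interval estimation. It is not the paper's exact algorithm: the function name `optimistic_q_values`, the bonus form `beta / sqrt(n)`, the cap at `r_max / (1 - gamma)`, and all parameter defaults are assumptions chosen for illustration.

```python
import math


def optimistic_q_values(counts, reward_sums, next_counts,
                        gamma=0.95, beta=1.0, r_max=1.0, sweeps=200):
    """Value iteration on an empirical MDP model with a count-based bonus.

    counts[s][a]          -- times action a was tried in state s
    reward_sums[s][a]     -- total reward observed for (s, a)
    next_counts[s][a][s2] -- times (s, a) led to state s2
    All names and the bonus form are illustrative assumptions.
    """
    n_states = len(counts)
    n_actions = len(counts[0])
    v_max = r_max / (1.0 - gamma)  # largest possible discounted return
    q = [[v_max] * n_actions for _ in range(n_states)]

    for _ in range(sweeps):
        for s in range(n_states):
            for a in range(n_actions):
                n = counts[s][a]
                if n == 0:
                    # Untried pairs stay maximally optimistic.
                    q[s][a] = v_max
                    continue
                r_hat = reward_sums[s][a] / n
                # Expected next-state value under empirical transitions.
                backup = sum(
                    (next_counts[s][a][s2] / n) * max(q[s2])
                    for s2 in range(n_states)
                )
                # The bonus shrinks as (s, a) is sampled more often.
                bonus = beta / math.sqrt(n)
                q[s][a] = min(v_max, r_hat + gamma * backup + bonus)
    return q
```

An agent acting greedily with respect to these values is drawn toward rarely tried state-action pairs while their bonus is large, and shifts to exploiting the estimated model as the counts grow.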