Optimistic agents are asymptotically optimal

Authors:
Peter Sunehag; Marcus Hutter
Affiliations:
Research School of Computer Science, Australian National University, Canberra, Australia; Research School of Computer Science, Australian National University, Canberra, Australia
Venue:
AI'12 Proceedings of the 25th Australasian joint conference on Advances in Artificial Intelligence
Year:
2012

Abstract

We use optimism to introduce generic asymptotically optimal reinforcement learning agents. They achieve, with an arbitrary finite or compact class of environments, asymptotically optimal behavior. Furthermore, in the finite deterministic case we provide finite error bounds.