Optimistic AIXI

Authors:
Peter Sunehag; Marcus Hutter
Affiliations:
Research School of Computer Science, Australian National University, Canberra, Australia (both authors)
Venue:
AGI'12 Proceedings of the 5th international conference on Artificial General Intelligence
Year:
2012

Abstract

We consider extending the AIXI agent by using multiple priors, or even a compact class of priors. This weakens the conditions on the true environment that are needed to prove asymptotic optimality. Furthermore, it decreases the arbitrariness of picking the prior or reference machine. We connect this to removing the symmetry between accepting and rejecting bets in the rationality axiomatization of AIXI and replacing it with optimism. Optimism is often used to encourage exploration in the more restrictive Markov Decision Process setting, and it alleviates the problem that AIXI (with geometric discounting) stops exploring prematurely.
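As a rough illustration of the contrast drawn in the abstract, the following toy sketch compares a single mixed prior with an optimistic choice over a small class of priors. It is not the authors' construction: the names `optimistic_action` and `mixture_action`, the one-step setting, and all numbers are assumptions made for this example, and each "model" simply lists a hypothetical expected value per action.

```python
from typing import Dict, List

# Hypothetical stand-in for a prior: maps each action to its expected value.
Model = Dict[str, float]


def optimistic_action(models: List[Model]) -> str:
    """Pick the action whose best-case value over all models is largest."""
    actions = models[0].keys()
    return max(actions, key=lambda a: max(m[a] for m in models))


def mixture_action(models: List[Model], weights: List[float]) -> str:
    """Pick the action with the largest weighted-average value (one mixed prior)."""
    actions = models[0].keys()
    return max(actions, key=lambda a: sum(w * m[a] for w, m in zip(weights, models)))


# Two illustrative models that agree on "safe" but disagree on "explore".
models = [
    {"safe": 0.5, "explore": 0.1},
    {"safe": 0.5, "explore": 0.9},
]
weights = [0.8, 0.2]  # assumed prior weights for the mixture agent

opt = optimistic_action(models)
mix = mixture_action(models, weights)
print(opt, mix)
```

In this toy setting the mixture agent settles on the safe action because the favourable model carries little weight, while the optimistic agent still tries the uncertain action because at least one model in the class rates it highly.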