The many faces of optimism: a unifying approach

Authors:
István Szita;András Lőrincz
Affiliations:
Eötvös Loránd University, Budapest, Hungary;Eötvös Loránd University, Budapest, Hungary
Venue:
Proceedings of the 25th international conference on Machine learning
Year:
2008

Citing 11
Cited 8

Efficient model-based exploration

Proceedings of the fifth international conference on simulation of adaptive behavior on From animals to animats 5
Exploration of Multi-State Environments: Local Measures and Back-Propagation of Uncertainty

Machine Learning
Convergence Results for Single-Step On-PolicyReinforcement-Learning Algorithms

Machine Learning
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning
Near-Optimal Reinforcement Learning in Polynomial Time

Machine Learning
Near-Optimal Reinforcement Learning in Polynominal Time

ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning
A Bayesian Framework for Reinforcement Learning

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Learning and planning in structured worlds

Learning and planning in structured worlds
A theoretical analysis of Model-Based Interval Estimation

ICML '05 Proceedings of the 22nd international conference on Machine learning
An analysis of model-based Interval Estimation for Markov Decision Processes

Journal of Computer and System Sciences
R-MAX: a general polynomial time algorithm for near-optimal reinforcement learning

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2

Optimistic initialization and greediness lead to polynomial time learning in factored MDPs

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Considering Unseen States as Impossible in Factored Reinforcement Learning

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Exploring compact reinforcement-learning representations with linear regression

UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Improving optimistic exploration in model-free reinforcement learning

ICANNGA'09 Proceedings of the 9th international conference on Adaptive and natural computing algorithms
TeXDYNA: hierarchical reinforcement learning in factored MDPs

SAB'10 Proceedings of the 11th international conference on Simulation of adaptive behavior: from animals to animats
AGI architecture measures human parameters and optimizes human performance

AGI'11 Proceedings of the 4th international conference on Artificial general intelligence
Exploration strategies for learning in multi-agent foraging

SEMCCO'11 Proceedings of the Second international conference on Swarm, Evolutionary, and Memetic Computing - Volume Part II
V-MAX: tempered optimism for better PAC reinforcement learning

Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1

Quantified Score

Hi-index	0.00

Visualization

Abstract

The exploration-exploitation dilemma has been an intriguing and unsolved problem within the framework of reinforcement learning. "Optimism in the face of uncertainty" and model building play central roles in advanced exploration methods. Here, we integrate several concepts and obtain a fast and simple algorithm. We show that the proposed algorithm finds a near-optimal policy in polynomial time, and give experimental evidence that it is robust and efficient compared to its ascendants.