Bayesian sparse sampling for on-line reward optimization

Authors: Tao Wang, Daniel Lizotte, Michael Bowling, Dale Schuurmans
Affiliation (all authors): University of Alberta, Edmonton, Canada
Venue: ICML '05, Proceedings of the 22nd International Conference on Machine Learning
Year: 2005

Abstract

We present an efficient "sparse sampling" technique for approximating Bayes-optimal decision making in reinforcement learning, addressing the well-known exploration versus exploitation tradeoff. Our approach combines sparse sampling with Bayesian exploration to improve decision making while controlling computational cost. The idea is to grow a sparse lookahead tree selectively, exploiting the information in a Bayesian posterior, rather than enumerating every action branch (as in standard sparse sampling) or compensating myopically (as in value of perfect information). The outcome is a flexible, practical technique for improving action selection in simple reinforcement learning scenarios.
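
The idea of growing a lookahead tree from a posterior can be illustrated with a small sketch. This is not the authors' implementation: it assumes a Bernoulli bandit with independent Beta posteriors per arm, and it assumes Thompson-style posterior sampling as the rule for deciding which action branch to expand at each node. The names `Node`, `update`, `thompson_action`, `grow`, `q_values`, and `choose_action` are hypothetical and exist only for this example.

```python
import random

class Node:
    """Decision node holding a belief: one (alpha, beta) Beta posterior per arm."""
    def __init__(self, belief):
        self.belief = belief
        self.children = {}  # arm -> {reward: [child Node, visit count]}

def update(belief, arm, reward):
    """Conjugate Beta-Bernoulli update after observing reward (0 or 1) on arm."""
    a, b = belief[arm]
    return belief[:arm] + ((a + reward, b + 1 - reward),) + belief[arm + 1:]

def thompson_action(belief, rng):
    """Assumed selection rule: draw one mean per arm, act greedily on the draw."""
    draws = [rng.betavariate(a, b) for a, b in belief]
    return max(range(len(belief)), key=draws.__getitem__)

def grow(root, depth, n_descents, rng):
    """Grow a sparse tree by repeated root-to-leaf descents guided by the posterior."""
    for _ in range(n_descents):
        node = root
        for _ in range(depth):
            arm = thompson_action(node.belief, rng)
            a, b = node.belief[arm]
            # Sample the outcome from the posterior predictive distribution.
            reward = 1 if rng.random() < a / (a + b) else 0
            outcomes = node.children.setdefault(arm, {})
            if reward not in outcomes:
                outcomes[reward] = [Node(update(node.belief, arm, reward)), 0]
            outcomes[reward][1] += 1
            node = outcomes[reward][0]

def q_values(node, depth_left):
    """Back up values: average over sampled outcomes, max over expanded arms."""
    qs = {}
    if depth_left == 0:
        return qs
    for arm, outcomes in node.children.items():
        total = sum(count for _, count in outcomes.values())
        qs[arm] = sum(
            count / total
            * (reward + max(q_values(child, depth_left - 1).values(), default=0.0))
            for reward, (child, count) in outcomes.items()
        )
    return qs

def choose_action(belief, depth=3, n_descents=50, seed=0):
    """Return the arm with the highest backed-up value, plus the root estimates."""
    rng = random.Random(seed)
    root = Node(tuple(belief))
    grow(root, depth, n_descents, rng)
    qs = q_values(root, depth)
    return max(qs, key=qs.get), qs

if __name__ == "__main__":
    # Arm 0 is almost certainly bad, arm 1 almost certainly good.
    arm, qs = choose_action(((1, 1000), (1000, 1)))
    print(arm, qs)
```

In this sketch only the branches that posterior sampling actually visits are ever created, so arms the posterior considers implausible consume little or none of the node budget `n_descents * depth`. Arms that were never expanded at a node get no value estimate, which is one of several simplifications here; the finite horizon with zero value at the leaves is another.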