Linear Bayesian reinforcement learning

Authors:
Nikolaos Tziortziotis;Christos Dimitrakakis;Konstantinos Blekas
Affiliations:
University of Ioannina;EPFL;University of Ioannina
Venue:
IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Year:
2013

Citing 18
Cited 0

Linear least-squares algorithms for temporal difference learning

Machine Learning - Special issue on reinforcement learning
A Bayesian Framework for Reinforcement Learning

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Optimal learning: computational procedures for bayes-adaptive markov decision processes

Optimal learning: computational procedures for bayes-adaptive markov decision processes
Least-squares policy iteration

The Journal of Machine Learning Research
Tree-Based Batch Mode Reinforcement Learning

The Journal of Machine Learning Research
Reinforcement learning with Gaussian processes

ICML '05 Proceedings of the 22nd international conference on Machine learning
Bayesian sparse sampling for on-line reward optimization

ICML '05 Proceedings of the 22nd international conference on Machine learning
An analytic solution to discrete Bayesian reinforcement learning

ICML '06 Proceedings of the 23rd international conference on Machine learning
Online kernel selection for Bayesian reinforcement learning

Proceedings of the 25th international conference on Machine learning
Gaussian process dynamic programming

Neurocomputing
Near-Bayesian exploration in polynomial time

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Tree Exploration for Bayesian RL Exploration

CIMCA '08 Proceedings of the 2008 International Conference on Computational Intelligence for Modelling Control & Automation
A Bayesian sampling approach to exploration in reinforcement learning

UAI '09 Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence
Near-optimal Regret Bounds for Reinforcement Learning

The Journal of Machine Learning Research
Smarter sampling in model-based Bayesian reinforcement learning

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Gaussian processes for sample efficient reinforcement learning with RMAX-like exploration

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Robust bayesian reinforcement learning through tight lower bounds

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Thompson sampling: an asymptotically optimal finite-time analysis

ALT'12 Proceedings of the 23rd international conference on Algorithmic Learning Theory

Quantified Score

Hi-index	0.00

Visualization

Abstract

This paper proposes a simple linear Bayesian approach to reinforcement learning. We show that with an appropriate basis, a Bayesian linear Gaussian model is sufficient for accurately estimating the system dynamics, and in particular when we allow for correlated noise. Policies are estimated by first sampling a transition model from the current posterior, and then performing approximate dynamic programming on the sampled model. This form of approximate Thompson sampling results in good exploration in unknown environments. The approach can also be seen as a Bayesian generalisation of least-squares policy iteration, where the empirical transition matrix is replaced with a sample from the posterior.