Bayesian Classification With Gaussian Processes
IEEE Transactions on Pattern Analysis and Machine Intelligence
Bayesian Learning for Neural Networks
Optimal learning: computational procedures for Bayes-adaptive Markov decision processes
Information Theory, Inference & Learning Algorithms
Bayesian sparse sampling for on-line reward optimization
ICML '05 Proceedings of the 22nd international conference on Machine learning
An analytic solution to discrete Bayesian reinforcement learning
ICML '06 Proceedings of the 23rd international conference on Machine learning
Point-Based Value Iteration for Continuous POMDPs
The Journal of Machine Learning Research
Gaussian Process Dynamical Models for Human Motion
IEEE Transactions on Pattern Analysis and Machine Intelligence
Reinforcement learning with limited reinforcement: using Bayes risk for active learning in POMDPs
Proceedings of the 25th international conference on Machine learning
Perseus: randomized point-based value iteration for POMDPs
Journal of Artificial Intelligence Research
Using linear programming for Bayesian exploration in Markov decision processes
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Adaptive collective routing using Gaussian process dynamic congestion models
Proceedings of the 19th ACM SIGKDD international conference on Knowledge discovery and data mining
Planning for multiple measurement channels in a continuous-state POMDP
Annals of Mathematics and Artificial Intelligence
Partially Observable Markov Decision Processes (POMDPs) provide a rich mathematical framework for real-world sequential decision making, but most solution methods require the model to be known in advance. Moreover, mainstream POMDP research focuses on the discrete case, which complicates application to realistic problems that are naturally modeled with continuous state spaces. In this paper, we consider the problem of optimal control in continuous, partially observable environments when the parameters of the model are unknown. We advocate the use of Gaussian Process Dynamical Models (GPDMs), which allow the model to be learned from experience with the environment. Our results on the blimp control problem show that the approach can learn good models of the sensors and actuators in order to maximize long-term rewards.
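As a rough, self-contained sketch of the model-learning idea in the abstract, the Python snippet below fits Gaussian process regression from (state, action) inputs to next states with a fixed squared-exponential kernel. This is an illustration under assumptions, not the paper's method: a full GPDM additionally learns a low-dimensional latent trajectory and optimizes kernel hyperparameters via the marginal likelihood, both of which this sketch omits. The class name GPDynamicsModel and the toy one-dimensional dynamics are invented for the example.

import numpy as np

def rbf_kernel(A, B, length_scale=1.0, signal_var=1.0):
    # Squared-exponential kernel between the rows of A and B.
    sq = np.sum(A**2, axis=1)[:, None] + np.sum(B**2, axis=1)[None, :] - 2.0 * A @ B.T
    return signal_var * np.exp(-0.5 * sq / length_scale**2)

class GPDynamicsModel:
    # Hypothetical helper: GP regression from (state, action) pairs to
    # next states; the paper's GPDM also involves latent states.
    def __init__(self, noise_var=1e-2):
        self.noise_var = noise_var

    def fit(self, X, Y):
        # X: (N, state_dim + action_dim) inputs; Y: (N, state_dim) next states.
        self.X = X
        K = rbf_kernel(X, X) + self.noise_var * np.eye(len(X))
        self.L = np.linalg.cholesky(K)  # K = L @ L.T
        self.alpha = np.linalg.solve(self.L.T, np.linalg.solve(self.L, Y))
        return self

    def predict(self, Xs):
        # Posterior mean and per-point predictive variance at test inputs Xs.
        Ks = rbf_kernel(Xs, self.X)
        mean = Ks @ self.alpha
        v = np.linalg.solve(self.L, Ks.T)
        var = rbf_kernel(Xs, Xs).diagonal() - np.sum(v**2, axis=0)
        return mean, var

# Toy usage: learn noisy 1-D dynamics s' = s + a from observed transitions.
rng = np.random.default_rng(0)
S = rng.uniform(-1.0, 1.0, (50, 1))
A = rng.uniform(-0.5, 0.5, (50, 1))
X = np.hstack([S, A])
Y = S + A + 0.05 * rng.standard_normal((50, 1))
model = GPDynamicsModel().fit(X, Y)
mean, var = model.predict(np.array([[0.3, 0.2]]))

The predictive variance is what a Bayesian controller would exploit: inputs far from the observed transitions receive high variance, so a planner can trade off exploring poorly modeled regions of the state-action space against exploiting the current model.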