Gaussian Process Temporal Difference (GPTD) learning offers a Bayesian solution to the policy evaluation problem of reinforcement learning. In this paper we extend the GPTD framework by addressing two pressing issues that were not adequately treated in the original GPTD paper (Engel et al., 2003). The first is stochasticity in the state transitions; the second is action selection and policy improvement. We present a new generative model for the value function, deduced from its relation to the discounted return, and derive a corresponding on-line algorithm for learning the posterior moments of the value Gaussian process. We also present a SARSA-based extension of GPTD, termed GPSARSA, that allows the selection of actions and the gradual improvement of policies without requiring a world model.
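To make the idea of "posterior moments of the value Gaussian process" concrete, the following is a minimal batch sketch in Python with NumPy. It is not the on-line algorithm the abstract refers to, and none of its details come from the abstract itself. It assumes a commonly used form of the GPTD model for a single non-terminating trajectory, r_t = V(x_t) - γ V(x_{t+1}) + N_t, with a zero-mean GP prior on V and noise covariance σ² H Hᵀ induced by i.i.d. return residuals. The RBF kernel, the function names `rbf_kernel` and `gptd_posterior`, and all parameter values are illustrative choices.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Squared-exponential kernel between rows of a (n, d) and b (m, d)."""
    sq_dist = ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-0.5 * sq_dist / length_scale**2)


def gptd_posterior(states, rewards, query, gamma=0.9, sigma=0.1,
                   length_scale=1.0):
    """Batch posterior mean and covariance of V at the query states.

    states:  (T, d) visited states x_0 .. x_{T-1}
    rewards: (T-1,) rewards observed on the transitions x_t -> x_{t+1}
    query:   (M, d) states at which to evaluate the value posterior

    Assumed model (hypothetical here, not taken from the abstract):
        r = H v + noise,  noise ~ N(0, sigma^2 H H^T),  v ~ GP(0, k)
    """
    T = states.shape[0]
    assert rewards.shape == (T - 1,)

    # H encodes the temporal-difference structure: row t is e_t - gamma * e_{t+1}.
    H = np.zeros((T - 1, T))
    idx = np.arange(T - 1)
    H[idx, idx] = 1.0
    H[idx, idx + 1] = -gamma

    K = rbf_kernel(states, states, length_scale)   # (T, T) prior covariance
    k_q = rbf_kernel(states, query, length_scale)  # (T, M) cross-covariance

    # Covariance of the observed rewards under the assumed model.
    G = H @ K @ H.T + sigma**2 * (H @ H.T)

    # Standard Gaussian conditioning.
    alpha = H.T @ np.linalg.solve(G, rewards)      # (T,)
    C = H.T @ np.linalg.solve(G, H)                # (T, T)

    mean = k_q.T @ alpha
    cov = rbf_kernel(query, query, length_scale) - k_q.T @ C @ k_q
    return mean, cov


if __name__ == "__main__":
    # Toy one-dimensional trajectory with a constant reward of 1.
    states = np.arange(6, dtype=float).reshape(-1, 1)
    rewards = np.ones(5)
    query = np.array([[0.5], [2.5]])
    mean, cov = gptd_posterior(states, rewards, query)
    print("posterior mean:", mean)
    print("posterior variance:", np.diag(cov))
```

The batch form inverts a (T-1) x (T-1) matrix, so its cost grows cubically with trajectory length; this is one reason an on-line, recursive update of the posterior moments is attractive. Under the same assumptions, replacing states with state-action pairs and using a kernel over those pairs would give a posterior over action values, which is the kind of quantity a SARSA-style extension needs for action selection.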