Kernel-Based Reinforcement Learning

Authors:
Dirk Ormoneit;Śaunak Sen
Affiliations:
Department of Computer Science, Stanford University, Stanford, CA 94305-9010, USA. ormoneit@cs.stanford.edu;The Jackson Laboratory, Bar Harbor, ME 04609, USA.
Venue:
Machine Learning
Year:
2002

Abstract

We present a kernel-based approach to reinforcement learning that overcomes the stability problems of temporal-difference learning in continuous state-spaces. First, our algorithm converges to a unique solution of an approximate Bellman equation regardless of its initial values. Second, the method is consistent in the sense that the resulting policy converges asymptotically to the optimal policy. Parametric value function estimates such as neural networks do not possess this property. Our kernel-based approach also allows us to show that the limiting distribution of the value function estimate is a Gaussian process. This information is useful in studying the bias-variance tradeoff in reinforcement learning. We find that all reinforcement learning approaches to estimating the value function, parametric or non-parametric, are subject to a bias. This bias is typically larger in reinforcement learning than in a comparable regression problem.
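The convergence claim in the abstract can be illustrated with a small sketch. It is not the authors' implementation. It assumes a discounted-reward setting, a Gaussian kernel with a fixed bandwidth, and a batch of sampled transitions per action. The names `kernel_weights`, `kernel_value_iteration`, `bandwidth`, and `q_init` are hypothetical. Because the normalized kernel weights in each row are non-negative and sum to one, the approximate Bellman backup is a max-norm contraction with factor `gamma`, so the iteration reaches the same fixed point from any starting values.

```python
import numpy as np


def kernel_weights(queries, centers, bandwidth):
    """Row-normalized Gaussian kernel weights (each row sums to 1)."""
    d2 = ((queries[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Shifting by the row minimum leaves the normalized weights unchanged
    # but avoids rows that underflow to all zeros.
    w = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * bandwidth ** 2))
    return w / w.sum(axis=1, keepdims=True)


def kernel_value_iteration(transitions, gamma=0.9, bandwidth=0.1,
                           tol=1e-8, max_iter=10_000, q_init=None):
    """Iterate a kernel-averaged Bellman backup on sampled transitions.

    transitions: dict mapping each action to (states, rewards, next_states)
    with shapes (m, d), (m,), (m, d). Returns Q evaluated at all sampled
    successor states, shape (total_samples, num_actions).
    """
    actions = sorted(transitions)
    # Q is only ever needed at the sampled successor states.
    succ = np.vstack([transitions[a][2] for a in actions])
    weights = {a: kernel_weights(succ, transitions[a][0], bandwidth)
               for a in actions}
    # offsets[k]:offsets[k + 1] selects the successors sampled under action k.
    offsets = np.cumsum([0] + [len(transitions[a][1]) for a in actions])

    q = np.zeros((len(succ), len(actions))) if q_init is None else q_init.copy()
    for _ in range(max_iter):
        v = q.max(axis=1)  # greedy value at every successor state
        q_new = np.column_stack([
            weights[a] @ (transitions[a][1]
                          + gamma * v[offsets[k]:offsets[k + 1]])
            for k, a in enumerate(actions)
        ])
        if np.max(np.abs(q_new - q)) < tol:
            return q_new
        q = q_new
    return q


# Usage on synthetic data: two actions, 1-D states in [0, 1).
rng = np.random.default_rng(0)
batch = {a: (rng.random((40, 1)), rng.random(40), rng.random((40, 1)))
         for a in (0, 1)}
q_from_zero = kernel_value_iteration(batch)
q_from_noise = kernel_value_iteration(
    batch, q_init=100.0 * rng.standard_normal((80, 2)))
print("max gap between the two solutions:",
      np.max(np.abs(q_from_zero - q_from_noise)))
```

A parametric approximator trained by temporal-difference updates has no such averaging structure in general, which is the contrast the abstract draws. The sketch does not address the consistency or bias results, which depend on how the bandwidth shrinks as the sample grows.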