Kernel-Based Reinforcement Learning
Machine Learning
We present a kernel-based approach to reinforcement learning that overcomes the stability problems of temporal-difference learning in continuous state spaces. First, our algorithm converges to a unique solution of an approximate Bellman equation regardless of its initial values. Second, the method is consistent in the sense that the resulting policy converges asymptotically to the optimal policy. Parametric value function estimates such as neural networks do not possess this property. Our kernel-based approach also allows us to show that the limiting distribution of the value function estimate is a Gaussian process. This information is useful in studying the bias-variance tradeoff in reinforcement learning. We find that all reinforcement learning approaches to estimating the value function, parametric or non-parametric, are subject to a bias. This bias is typically larger in reinforcement learning than in a comparable regression problem.
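The convergence claim can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a discounted setting, a normalized Gaussian kernel, and a toy one-dimensional problem, and the names `kernel_weights`, `kernel_value_iteration`, and `make_samples` are hypothetical. The idea is that the approximate Bellman operator averages sampled one-step returns with nonnegative weights that sum to one, so it is a contraction and value iteration should reach the same fixed point from any starting values.

```python
import numpy as np

def kernel_weights(queries, centers, bandwidth):
    """Normalized Gaussian kernel weights (an assumed kernel choice).

    Each row is nonnegative and sums to one, which makes the resulting
    operator an averager.
    """
    d2 = ((queries[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    # Shift by the row minimum before exponentiating for numerical stability.
    w = np.exp(-(d2 - d2.min(axis=1, keepdims=True)) / (2.0 * bandwidth ** 2))
    return w / w.sum(axis=1, keepdims=True)

def kernel_value_iteration(samples, gamma=0.9, bandwidth=0.2, v0=0.0,
                           tol=1e-10, max_iter=10_000):
    """Iterate the kernel-averaged Bellman operator on the sampled successors.

    samples maps each action to (x, r, y): start states of shape (n, d),
    rewards of shape (n,), and successor states of shape (n, d).
    Returns the value estimate at every successor state.
    """
    actions = sorted(samples)
    all_next = np.vstack([samples[a][2] for a in actions])
    # Weights from every successor state to each action's start states.
    W = {a: kernel_weights(all_next, samples[a][0], bandwidth) for a in actions}
    # Position of each action's successors inside all_next.
    offsets = np.cumsum([0] + [len(samples[a][2]) for a in actions])
    v = np.full(len(all_next), float(v0))
    for _ in range(max_iter):
        q = np.column_stack([
            W[a] @ (samples[a][1] + gamma * v[offsets[i]:offsets[i + 1]])
            for i, a in enumerate(actions)
        ])
        v_new = q.max(axis=1)  # greedy backup over actions
        if np.max(np.abs(v_new - v)) < tol:
            return v_new
        v = v_new
    return v

# Toy data (hypothetical): states in [0, 1], two actions that drift left or
# right, and a reward that penalizes distance from 0.5.
rng = np.random.default_rng(0)

def make_samples(drift, n=50):
    x = rng.uniform(0.0, 1.0, size=(n, 1))
    y = np.clip(x + drift + 0.05 * rng.standard_normal((n, 1)), 0.0, 1.0)
    r = -np.abs(y[:, 0] - 0.5)
    return x, r, y

samples = {0: make_samples(-0.1), 1: make_samples(+0.1)}
v_a = kernel_value_iteration(samples, v0=0.0)
v_b = kernel_value_iteration(samples, v0=100.0)
print("max difference between the two runs:", np.max(np.abs(v_a - v_b)))
```

Running the iteration from two very different starting values and comparing the results gives a quick check of the initialization-independence property in this toy setting. It does not demonstrate the consistency or Gaussian-process results, which concern the limit of growing sample size.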