We introduce the first online kernelized version of SARSA(λ) that permits sparsification for arbitrary λ ∈ [0, 1]; this is made possible by a novel kernelization of the eligibility trace, which is maintained separately from the kernelized value function. This separation is crucial for preserving the functional structure of the eligibility trace under the sparse kernel projection techniques that are essential for memory efficiency and capacity control. The result is a simple and practical Kernel-SARSA(λ) algorithm for general λ ∈ [0, 1] that is memory-efficient in comparison to standard SARSA(λ) (with various basis functions) across a range of domains, including a real robotics task running on a Willow Garage PR2 robot.
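The separation described above can be sketched in code: the value function is a weighted sum of kernels over a growing dictionary of state-action points, and the eligibility trace is a second coefficient vector over that same dictionary, decayed and updated independently of the value weights. The sketch below is illustrative only; the class name, the RBF kernel, the novelty threshold `nu` (a cheap stand-in for the paper's sparse projection machinery), and all default hyperparameters are assumptions, not the authors' implementation.

```python
import numpy as np

def rbf(x, y, kgamma=1.0):
    # Gaussian (RBF) kernel on state-action feature vectors (an assumed choice)
    return float(np.exp(-kgamma * np.sum((x - y) ** 2)))

class KernelSarsaLambda:
    """Minimal sketch of an online kernelized SARSA(lambda).

    Q(s, a) = sum_i w[i] * k((s, a), d[i]) over a dictionary of points d[i].
    The eligibility trace e[i] is a separate coefficient vector over the
    same dictionary, mirroring the separation described in the abstract.
    """

    def __init__(self, alpha=0.1, gamma_d=0.99, lam=0.9, nu=0.1, kgamma=1.0):
        self.alpha, self.gamma_d, self.lam = alpha, gamma_d, lam
        self.nu = nu            # novelty threshold for growing the dictionary
        self.kgamma = kgamma
        self.dict_pts = []      # dictionary of state-action feature vectors
        self.w = np.zeros(0)    # value-function coefficients
        self.e = np.zeros(0)    # eligibility-trace coefficients (kept separate)

    def _k_vec(self, x):
        return np.array([rbf(x, d, self.kgamma) for d in self.dict_pts])

    def q(self, x):
        # Kernel expansion of the value function over the dictionary
        return float(self.w @ self._k_vec(x)) if self.dict_pts else 0.0

    def _maybe_add(self, x):
        # Simple novelty test: add x only if no existing dictionary point is
        # close (a crude surrogate for approximate-linear-dependence tests).
        if len(self.dict_pts) == 0 or self._k_vec(x).max() < 1.0 - self.nu:
            self.dict_pts.append(np.asarray(x, dtype=float))
            self.w = np.append(self.w, 0.0)
            self.e = np.append(self.e, 0.0)
        return self._k_vec(x)

    def update(self, x, r, x_next, done=False):
        kv = self._maybe_add(x)
        q_sa = float(self.w @ kv)
        q_next = 0.0 if done else self.q(x_next)
        delta = r + self.gamma_d * q_next - q_sa   # TD error
        # Decay the trace coefficients, then bump the coefficient of the
        # dictionary point nearest the current state-action pair.
        self.e *= self.gamma_d * self.lam
        self.e[int(np.argmax(kv))] += 1.0
        # Trace-weighted update of the value coefficients only
        self.w += self.alpha * delta * self.e
        return delta
```

Because the trace lives in its own coefficient vector, a sparse projection of the value weights `w` need not disturb the trace structure in `e`, which is the point the abstract emphasizes.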