TD-Gammon, a self-teaching backgammon program, achieves master-level play
Neural Computation
The nature of statistical learning theory
Elevator Group Control Using Multiple Reinforcement Learning Agents
Machine Learning
Markov Decision Processes: Discrete Stochastic Dynamic Programming
Introduction to Reinforcement Learning
Neuro-Dynamic Programming
Technical Update: Least-Squares Temporal Difference Learning
Machine Learning
Least-squares policy iteration
The Journal of Machine Learning Research
Efficient reinforcement learning using recursive least-squares methods
Journal of Artificial Intelligence Research
Reinforcement learning: a survey
Journal of Artificial Intelligence Research
Infinite-horizon policy-gradient estimation
Journal of Artificial Intelligence Research
A reinforcement learning approach to job-shop scheduling
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
The kernel recursive least-squares algorithm
IEEE Transactions on Signal Processing
Learning to trade via direct reinforcement
IEEE Transactions on Neural Networks
Kernel-Based Least Squares Policy Iteration for Reinforcement Learning
IEEE Transactions on Neural Networks
Approximate policy iteration (API) has recently received increasing attention due to its good convergence and generalization abilities in solving difficult reinforcement learning (RL) problems, e.g., least-squares policy iteration (LSPI) and its kernelized version (KLSPI). However, the sparsification of feature vectors, especially kernel-based features, is computationally expensive and strongly influences the performance of API methods. In this paper, a novel rapid sparsification method is proposed for sparsifying kernel machines in API. In this method, the approximation error of a new feature vector is computed beforehand in the original space to decide whether it is added to the current kernel dictionary; as a result, the computational cost is slightly higher when the collected samples are sparse, but remarkably lower when the collected samples are dense. Experimental results on the swing-up control of a double-link pendulum verify that the computational cost of the proposed algorithm is lower than that of the previous kernel-based API algorithm, and this advantage becomes more pronounced as the number of collected samples and the level of sparsification increase.
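The dictionary-based sparsification the abstract describes can be illustrated with an approximate-linear-dependence (ALD) style test: a new sample joins the kernel dictionary only if its feature vector cannot be well approximated by the span of the current dictionary. The following is a minimal Python sketch under assumed choices (a Gaussian kernel, a threshold `nu`, and the function names, none of which come from the paper itself):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """RBF kernel; sigma is an assumed bandwidth parameter."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def build_dictionary(samples, nu=0.1, sigma=1.0):
    """Greedily build a sparse kernel dictionary via an ALD-style test.

    A new sample x is admitted only if its approximation error delta
    (squared feature-space distance from x to the span of the current
    dictionary) exceeds the sparsification threshold nu.
    """
    dictionary = [np.asarray(samples[0])]
    for x in samples[1:]:
        x = np.asarray(x)
        # Kernel matrix of the current dictionary and its cross-kernel with x.
        k_dd = np.array([[gaussian_kernel(u, v, sigma) for v in dictionary]
                         for u in dictionary])
        k_dx = np.array([gaussian_kernel(u, x, sigma) for u in dictionary])
        # Least-squares coefficients of x's feature vector on the dictionary
        # (small jitter keeps the solve numerically stable).
        a = np.linalg.solve(k_dd + 1e-9 * np.eye(len(dictionary)), k_dx)
        delta = gaussian_kernel(x, x, sigma) - k_dx @ a
        if delta > nu:
            dictionary.append(x)
    return dictionary
```

With redundant samples the dictionary stays small, which is the source of the computational savings the abstract claims; the paper's contribution is to make this admission decision cheaply in the original input space rather than via the full kernel computation.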