TD-Gammon, a self-teaching backgammon program, achieves master-level play
Neural Computation
The nature of statistical learning theory
Elevator Group Control Using Multiple Reinforcement Learning Agents
Machine Learning
Markov Decision Processes: Discrete Stochastic Dynamic Programming
Introduction to Reinforcement Learning
Neuro-Dynamic Programming
Technical Update: Least-Squares Temporal Difference Learning
Machine Learning
Least-squares policy iteration
The Journal of Machine Learning Research
Efficient reinforcement learning using recursive least-squares methods
Journal of Artificial Intelligence Research
Reinforcement learning: a survey
Journal of Artificial Intelligence Research
Infinite-horizon policy-gradient estimation
Journal of Artificial Intelligence Research
A reinforcement learning approach to job-shop scheduling
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
The kernel recursive least-squares algorithm
IEEE Transactions on Signal Processing
Learning to trade via direct reinforcement
IEEE Transactions on Neural Networks
Kernel-Based Least Squares Policy Iteration for Reinforcement Learning
IEEE Transactions on Neural Networks
Approximate policy iteration (API) has recently received increasing attention due to its good convergence and generalization abilities in solving difficult reinforcement learning (RL) problems, e.g., least-squares policy iteration (LSPI) and its kernelized version (KLSPI). However, the sparsification of feature vectors, especially kernel-based features, is computationally expensive and strongly influences the performance of API methods. In this paper, a novel rapid sparsification method is proposed for sparsifying kernel machines in API. In this method, the approximation error of a new feature vector is computed beforehand in the original space to decide whether it is added to the current kernel dictionary; as a result, the computational cost is slightly higher when the collected samples are sparse, but remarkably lower when the collected samples are dense. Experimental results on the swing-up control of a double-link pendulum verify that the computational cost of the proposed algorithm is lower than that of the previous kernel-based API algorithm, and this advantage becomes more pronounced as the number of collected samples and the level of sparsification increase.
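The dictionary-based sparsification the abstract describes can be illustrated with an approximate-linear-dependence (ALD) style test: a new sample joins the kernel dictionary only if its feature vector cannot be well approximated by the span of the current dictionary. The following is a minimal Python sketch under assumed choices (a Gaussian kernel, a threshold `nu`, and the function names, none of which come from the paper itself):

```python
import numpy as np

def gaussian_kernel(x, y, sigma=1.0):
    """RBF kernel; sigma is an assumed bandwidth parameter."""
    return np.exp(-np.sum((x - y) ** 2) / (2 * sigma ** 2))

def build_dictionary(samples, nu=0.1, sigma=1.0):
    """Greedily build a sparse kernel dictionary via an ALD-style test.

    A new sample x is admitted only if its approximation error delta
    (squared feature-space distance from x to the span of the current
    dictionary) exceeds the sparsification threshold nu.
    """
    dictionary = [np.asarray(samples[0])]
    for x in samples[1:]:
        x = np.asarray(x)
        # Kernel matrix of the current dictionary and its cross-kernel with x.
        k_dd = np.array([[gaussian_kernel(u, v, sigma) for v in dictionary]
                         for u in dictionary])
        k_dx = np.array([gaussian_kernel(u, x, sigma) for u in dictionary])
        # Least-squares coefficients of x's feature vector on the dictionary
        # (small jitter keeps the solve numerically stable).
        a = np.linalg.solve(k_dd + 1e-9 * np.eye(len(dictionary)), k_dx)
        delta = gaussian_kernel(x, x, sigma) - k_dx @ a
        if delta > nu:
            dictionary.append(x)
    return dictionary
```

With redundant samples the dictionary stays small, which is the source of the computational savings the abstract claims; the paper's contribution is to make this admission decision cheaply in the original input space rather than via the full kernel computation.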