The Kernel-based Least-Squares Policy Iteration (KLSPI) algorithm provides a general reinforcement-learning solution for large-scale Markov decision problems. In KLSPI, the Radial Basis Function (RBF) kernel is typically used to approximate the optimal value function with high precision. However, selecting a proper kernel width for the RBF kernel function is critical for KLSPI to succeed. In previous research, the kernel width was usually set manually or computed in advance from the sample distribution, which requires prior knowledge or model information. In this paper, an adaptive kernel-width selection method is proposed for the KLSPI algorithm. First, a sparsification procedure based on neighborhood analysis with the l2-ball of radius ε is adopted, which yields a reduced kernel dictionary without presetting the kernel width. Second, a gradient-descent method based on the Bellman Residual Error (BRE) is proposed to find a kernel width that minimizes the sum of the Bellman residual errors. Experimental results show that the proposed method helps KLSPI approximate the true value function more accurately and, in turn, obtain a better control policy.
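The two steps described in the abstract can be sketched as follows. This is only an illustrative outline, not the paper's exact formulation: the function names, the toy 1-D chain data, the LSTD-style refit of the weights at each candidate width, and the finite-difference gradient with backtracking are all assumptions made for the sake of a runnable example.

```python
import numpy as np

def sparsify(samples, eps):
    """Neighborhood-analysis sparsification: keep a sample as a new
    dictionary atom only if it lies outside the l2-ball of radius eps
    around every atom already in the dictionary (no kernel width needed)."""
    dictionary = []
    for s in samples:
        if all(np.linalg.norm(s - d) > eps for d in dictionary):
            dictionary.append(s)
    return np.array(dictionary)

def rbf_features(x, dictionary, width):
    """Gaussian RBF features phi_j(x) = exp(-||x - c_j||^2 / (2 width^2))."""
    d2 = ((x[:, None, :] - dictionary[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-d2 / (2.0 * width ** 2))

def bre_loss(width, states, rewards, next_states, dictionary, gamma):
    """Sum of squared Bellman residuals at a given kernel width, with the
    value-function weights refit (LSTD-style, small ridge term) at that width."""
    phi = rbf_features(states, dictionary, width)
    phi_next = rbf_features(next_states, dictionary, width)
    A = phi.T @ (phi - gamma * phi_next) + 1e-6 * np.eye(phi.shape[1])
    w = np.linalg.solve(A, phi.T @ rewards)
    residual = phi @ w - (rewards + gamma * (phi_next @ w))
    return float(residual @ residual)

def tune_width(width, loss_fn, steps=30, lr=1e-2, h=1e-4):
    """Gradient descent on the BRE objective, using a numerical gradient
    and simple backtracking so every accepted update is a descent step."""
    for _ in range(steps):
        grad = (loss_fn(width + h) - loss_fn(width - h)) / (2.0 * h)
        step = lr * grad
        # halve the step until it no longer increases the loss
        while abs(step) > 1e-12 and not (loss_fn(width - step) <= loss_fn(width)):
            step *= 0.5
        width -= step
    return width

# Demo on a toy 1-D chain (illustrative data): s' = 0.9 s, r(s) = -s^2
rng = np.random.default_rng(0)
states = rng.uniform(-1.0, 1.0, size=(200, 1))
next_states = 0.9 * states
rewards = -(states[:, 0] ** 2)

D = sparsify(states, eps=0.15)  # reduced kernel dictionary, no width preset
loss = lambda s: bre_loss(s, states, rewards, next_states, D, gamma=0.95)
width = tune_width(0.5, loss)
```

The key property mirrored here is that the dictionary is built purely from pairwise distances (so no kernel width is needed in step one), while step two treats the width as the only free hyperparameter and descends the Bellman residual objective.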