Geodesic Gaussian kernels for value function approximation

Authors:
Masashi Sugiyama;Hirotaka Hachiya;Christopher Towell;Sethu Vijayakumar
Affiliations:
Department of Computer Science, Tokyo Institute of Technology, Tokyo 152-8552, Japan and School of Informatics, University of Edinburgh, Edinburgh EH9 3JZ, UK; Department of Computer Science, Tokyo Institute of Technology, Tokyo 152-8552, Japan; School of Informatics, University of Edinburgh, Edinburgh EH9 3JZ, UK; School of Informatics, University of Edinburgh, Edinburgh EH9 3JZ, UK
Venue:
Autonomous Robots
Year:
2008

References (14)
Cited by (4)

Fibonacci heaps and their uses in improved network optimization algorithms

Journal of the ACM (JACM)
Ten lectures on wavelets
Regularization theory and neural networks architectures

Neural Computation
Self-organizing maps
Neural Networks for Pattern Recognition
Introduction to Reinforcement Learning
Statistical Learning for Humanoid Robots

Autonomous Robots
Eligibility Traces for Off-Policy Policy Evaluation

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Least-squares policy iteration

The Journal of Machine Learning Research
Computing the shortest path: A search meets graph theory

SODA '05 Proceedings of the sixteenth annual ACM-SIAM symposium on Discrete algorithms
Reinforcement learning with Gaussian processes

ICML '05 Proceedings of the 22nd international conference on Machine learning
Proto-value functions: developmental reinforcement learning

ICML '05 Proceedings of the 22nd international conference on Machine learning
Learning state-action basis functions for hierarchical MDPs

Proceedings of the 24th international conference on Machine learning
Adaptive importance sampling with automatic model selection in value function approximation

AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 3

Human Age Estimation by Metric Learning for Regression Problems

EMMCVPR '09 Proceedings of the 7th International Conference on Energy Minimization Methods in Computer Vision and Pattern Recognition
Learning distance metric for regression by semidefinite programming with application to human age estimation

MM '09 Proceedings of the 17th ACM international conference on Multimedia
Metric Learning for Regression Problems and Human Age Estimation

PCM '09 Proceedings of the 10th Pacific Rim Conference on Multimedia: Advances in Multimedia Information Processing
Improving Gaussian process value function approximation in policy gradient algorithms

ICANN'11 Proceedings of the 21st international conference on Artificial neural networks - Volume Part II

Abstract

The least-squares policy iteration approach works efficiently in value function approximation, given appropriate basis functions. Because of its smoothness, the Gaussian kernel is a popular and useful choice as a basis function. However, it does not allow for discontinuities, which typically arise in real-world reinforcement learning tasks. In this paper, we propose a new basis function based on geodesic Gaussian kernels, which exploit the non-linear manifold structure induced by the Markov decision processes. The usefulness of the proposed method is demonstrated in simulated robot arm control and Khepera robot navigation.
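
The contrast the abstract draws between ordinary and geodesic Gaussian kernels can be illustrated with a small sketch. The abstract does not give the kernel formula, so the code below assumes a kernel of the form exp(-d^2 / (2 sigma^2)), where d is the shortest-path distance on a graph over states; the grid world, the wall layout, and all function names are hypothetical choices made for illustration only, not the authors' implementation.

```python
import heapq
import math


def shortest_path_lengths(neighbors, source):
    """Dijkstra's algorithm: distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, math.inf):
            continue  # stale heap entry
        for v, w in neighbors[u]:
            nd = d + w
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def grid_graph(rows, cols, walls):
    """4-connected grid of (row, col) states; wall cells are left out."""
    neighbors = {}
    for r in range(rows):
        for c in range(cols):
            if (r, c) in walls:
                continue
            adj = []
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                n = (r + dr, c + dc)
                if 0 <= n[0] < rows and 0 <= n[1] < cols and n not in walls:
                    adj.append((n, 1.0))  # unit edge weight (assumption)
            neighbors[(r, c)] = adj
    return neighbors


def geodesic_gaussian(neighbors, center, sigma):
    """Assumed kernel form: exp(-d^2 / (2 sigma^2)), d = shortest-path distance."""
    dist = shortest_path_lengths(neighbors, center)
    return {s: math.exp(-d * d / (2.0 * sigma ** 2)) for s, d in dist.items()}


def ordinary_gaussian(states, center, sigma):
    """Same form, but with straight-line (Euclidean) distance on grid coordinates."""
    return {
        s: math.exp(
            -((s[0] - center[0]) ** 2 + (s[1] - center[1]) ** 2)
            / (2.0 * sigma ** 2)
        )
        for s in states
    }


# Hypothetical 3x5 grid with a partial wall in column 2 (open only at row 2).
walls = {(0, 2), (1, 2)}
graph = grid_graph(3, 5, walls)
center = (0, 1)
geo = geodesic_gaussian(graph, center, sigma=2.0)
euc = ordinary_gaussian(graph.keys(), center, sigma=2.0)

other_side = (0, 3)  # close in a straight line, far when walking around the wall
print("ordinary:", euc[other_side], "geodesic:", geo[other_side])
```

In this toy layout the state just across the wall receives a much smaller geodesic kernel value than ordinary kernel value, which is the kind of discontinuity across obstacles the abstract refers to. States on the same side of the wall as the kernel center are treated alike by both kernels when the shortest path is a straight line.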