Linear reinforcement learning (RL) algorithms such as least-squares temporal difference learning (LSTD) require basis functions that span approximation spaces of potential value functions. This article investigates methods to construct these bases from samples. We hypothesize that an ideal approximation space should encode diffusion distances and that slow feature analysis (SFA) constructs such spaces. To validate this hypothesis, we provide theoretical statements about the LSTD value-approximation error and the metrics induced by approximation spaces constructed by SFA and by the state-of-the-art methods of Krylov bases and proto-value functions (PVF). In particular, we prove that SFA minimizes the average (over all tasks in the same environment) bound on this approximation error. Compared to other methods, however, SFA is very sensitive to sampling and can sometimes fail to encode the whole state space. We derive a novel importance-sampling modification to compensate for this effect. Finally, the LSTD and least-squares policy iteration (LSPI) performance of approximation spaces constructed by Krylov bases, PVF, SFA, and PCA is compared on benchmark tasks and in a visual robot-navigation experiment (both in a realistic simulation and with a real robot). The results support our hypothesis and suggest that (i) SFA provides subspace-invariant features for MDPs with self-adjoint transition operators, which allows strong guarantees on the approximation error, (ii) the modified SFA algorithm is best suited for LSPI in both discrete and continuous state spaces, and (iii) approximation spaces that encode diffusion distances facilitate LSPI performance.
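To make the setting concrete, the following is a minimal sketch (not the article's implementation) of LSTD with linear features: given sampled transitions with features `phi` of the visited states and `phi_next` of their successors, LSTD solves the linear system A w = b with A = Φᵀ(Φ − γΦ′) and b = Φᵀr. The feature matrices are assumed to come from some basis-construction method such as SFA or PVF, and the small `ridge` term is an illustrative addition for numerical stability, not part of the standard algorithm.

```python
import numpy as np

def lstd(phi, phi_next, rewards, gamma=0.95, ridge=1e-6):
    """Least-squares temporal difference learning (LSTD) sketch.

    phi      : (n, k) basis features of sampled states s_t
    phi_next : (n, k) basis features of successor states s_{t+1}
    rewards  : (n,)   observed rewards
    Returns weights w such that V(s) is approximated by phi(s) @ w.
    """
    A = phi.T @ (phi - gamma * phi_next)
    b = phi.T @ rewards
    # Ridge term keeps A invertible when the features are rank-deficient
    # (an illustrative safeguard, not part of vanilla LSTD).
    return np.linalg.solve(A + ridge * np.eye(A.shape[1]), b)
```

With tabular (one-hot) features and every transition of a small deterministic MDP observed once, this recovers the exact value function of the empirical model, which makes the sketch easy to sanity-check.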