Automatic basis function construction for approximate dynamic programming and reinforcement learning

Authors:
Philipp W. Keller; Shie Mannor; Doina Precup
Affiliations:
McGill University, Montreal, QC, Canada; McGill University, Montreal, QC, Canada; McGill University, Montreal, QC, Canada
Venue:
ICML '06 Proceedings of the 23rd international conference on Machine learning
Year:
2006

Citing 7

Linear least-squares algorithms for temporal difference learning

Machine Learning - Special issue on reinforcement learning
On the Convergence of Temporal-Difference Learning with Linear Function Approximation

Machine Learning
Reinforcement Learning

Reinforcement Learning
Technical Update: Least-Squares Temporal Difference Learning

Machine Learning
Variable Resolution Discretization in Optimal Control

Machine Learning
Learning to Predict by the Methods of Temporal Differences

Machine Learning
Experiments with random projections for machine learning

Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining

Cited 32

Constructing basis functions from directed graphs for value function approximation

Proceedings of the 24th international conference on Machine learning
Learning state-action basis functions for hierarchical MDPs

Proceedings of the 24th international conference on Machine learning
Analyzing feature generation for value-function approximation

Proceedings of the 24th international conference on Machine learning
Reinforcement learning for a biped robot based on a CPG-actor-critic method

Neural Networks
Face recognition using classification-based linear projections

EURASIP Journal on Advances in Signal Processing
An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning

Proceedings of the 25th international conference on Machine learning
Robust Population Coding in Free-Energy-Based Reinforcement Learning

ICANN '08 Proceedings of the 18th international conference on Artificial Neural Networks, Part I
Basis Expansion in Natural Actor Critic Methods

Recent Advances in Reinforcement Learning
A Study of Reinforcement Learning in a New Multiagent Domain

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Projected equation methods for approximate solution of large linear systems

Journal of Computational and Applied Mathematics
Fuzzy CMAC with automatic state partition for reinforcement learning

Proceedings of the first ACM/SIGEVO Summit on Genetic and Evolutionary Computation
Regularization and feature selection in least-squares temporal difference learning

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Fuzzy Kanerva-based function approximation for reinforcement learning

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Feature Selection for Value Function Approximation Using Bayesian Model Selection

ECML PKDD '09 Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases: Part I
Compact character controllers

ACM SIGGRAPH Asia 2009 papers
Adaptive Fuzzy Function Approximation for Multi-agent Reinforcement Learning

WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Predictive projections

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
ReTrASE: integrating paradigms for approximate probabilistic planning

IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence
An Additive Reinforcement Learning

ICANN '09 Proceedings of the 19th International Conference on Artificial Neural Networks: Part I
Q-learning with linear function approximation

COLT'07 Proceedings of the 20th annual conference on Learning theory
Character animation in two-player adversarial games

ACM Transactions on Graphics (TOG)
Critical factors in the empirical performance of temporal difference and evolutionary methods for reinforcement learning

Autonomous Agents and Multi-Agent Systems
Basis function construction for hierarchical reinforcement learning

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Feature selection for reinforcement learning: evaluating implicit state-reward dependency via conditional mutual information

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Automatic induction of bellman-error features for probabilistic planning

Journal of Artificial Intelligence Research
Basis function discovery using spectral clustering and bisimulation metrics

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Basis function discovery using spectral clustering and bisimulation metrics

ALA'11 Proceedings of the 11th international conference on Adaptive and Learning Agents
Automatic state abstraction from demonstration

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Two
Discovering hidden structure in factored MDPs

Artificial Intelligence
Automatic task decomposition and state abstraction from demonstration

Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Linear fitted-Q iteration with multiple reward functions

The Journal of Machine Learning Research
Kernel regression with sparse metric learning

Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology

Abstract

We address the problem of automatically constructing basis functions for linear approximation of the value function of a Markov Decision Process (MDP). Our work builds on results by Bertsekas and Castañon (1989), who proposed a method for automatically aggregating states to speed up value iteration. We propose to use neighborhood components analysis (NCA; Goldberger et al., 2005), a dimensionality reduction technique created for supervised learning, to map a high-dimensional state space to a low-dimensional space based on the Bellman error or on the temporal difference (TD) error. We then place basis functions in the lower-dimensional space and add them as new features for the linear function approximator. We apply this approach to a high-dimensional inventory control problem.
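
The feature-construction step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that a linear projection to the low-dimensional space is already available (the matrix `A` below is a hand-picked placeholder, not one learned from Bellman or TD errors), and that the new basis functions are simple indicator functions on a uniform grid in the projected space. All function names, the grid layout, and the example states are hypothetical.

```python
# Sketch only: A is a hand-picked placeholder for a learned projection,
# and the grid-cell indicators are one assumed choice of basis function.

def bellman_errors(values, rewards, next_values, gamma):
    """Sampled TD errors r + gamma * V(s') - V(s), one per transition.
    In the approach described above, such errors would supervise the
    learning of the projection; here they are shown for reference only."""
    return [r + gamma * vn - v
            for v, r, vn in zip(values, rewards, next_values)]

def project(A, state):
    """Map a high-dimensional state to the low-dimensional space (A @ state)."""
    return [sum(a * s for a, s in zip(row, state)) for row in A]

def cell_index(z, low, high, bins):
    """Index of the uniform grid cell that contains the projected point z."""
    index = 0
    for value, lo, hi in zip(z, low, high):
        frac = (value - lo) / (hi - lo)
        k = min(bins - 1, max(0, int(frac * bins)))  # clamp to the grid
        index = index * bins + k
    return index

def aggregate_features(A, states, low, high, bins):
    """One indicator basis function per grid cell in the projected space."""
    n_cells = bins ** len(A)
    features = []
    for s in states:
        phi = [0.0] * n_cells
        phi[cell_index(project(A, s), low, high, bins)] = 1.0
        features.append(phi)
    return features

# Four 4-dimensional states projected onto a single dimension.
states = [[0, 0, 0, 0], [1, 0, 0, 0], [0, 1, 1, 0], [1, 1, 1, 1]]
A = [[0.25, 0.25, 0.25, 0.25]]
new_features = aggregate_features(A, states, low=[0.0], high=[1.0], bins=2)

# Append the new basis functions to an existing feature set (a bias term).
augmented = [[1.0] + phi for phi in new_features]
```

In this toy setup the first two states fall into one cell and the last two into the other, so the linear approximator gains two aggregate features on top of its bias term. A learned projection would instead aim to group states with similar Bellman or TD error.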