An analysis of linear models, linear value-function approximation, and feature selection for reinforcement learning

Authors:
Ronald Parr;Lihong Li;Gavin Taylor;Christopher Painter-Wakefield;Michael L. Littman
Affiliations:
Duke University, Durham, NC;Rutgers University, Piscataway, NJ;Duke University, Durham, NC;Duke University, Durham, NC;Rutgers University, Piscataway, NJ
Venue:
Proceedings of the 25th international conference on Machine learning
Year:
2008

Citing 11
Cited 15

Linear least-squares algorithms for temporal difference learning

Machine Learning - Special issue on reinforcement learning
Reinforcement Learning

Reinforcement Learning
Learning to Predict by the Methods of Temporal Differences

Machine Learning
Least-Squares Temporal Difference Learning

ICML '99 Proceedings of the Sixteenth International Conference on Machine Learning
Computing Factored Value Functions for Policies in Structured MDPs

IJCAI '99 Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence
Least-squares policy iteration

The Journal of Machine Learning Research
Automatic basis function construction for approximate dynamic programming and reinforcement learning

ICML '06 Proceedings of the 23rd international conference on Machine learning
Analyzing feature generation for value-function approximation

Proceedings of the 24th international conference on Machine learning
Proto-value Functions: A Laplacian Framework for Learning Representation and Control in Markov Decision Processes

The Journal of Machine Learning Research
An analysis of Laplacian methods for value function approximation in MDPs

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Model minimization in Markov decision processes

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence

Kernelized value function approximation for reinforcement learning

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Learning Representation and Control in Markov Decision Processes: New Frontiers

Foundations and Trends® in Machine Learning
Predictive projections

IJCAI'09 Proceedings of the 21st international jont conference on Artifical intelligence
Multi-task evolutionary shaping without pre-specified representations

Proceedings of the 12th annual conference on Genetic and evolutionary computation
Linear options

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Approximate dynamic programming with affine ADDs

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Feature selection for reinforcement learning: evaluating implicit state-reward dependency via conditional mutual information

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part I
Automatic induction of bellman-error features for probabilistic planning

Journal of Artificial Intelligence Research
ℓ1-Penalized projected bellman residual

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Multi-Task reinforcement learning: shaping and feature selection

EWRL'11 Proceedings of the 9th European conference on Recent Advances in Reinforcement Learning
Policy iteration based on a learned transition model

ECML PKDD'12 Proceedings of the 2012 European conference on Machine Learning and Knowledge Discovery in Databases - Volume Part II
A reinforcement learning approach to autonomous decision-making in smart electricity markets

Machine Learning
Construction of approximation spaces for reinforcement learning

The Journal of Machine Learning Research
Learning potential functions and their representations for multi-task reinforcement learning

Autonomous Agents and Multi-Agent Systems
Policy oscillation is overshooting

Neural Networks

Quantified Score

Hi-index	0.00

Visualization

Abstract

We show that linear value-function approximation is equivalent to a form of linear model approximation. We then derive a relationship between the model-approximation error and the Bellman error, and show how this relationship can guide feature selection for model improvement and/or value-function improvement. We also show how these results give insight into the behavior of existing feature-selection algorithms.