Samuel meets Amarel: automating value function approximation using global state space analysis

Authors:
Sridhar Mahadevan
Affiliations:
Department of Computer Science, University of Massachusetts, Amherst, MA
Venue:
AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 2
Year:
2005

Citing 15
Cited 13

Normalized Cuts and Image Segmentation

IEEE Transactions on Pattern Analysis and Machine Intelligence
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning
Neuro-Dynamic Programming

Neuro-Dynamic Programming
Many-layered learning

Neural Computation
Policy Iteration for Factored MDPs

UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
Greedy linear value-approximation for factored Markov decision processes

Eighteenth national conference on Artificial intelligence
Fast Monte-Carlo Algorithms for finding low-rank approximations

FOCS '98 Proceedings of the 39th Annual Symposium on Foundations of Computer Science
A theory of justified reformulations

A theory of justified reformulations
Autonomous discovery of temporal abstractions from interaction with an environment

Autonomous discovery of temporal abstractions from interaction with an environment
Spectral Grouping Using the Nyström Method

IEEE Transactions on Pattern Analysis and Machine Intelligence
Least-squares policy iteration

The Journal of Machine Learning Research
Semi-Supervised Learning on Riemannian Manifolds

Machine Learning
Dynamic abstraction in reinforcement learning via clustering

ICML '04 Proceedings of the twenty-first international conference on Machine learning
Efficient solution algorithms for factored MDPs

Journal of Artificial Intelligence Research
SMDP homomorphisms: an algebraic approach to abstraction in semi-Markov decision processes

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence

Evolutionary Function Approximation for Reinforcement Learning

The Journal of Machine Learning Research
Empirical Studies in Action Selection with Reinforcement Learning

Adaptive Behavior - Animals, Animats, Software Agents, Robots, Adaptive Systems
Model-based function approximation in reinforcement learning

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Robust Population Coding in Free-Energy-Based Reinforcement Learning

ICANN '08 Proceedings of the 18th international conference on Artificial Neural Networks, Part I
A Study of Reinforcement Learning in a New Multiagent Domain

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Practical solution techniques for first-order MDPs

Artificial Intelligence
Fuzzy CMAC with automatic state partition for reinforcementlearning

Proceedings of the first ACM/SIGEVO Summit on Genetic and Evolutionary Computation
Learning basis functions in hybrid domains

AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Solving factored MDPs with hybrid state and action variables

Journal of Artificial Intelligence Research
An analysis of Laplacian methods for value function approximation in MDPs

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Critical factors in the empirical performance of temporal difference and evolutionary methods for reinforcement learning

Autonomous Agents and Multi-Agent Systems
Using mathematical programming to solve Factored Markov Decision Processes with Imprecise Probabilities

International Journal of Approximate Reasoning
An online kernel-based clustering approach for value function approximation

SETN'12 Proceedings of the 7th Hellenic conference on Artificial Intelligence: theories and applications

Quantified Score

Hi-index	0.00

Visualization

Abstract

Most work on value function approximation adheres to Samuel's original design: agents learn a task-specific value function using parameter estimation, where the approximation architecture (e.g, polynomials) is specified by a human designer. This paper proposes a novel framework generalizing Samuel's paradigm using a coordinate-free approach to value function approximation. Agents learn both representations and value functions by constructing geometrically customized task-independent basis functions that form an orthonormal set for the Hilbert space of smooth functions on the underlying state space manifold. The approach rests on a technical result showing that the space of smooth functions on a (compact) Riemanian manifold has a discrete spectrum associated with the Laplace-Beltrami operator. In the discrete setting, spectral analysis of the graph Laplacian yields a set of geometrically customized basis functions for approximating and decomposing value functions. The proposed framework generalizes Samuel's value function approximation paradigm by combining it with a formalization of Saul Amarel's paradigm of representation learning through global state space analysis.