Basis function construction for hierarchical reinforcement learning
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
This paper introduces a new approach to action-value function approximation that learns basis functions from a spectral decomposition of the state-action manifold. It extends previous work on Laplacian bases for value function approximation by incorporating the agent's actions into the graph from which basis functions are constructed. The resulting nonlinear learned representation is particularly well suited to approximating action-value functions and avoids the wasteful duplication of state bases required by earlier methods. We discuss two techniques for creating state-action graphs: off-policy and on-policy. We show that these graphs have greater expressive power than state-based Laplacian basis functions and yield better performance in domains modeled as Semi-Markov Decision Processes (SMDPs). Finally, we present a simple graph partitioning method that scales the approach to large discrete MDPs.
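The core spectral construction can be sketched as follows. This is an illustrative reimplementation, not the authors' code: the `laplacian_basis` helper is hypothetical, and it assumes the state-action graph has already been built as a symmetric weight matrix `W` over state-action pairs. The basis functions are the smoothest eigenvectors of the normalized graph Laplacian.

```python
import numpy as np

def laplacian_basis(W, k):
    """Return the k smoothest Laplacian eigenvectors of a graph.

    W : symmetric (n, n) nonnegative weight matrix over state-action pairs.
    Returns an (n, k) matrix whose columns serve as basis functions
    for linear action-value function approximation.
    """
    degree = W.sum(axis=1)
    # Normalized graph Laplacian: L = I - D^{-1/2} W D^{-1/2}
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(degree, 1e-12))
    L = np.eye(len(degree)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order; the eigenvectors
    # with the smallest eigenvalues vary most smoothly over the graph.
    _, eigvecs = np.linalg.eigh(L)
    return eigvecs[:, :k]

# Toy state-action graph on 4 nodes (edges from observed transitions).
W = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
Phi = laplacian_basis(W, 2)  # (4, 2) feature matrix
```

Given `Phi`, the action-value function would be approximated linearly as `Q ≈ Phi @ w`, with the weight vector `w` fit by a method such as least-squares policy iteration.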