Learning state-action basis functions for hierarchical MDPs
Proceedings of the 24th international conference on Machine learning
Much past work on solving Markov decision processes (MDPs) using reinforcement learning (RL) has relied on combining parameter estimation methods with hand-designed function approximation architectures for representing value functions. Recently, there has been growing interest in a broader framework that combines representation discovery and control learning, where value functions are approximated using a linear combination of task-dependent basis functions learned during the course of solving a particular MDP. This paper introduces an approach to automatic basis function construction for hierarchical reinforcement learning (HRL). Our approach generalizes past work on basis construction to multi-level action hierarchies by forming a compressed representation of a semi-Markov decision process (SMDP) at multiple levels of temporal abstraction. The specific approach is based on hierarchical spectral analysis of graphs induced on an SMDP's state space from sample trajectories. We present experimental results on benchmark SMDPs, showing significant speedups when compared to hand-designed approximation architectures.
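The core of the spectral analysis described above is building a graph over sampled states and using the smoothest eigenvectors of its Laplacian as basis functions for value-function approximation. The following is a minimal sketch of that single-level step, assuming NumPy; the function name, the toy chain MDP, and the number of basis functions are illustrative, and the paper's hierarchical method would repeat a construction like this at each level of temporal abstraction.

```python
import numpy as np

def laplacian_basis(adjacency, k):
    """Return the k smoothest eigenvectors of the combinatorial graph
    Laplacian L = D - A, used as basis functions (columns) for
    linear value-function approximation. Illustrative sketch only."""
    degree = np.diag(adjacency.sum(axis=1))
    laplacian = degree - adjacency
    # eigh sorts eigenvalues in ascending order, so the first k columns
    # are the smoothest functions on the graph (lowest eigenvalues).
    _, eigvecs = np.linalg.eigh(laplacian)
    return eigvecs[:, :k]

# Toy example: state graph of a 5-state chain, as might be induced
# from sample trajectories (hypothetical data, not from the paper).
A = np.zeros((5, 5))
for i in range(4):
    A[i, i + 1] = A[i + 1, i] = 1.0

phi = laplacian_basis(A, k=3)  # 5 states x 3 basis functions
# A value function is then approximated linearly as V ≈ phi @ w,
# where the weight vector w is estimated by an RL algorithm.
```

The first column of `phi` is the constant eigenvector (eigenvalue 0), and later columns capture progressively less smooth structure on the state graph, which is what makes them useful as a compressed, task-dependent representation.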