We present HI-MAT (Hierarchy Induction via Models And Trajectories), an algorithm that discovers MAXQ task hierarchies by applying dynamic Bayesian network models to a successful trajectory from a source reinforcement learning task. HI-MAT discovers subtasks by analyzing the causal and temporal relationships among the actions in the trajectory. Under appropriate assumptions, HI-MAT induces hierarchies that are consistent with the observed trajectory and have compact value-function tables that employ safe state abstractions. We demonstrate empirically that HI-MAT constructs compact hierarchies that are comparable to manually engineered hierarchies and enable a significant speedup in learning when transferred to a target task.
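The causal analysis described above can be sketched in code. The toy domain, action names, and the `relevant_actions` helper below are all hypothetical; real HI-MAT works with learned dynamic Bayesian network models and a causally annotated trajectory, whereas here each action is summarized by the state variables it reads and writes, and a backward scan prunes actions that are causally irrelevant to the goal variable.

```python
# Hypothetical key-and-door domain: each action's model lists the state
# variables it reads (preconditions) and writes (effects), standing in
# for the DBN models the abstract refers to.
ACTION_MODEL = {
    "goto_key":   {"reads": {"agent_pos"}, "writes": {"agent_pos"}},
    "pickup_key": {"reads": {"agent_pos"}, "writes": {"have_key"}},
    "wave":       {"reads": set(),         "writes": {"arm_pos"}},
    "goto_door":  {"reads": {"agent_pos"}, "writes": {"agent_pos"}},
    "open_door":  {"reads": {"have_key", "agent_pos"},
                   "writes": {"door_open"}},
}

def relevant_actions(trajectory, goal_var):
    """Backward causal scan over a successful trajectory.

    Keep an action only if it writes a variable needed by a later kept
    action (or the goal variable itself); after keeping it, its read
    variables become needed in turn. Actions whose effects never feed
    into the goal are pruned as causally irrelevant.
    """
    needed = {goal_var}
    kept = []
    for action in reversed(trajectory):
        model = ACTION_MODEL[action]
        if model["writes"] & needed:
            kept.append(action)
            # Variables this action establishes are no longer needed
            # earlier; its preconditions are.
            needed = (needed - model["writes"]) | model["reads"]
    return list(reversed(kept))

trajectory = ["goto_key", "pickup_key", "wave", "goto_door", "open_door"]
print(relevant_actions(trajectory, "door_open"))
# The irrelevant "wave" step is pruned; the remaining causally linked
# runs of actions are the kind of segments from which subtasks are read off.
```

The pruned, causally linked action sequence is the raw material for hierarchy induction: contiguous runs that serve a common precondition of a later action become candidate subtasks, and the variables in each run's `reads`/`writes` sets suggest its state abstraction.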