DetH: approximate hierarchical solution of large Markov decision processes

Authors:
Jennifer L. Barry;Leslie Pack Kaelbling;Tomás Lozano-Pérez
Affiliations:
MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA;MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA;MIT Computer Science and Artificial Intelligence Laboratory, Cambridge, MA
Venue:
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Year:
2011

References (8):

Neuro-Dynamic Programming

The MAXQ Method for Hierarchical Reinforcement Learning
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning

Automatic discovery and transfer of MAXQ hierarchies
Proceedings of the 25th international conference on Machine learning

Topological value iteration algorithm for Markov decision processes
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence

Affine algebraic decision diagrams (AADDs) and their application to structured probabilistic inference
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence

Solving POMDPs: RTDP-bel vs. point-based algorithms
IJCAI'09 Proceedings of the 21st international joint conference on Artificial intelligence

Approximate dynamic programming with affine ADDs
Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1

SPUDD: stochastic planning using decision diagrams
UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence

Cited by (3):

Online planning for large MDPs with MAXQ decomposition
Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 3

Learning high-level planning from text
ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers - Volume 1

Integrated task and motion planning in belief space
International Journal of Robotics Research

Abstract

This paper presents an algorithm for finding approximately optimal policies in very large Markov decision processes by constructing a hierarchical model and then solving it approximately. It exploits factored representations to achieve compactness and efficiency and to discover connectivity properties of the domain. We provide a bound on the quality of the solutions and give an asymptotic analysis of the runtimes; in addition, we demonstrate performance on a collection of very large domains. Results show that the quality of the resulting policies is very good and that the total running times, for both creating and solving the hierarchy, are significantly less than for an optimal factored MDP solver.