Proceedings of the Seventh International Conference on Machine Learning (1990).
Technical Note: Q-Learning. Machine Learning.
Introduction to Reinforcement Learning.
Near-Optimal Reinforcement Learning in Polynomial Time. Machine Learning.
Discovering Hierarchy in Reinforcement Learning with HEXQ. Proceedings of the Nineteenth International Conference on Machine Learning (ICML '02).
Algorithm-Directed Exploration for Model-Based Reinforcement Learning in Factored MDPs. Proceedings of the Nineteenth International Conference on Machine Learning (ICML '02).
Efficient Reinforcement Learning in Factored MDPs. Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence (IJCAI '99).
Policy Iteration for Factored MDPs. Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence (UAI '00).
R-max - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning. The Journal of Machine Learning Research.
Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition. Journal of Artificial Intelligence Research.
Hierarchical Model-Based Reinforcement Learning: R-max + MAXQ. Proceedings of the Twenty-Fifth International Conference on Machine Learning.
Automatic Discovery and Transfer of MAXQ Hierarchies. Proceedings of the Twenty-Fifth International Conference on Machine Learning.
Factored representations, model-based learning, and hierarchy are well-studied techniques for improving the learning efficiency of reinforcement-learning algorithms in large state spaces. We bring these three ideas together in a new algorithm that addresses two open problems from the reinforcement-learning literature, solving both in deterministic domains. First, we show how learned models can improve learning speed in the hierarchical MAXQ framework without disrupting opportunities for state abstraction. Second, we show how hierarchies can augment existing factored exploration algorithms to achieve not only low sample complexity but also provably efficient planning. We illustrate the resulting performance gains in example domains and prove polynomial bounds on the computational effort needed to attain near-optimal performance within the hierarchy.
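To make the model-based ingredient of the abstract concrete, the following is a minimal sketch, assuming a deterministic flat MDP, of R-max-style optimistic exploration: one observation of a state-action pair fully determines its transition and reward, and every unvisited pair is assigned the optimistic value R_max/(1 - gamma). This is an illustration only, not the paper's combined hierarchical, factored algorithm; the class name `RMaxDeterministic`, the constants, and the environment interface are hypothetical.

```python
from collections import defaultdict

GAMMA = 0.95                    # discount factor (assumed)
R_MAX = 1.0                     # assumed bound on one-step reward
V_OPT = R_MAX / (1.0 - GAMMA)   # optimistic value for unknown pairs

class RMaxDeterministic:
    """R-max-style learner for a deterministic MDP: one visit to a
    state-action pair fully determines its transition and reward."""

    def __init__(self, states, actions):
        self.states = list(states)
        self.actions = list(actions)
        self.next_state = {}    # (s, a) -> observed successor state
        self.reward = {}        # (s, a) -> observed reward

    def observe(self, s, a, r, s2):
        # In a deterministic domain, a single sample suffices to
        # learn this entry of the model exactly.
        self.next_state[(s, a)] = s2
        self.reward[(s, a)] = r

    def q_value(self, values, s, a):
        if (s, a) not in self.next_state:
            return V_OPT        # optimism in the face of uncertainty
        return self.reward[(s, a)] + GAMMA * values[self.next_state[(s, a)]]

    def plan(self, sweeps=200):
        # Value iteration over the learned, optimistically completed model.
        values = defaultdict(float)
        for _ in range(sweeps):
            for s in self.states:
                values[s] = max(self.q_value(values, s, a)
                                for a in self.actions)
        return values

    def greedy_action(self, s):
        # Acting greedily on the optimistic values steers the agent
        # toward state-action pairs it has not yet tried.
        values = self.plan()
        return max(self.actions, key=lambda a: self.q_value(values, s, a))

# Hypothetical usage inside an environment loop:
agent = RMaxDeterministic(states=range(5), actions=[0, 1])
action = agent.greedy_action(0)     # all pairs unknown, so it explores
agent.observe(0, action, 0.0, 1)    # record the deterministic outcome
```

The paper's contribution layers this kind of one-sample model learning under a MAXQ hierarchy and a factored state representation; the sketch omits both and shows only why determinism keeps the model-learning step simple.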