Decomposition techniques for planning in stochastic domains
IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
This paper is concerned with modeling planning problems involving uncertainty as discrete-time, finite-state stochastic automata. Solving planning problems is reduced to computing policies for Markov decision processes. Classical methods for solving Markov decision processes cannot cope with the size of the state spaces for typical problems encountered in practice. As an alternative, we investigate methods that decompose global planning problems into a number of local problems, solve the local problems separately, and then combine the local solutions to generate a global solution. We present algorithms that decompose planning problems into smaller problems given an arbitrary partition of the state space. The local problems are interpreted as Markov decision processes, and solutions to the local problems are interpreted as policies restricted to the subsets of the state space defined by the partition. One algorithm relies on constructing and solving an abstract version of the original decision problem. A second algorithm iteratively approximates parameters of the local problems to converge to an optimal solution. We show how properties of the specified partition affect the time and storage required for these algorithms.
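To make the decompose-solve-combine idea concrete, here is a minimal sketch in Python. It is not the paper's algorithm: it swaps in a simple block-wise (Gauss-Seidel style) value iteration, in which each region of the partition is solved as a local problem while the values of states outside the region are held fixed, and the sweep over regions repeats until the global value function stops changing. The chain-shaped MDP, the discount factor, the tolerance, and all function names (`make_chain_mdp`, `solve_local`, `decomposed_value_iteration`, and so on) are assumptions introduced purely for illustration.

```python
import numpy as np


def make_chain_mdp(n_states=12, p_success=0.9):
    """Hypothetical chain MDP: action 0 moves left, action 1 moves right.

    The last state is an absorbing goal with zero reward; every other
    step costs 1. A move succeeds with probability p_success, otherwise
    the agent stays where it is.
    """
    n_actions = 2
    P = np.zeros((n_actions, n_states, n_states))  # P[a, s, s']
    R = np.full((n_actions, n_states), -1.0)       # R[a, s]
    goal = n_states - 1
    for s in range(n_states):
        if s == goal:
            P[:, s, s] = 1.0
            R[:, s] = 0.0
            continue
        for a, step in enumerate((-1, 1)):
            target = min(max(s + step, 0), goal)
            P[a, s, target] += p_success
            P[a, s, s] += 1.0 - p_success
    return P, R


def bellman_backup(P, R, V, states, gamma):
    """Apply one Bellman optimality backup to the given states only."""
    Q = R[:, states] + gamma * (P[:, states, :] @ V)
    return Q.max(axis=0)


def global_value_iteration(P, R, gamma=0.95, tol=1e-10):
    """Classical value iteration over the whole state space (baseline)."""
    all_states = np.arange(P.shape[1])
    V = np.zeros(P.shape[1])
    while True:
        V_new = bellman_backup(P, R, V, all_states, gamma)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new


def solve_local(P, R, V, region, gamma, tol):
    """Solve the local problem on `region`; values outside it stay fixed."""
    V = V.copy()
    while True:
        new_vals = bellman_backup(P, R, V, region, gamma)
        delta = np.max(np.abs(new_vals - V[region]))
        V[region] = new_vals
        if delta < tol:
            return V


def decomposed_value_iteration(P, R, partition, gamma=0.95, tol=1e-10):
    """Sweep over the regions, re-solving each local problem in turn."""
    V = np.zeros(P.shape[1])
    while True:
        V_old = V.copy()
        for region in partition:
            V = solve_local(P, R, V, region, gamma, tol)
        if np.max(np.abs(V - V_old)) < tol:
            return V


def greedy_policy(P, R, V, gamma=0.95):
    """Extract the greedy policy (one action index per state) from V."""
    return (R + gamma * (P @ V)).argmax(axis=0)


P, R = make_chain_mdp()
partition = np.array_split(np.arange(P.shape[1]), 3)  # three contiguous regions
V_dec = decomposed_value_iteration(P, R, partition)
V_glob = global_value_iteration(P, R)
print("max difference:", np.max(np.abs(V_dec - V_glob)))
print("greedy policy:", greedy_policy(P, R, V_dec))
```

In this sketch the only "parameters" passed between local problems are the current value estimates of states just outside each region, so the choice of partition determines how much information has to cross region boundaries on every sweep. Restricting the greedy policy to one region gives a local policy in the sense used above.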