We investigate the use of temporally abstract actions, or macro-actions, in the solution of Markov decision processes. Unlike current models that combine both primitive actions and macro-actions while leaving the state space unchanged, we propose a hierarchical model (using an abstract MDP) that works with macro-actions only and significantly reduces the size of the state space. This is achieved by treating macro-actions as local policies that act in certain regions of state space, and by restricting states in the abstract MDP to those at the boundaries of regions. The abstract MDP approximates the original and can be solved more efficiently. We discuss several ways in which macro-actions can be generated to ensure good solution quality. Finally, we consider ways in which macro-actions can be reused to solve multiple, related MDPs, and we show that this can justify the computational overhead of macro-action generation.
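The construction described above can be sketched in miniature. In this hedged illustration, each macro-action (a local policy for a region) is summarized by a model giving, from each boundary state, the probability of exiting at each peripheral state together with the expected discounted reward accrued inside the region; value iteration then runs over the small set of boundary states only. All state names, numbers, and the `solve_abstract_mdp` helper are illustrative assumptions, not details from the paper.

```python
import numpy as np

# Illustrative abstract MDP: three boundary states between regions.
# Each macro-action model is a pair (exit_probs, expected_reward), where
# exit_probs[j] is the (discounted) probability of leaving the region at
# boundary state j. Rows sum to less than 1 because discounting inside
# the region is folded into the exit probabilities.
boundary_states = [0, 1, 2]
macros = {
    0: [(np.array([0.0, 0.9, 0.0]), 1.0),    # macro A: likely exits at state 1
        (np.array([0.8, 0.0, 0.1]), 0.5)],   # macro B: mostly loops back to 0
    1: [(np.array([0.1, 0.0, 0.8]), 2.0)],
    2: [(np.array([0.0, 0.85, 0.0]), 0.2)],
}

def solve_abstract_mdp(boundary_states, macros, tol=1e-8, max_iter=1000):
    """Value iteration restricted to the boundary states of the abstract MDP."""
    V = np.zeros(len(boundary_states))
    for _ in range(max_iter):
        V_new = np.array([
            max(r + probs @ V for probs, r in macros[s])
            for s in boundary_states
        ])
        if np.max(np.abs(V_new - V)) < tol:
            return V_new
        V = V_new
    return V

V = solve_abstract_mdp(boundary_states, macros)
print(V)
```

Because the iteration touches only boundary states, its cost depends on the number of regions and their peripheries rather than on the full state space, which is the source of the efficiency gain the abstract claims.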