We present a principled and efficient planning algorithm for collaborative multiagent dynamical systems. All computation, during both the planning and the execution phases, is distributed among the agents; each agent only needs to model and plan for a small part of the system. Each of these local subsystems is small, but once they are combined they can represent an exponentially larger problem. The subsystems are connected through a subsystem hierarchy. Coordination and communication between the agents are not imposed, but derived directly from the structure of this hierarchy. A globally consistent plan is achieved by a message-passing algorithm, where messages correspond to natural local reward functions and are computed by local linear programs; another message-passing algorithm allows us to execute the resulting policy. When two portions of the hierarchy share the same structure, our algorithm can reuse plans and messages to speed up computation.
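The message-passing idea can be illustrated with a minimal sketch over a toy subsystem tree. Here plain enumeration over a small variable domain stands in for the local linear programs of the paper, and each subsystem's reward couples only its own variable with its parent's; the names `Subsystem`, `upward`, and `decode` are illustrative, not the paper's API:

```python
class Subsystem:
    """One node of a subsystem hierarchy. Its local reward depends on its own
    variable and (for non-root nodes) on the parent's variable."""

    def __init__(self, name, values, reward, children=()):
        self.name = name
        self.values = values      # domain of this subsystem's local variable
        self.reward = reward      # reward(parent_value, own_value) -> float
        self.children = list(children)

    def upward(self, parent_value):
        # Upward message: for a fixed parent value, the best achievable reward
        # in this subtree and the maximizing local value. (Enumeration stands
        # in for the local LP of the paper.)
        best_v, best_r = None, float("-inf")
        for v in self.values:
            r = self.reward(parent_value, v)
            r += sum(c.upward(v)[1] for c in self.children)
            if r > best_r:
                best_v, best_r = v, r
        return best_v, best_r

    def decode(self, parent_value, assignment):
        # Downward pass: fix this subsystem's maximizing value given the
        # parent's choice, then let the children decode against it.
        v, _ = self.upward(parent_value)
        assignment[self.name] = v
        for c in self.children:
            c.decode(v, assignment)
        return assignment


# Toy hierarchy: subsystem B earns reward 2 for matching A's variable;
# subsystem A earns reward 1 for choosing value 1.
leaf = Subsystem("B", [0, 1], lambda p, v: 2.0 if v == p else 0.0)
root = Subsystem("A", [0, 1], lambda p, v: 1.0 if v == 1 else 0.0,
                 children=[leaf])
plan = root.decode(None, {})
print(plan)  # {'A': 1, 'B': 1}: A sacrifices nothing, B coordinates with A
```

Only the message between a subsystem and its parent crosses the hierarchy boundary, which is what lets each agent plan against a small local model while the joint plan stays globally consistent.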