Multiagent planning has seen much progress with the development of formal models such as decentralized POMDPs (Dec-POMDPs). However, the complexity of these models (NEXP-complete even for two agents) has limited scalability. We identify certain mild conditions that are sufficient to make multiagent planning amenable to an approximation that scales in the number of agents. This is achieved by constructing a graphical model in which likelihood maximization is equivalent to plan optimization. Using the Expectation-Maximization (EM) framework for likelihood maximization, we show that the necessary inference can be decomposed into processes that often involve only a small subset of agents, thereby facilitating scalability. We derive a global update rule that combines these local inferences to monotonically increase the overall solution quality. Experiments on a large multiagent planning benchmark confirm the benefits of the new approach in terms of runtime and scalability.
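The planning-as-inference idea underlying the EM approach can be illustrated on a deliberately tiny case. The sketch below is a hypothetical toy, not the paper's algorithm: for a one-shot two-agent team problem, a nonnegative team reward is treated as a likelihood, the E-step computes a posterior over joint actions, and each agent's M-step uses only its own marginal of that posterior. Each iteration of this reward-weighted EM monotonically increases expected team reward; the reward matrix and all variable names are illustrative assumptions.

```python
import numpy as np

# Toy planning-as-inference sketch (hypothetical example, not the paper's
# Dec-POMDP algorithm): two agents, one-shot joint action (a1, a2),
# nonnegative team reward R[a1, a2] treated as a likelihood.

R = np.array([[1.0, 0.2],
              [0.2, 2.0]])          # hypothetical team reward matrix

pi1 = np.full(2, 0.5)               # agent 1's stochastic policy over actions
pi2 = np.full(2, 0.5)               # agent 2's stochastic policy over actions

for _ in range(50):
    # E-step: posterior over joint actions, with reward as the likelihood
    q = np.outer(pi1, pi2) * R
    q /= q.sum()
    # M-step: each agent updates from its *local* marginal of the posterior
    pi1 = q.sum(axis=1)
    pi2 = q.sum(axis=0)

expected_reward = pi1 @ R @ pi2     # approaches the best joint reward (2.0)
```

In this toy case the M-step needs only each agent's marginal, mirroring the abstract's point that inference decomposes into local computations combined by a global update that never decreases solution quality.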