Scaling up optimal heuristic search in Dec-POMDPs via incremental expansion

Authors:
Matthijs T. J. Spaan; Frans A. Oliehoek; Christopher Amato
Affiliations:
Inst. for Systems and Robotics, Instituto Superior Técnico, Lisbon, Portugal; CSAIL, Massachusetts Inst. of Technology, Cambridge, MA; Aptima, Inc., Woburn, MA
Venue:
IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Year:
2011


Abstract

Planning under uncertainty for multiagent systems can be formalized as a decentralized partially observable Markov decision process (Dec-POMDP). We advance the state of the art for optimally solving this model, building on the Multiagent A* heuristic search method. A key insight is that we can avoid fully expanding a search node, which would generate a number of children that is doubly exponential in the node's depth. Instead, we expand the children incrementally, generating the next child only when it might have the highest heuristic value. We then target a subsequent bottleneck by introducing a more memory-efficient representation for our heuristic functions. We prove that the resulting algorithm is correct, and experiments demonstrate a significant speedup over the state of the art, allowing optimal solutions over longer horizons for many benchmark problems.
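The incremental-expansion idea can be illustrated with a minimal Python sketch of a generic best-first search. This is not the authors' implementation: the function names (`incremental_best_first`, `toy_search`), the toy reward problem, and the assumption that a node's children can be produced in non-increasing order of heuristic value are all hypothetical choices made for illustration. Each time a node is popped, only its next-best child is generated, and the node is reinserted as a placeholder whose bound equals that child's heuristic value, which upper-bounds every sibling not yet generated.

```python
import heapq
from itertools import count


def incremental_best_first(root, children_in_order, upper_bound, is_goal):
    """Best-first search (maximization) with lazy child generation.

    Assumptions of this sketch: upper_bound never underestimates the best
    value reachable from a node and is exact for goal nodes, and
    children_in_order(node) yields children in non-increasing upper_bound
    order. Returns (goal_node, number_of_children_generated).
    """
    tie = count()  # unique tie-breaker so the heap never compares nodes
    # Heap entries: (-bound, tie, node, iterator over remaining children).
    open_list = [(-upper_bound(root), next(tie), root, None)]
    generated = 0
    while open_list:
        _, _, node, child_iter = heapq.heappop(open_list)
        if is_goal(node):
            return node, generated
        if child_iter is None:
            child_iter = children_in_order(node)
        child = next(child_iter, None)
        if child is None:
            continue  # every child of this node has been generated
        generated += 1
        bound = upper_bound(child)
        heapq.heappush(open_list, (-bound, next(tie), child, None))
        # Placeholder for the parent: no remaining sibling can exceed `bound`.
        heapq.heappush(open_list, (-bound, next(tie), node, child_iter))
    return None, generated


def toy_search(rewards, horizon):
    """Toy problem: pick one reward per step and maximize the total."""
    best = max(rewards)

    def upper_bound(node):
        # Reward collected so far plus the best possible remaining reward.
        return sum(node) + best * (horizon - len(node))

    def children_in_order(node):
        # Larger immediate reward implies a larger bound in this toy problem.
        for r in sorted(rewards, reverse=True):
            yield node + (r,)

    def is_goal(node):
        return len(node) == horizon

    return incremental_best_first((), children_in_order, upper_bound, is_goal)
```

Calling `toy_search((5, 3, 1), 3)` returns the optimal reward sequence together with the number of children actually generated. In this toy problem that count stays below what expanding every popped node in full would produce, because siblings with low bounds are never created.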