Constraint-based dynamic programming for decentralized POMDPs with structured interactions

Authors:
Akshat Kumar;Shlomo Zilberstein
Affiliations:
University of Massachusetts, Amherst, MA;University of Massachusetts, Amherst, MA
Venue:
Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Year:
2009

Citing 16
Cited 9

Computationally feasible bounds for partially observed Markov decision processes

Operations Research
Bucket elimination: a unifying framework for reasoning

Artificial Intelligence
The Complexity of Decentralized Control of Markov Decision Processes

Mathematics of Operations Research
Computing Factored Value Functions for Policies in Structured MDPs

IJCAI '99 Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence
Distributed Sensor Networks: A Multiagent Perspective

Distributed Sensor Networks: A Multiagent Perspective
Exploiting structure in decentralized markov decision processes

Exploiting structure in decentralized markov decision processes
Letting loose a SPIDER on a network of POMDPs: generating quality guaranteed policies

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Not all agents are equal: scaling up distributed POMDPs for agent networks

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Value-based observation compression for DEC-POMDPs

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Dynamic programming for partially observable stochastic games

AAAI'04 Proceedings of the 19th national conference on Artifical intelligence
Networked distributed POMDPs: a synthesis of distributed constraint optimization and POMDPs

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 1
Perseus: randomized point-based value iteration for POMDPs

Journal of Artificial Intelligence Research
Memory-bounded dynamic programming for DEC-POMDPs

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Forward search value iteration for POMDPs

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Point-based value iteration: an anytime algorithm for POMDPs

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
A scalable method for multiagent constraint optimization

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence

Event-detecting multi-agent MDPs: complexity and constant-factor approximation

IJCAI'09 Proceedings of the 21st international jont conference on Artifical intelligence
Point-based backup for decentralized POMDPs: complexity and new algorithms

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Towards a unifying characterization for quantifying weak coupling in dec-POMDPs

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Message-passing algorithms for large structured decentralized POMDPs

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Solving decentralized POMDP problems using genetic algorithms

Autonomous Agents and Multi-Agent Systems
Approximate solutions for factored Dec-POMDPs with many agents

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Incremental clustering and expansion for faster optimal planning in decentralized POMDPs

Journal of Artificial Intelligence Research
Optimally solving dec-POMDPs as continuous-state MDPs

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Automated generation of interaction graphs for value-factored dec-POMDPs

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence

Quantified Score

Hi-index	0.02

Visualization

Abstract

Decentralized partially observable MDPs (DEC-POMDPs) provide a rich framework for modeling decision making by a team of agents. Despite rapid progress in this area, the limited scalability of solution techniques has restricted the applicability of the model. To overcome this computational barrier, research has focused on restricted classes of DEC-POMDPs, which are easier to solve yet rich enough to capture many practical problems. We present CBDP, an efficient and scalable point-based dynamic programming algorithm for one such model called ND-POMDP (Network Distributed POMDP). Specifically, CBDP provides magnitudes of speedup in the policy computation and generates better quality solution for all test instances. It has linear complexity in the number of agents and horizon length. Furthermore, the complexity per horizon for the examined class of problems is exponential only in a small parameter that depends upon the interaction among the agents, achieving significant scalability for large, loosely coupled multi-agent systems. The efficiency of CBDP lies in exploiting the structure of interactions using constraint networks. These results extend significantly the effectiveness of decision-theoretic planning in multi-agent settings.