Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs

Authors:
Jilles S. Dibangoye;Abdel-Illah Mouaddib;Brahim Chai-draa
Affiliations:
University of Caen, Caen, France;University of Caen, Caen, France;Laval University, Québec, Qc, Canada
Venue:
Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Year:
2009

Citing 11
Cited 9

General branch and bound, and its relation to A* and AO*

Artificial Intelligence
The Complexity of Decentralized Control of Markov Decision Processes

Mathematics of Operations Research
The complexity of multiagent systems: the price of silence

AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
Dynamic Programming

Dynamic Programming
Letting loose a SPIDER on a network of POMDPs: generating quality guaranteed policies

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Value-based observation compression for DEC-POMDPs

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Dynamic programming for partially observable stochastic games

AAAI'04 Proceedings of the 19th national conference on Artifical intelligence
Point-based dynamic programming for DEC-POMDPs

AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Networked distributed POMDPs: a synthesis of distributed constraint optimization and POMDPs

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 1
Memory-bounded dynamic programming for DEC-POMDPs

IJCAI'07 Proceedings of the 20th international joint conference on Artifical intelligence
Point-based value iteration: an anytime algorithm for POMDPs

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence

Heuristic search for identical payoff Bayesian games

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Point-based policy generation for decentralized POMDPs

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Point-based backup for decentralized POMDPs: complexity and new algorithms

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems: volume 1 - Volume 1
Online planning for multi-agent systems with bounded communication

Artificial Intelligence
Toward error-bounded algorithms for infinite-horizon DEC-POMDPs

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Social Model Shaping for Solving Generic DEC-POMDPs

WI-IAT '11 Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02
Solving decentralized POMDP problems using genetic algorithms

Autonomous Agents and Multi-Agent Systems
Producing efficient error-bounded solutions for transition independent decentralized mdps

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Incremental clustering and expansion for faster optimal planning in decentralized POMDPs

Journal of Artificial Intelligence Research

Quantified Score

Hi-index	0.00

Visualization

Abstract

Recent scaling up of decentralized partially observable Markov decision process (DEC-POMDP) solvers towards realistic applications is mainly due to approximate methods. Of this family, Memory Bounded Dynamic Programming (MBDP), which combines in a suitable manner top-down heuristics and bottom-up value function updates, can solve DEC-POMDPs with large horizons. The performances of MBDP, can be, however, drastically improved by avoiding the systematic generation and evaluation of all possible policies which result from the exhaustive backup. To achieve that, we suggest a heuristic search method, namely Point Based Incremental Pruning (PBIP), which is able to distinguish policies with different heuristic estimates. Taking this insight into account, PBIP searches only among the most promising policies, finds those useful, and prunes dominated ones. Doing so permits us to reduce clearly the amount of computation required by the exhaustive backup. The computation experiment shows that PBIP solves DEC-POMDP benchmarks up to 800 times faster than the current best approximate algorithms, while providing solutions with higher values.