Point-based incremental pruning heuristic for solving finite-horizon DEC-POMDPs

  • Authors:
  • Jilles S. Dibangoye; Abdel-Illah Mouaddib; Brahim Chaib-draa

  • Affiliations:
  • University of Caen, Caen, France; University of Caen, Caen, France; Laval University, Québec, QC, Canada

  • Venue:
  • Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
  • Year:
  • 2009

Abstract

The recent scaling up of decentralized partially observable Markov decision process (DEC-POMDP) solvers towards realistic applications is mainly due to approximate methods. Within this family, Memory Bounded Dynamic Programming (MBDP), which combines top-down heuristics with bottom-up value function updates, can solve DEC-POMDPs with large horizons. The performance of MBDP can, however, be drastically improved by avoiding the systematic generation and evaluation of all possible policies that result from the exhaustive backup. To achieve this, we propose a heuristic search method, Point Based Incremental Pruning (PBIP), which distinguishes policies by their heuristic estimates. Using these estimates, PBIP searches only among the most promising policies, identifies the useful ones, and prunes the dominated ones. This substantially reduces the amount of computation required by the exhaustive backup. Computational experiments show that PBIP solves DEC-POMDP benchmarks up to 800 times faster than the current best approximate algorithms, while providing solutions with higher values.
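
The idea of searching only among the most promising policies and pruning dominated ones can be illustrated with a generic best-first search. The following is a minimal sketch under stated assumptions, not the authors' PBIP implementation: candidates stand in for the joint policies considered at one belief point, `upper_bound` is a hypothetical cheap heuristic assumed never to underestimate the true value, and `exact_value` is a hypothetical expensive evaluation. All names and numbers are illustrative.

```python
import heapq
from typing import Callable, Hashable, Iterable, Optional, Tuple


def best_candidate(
    candidates: Iterable[Hashable],
    upper_bound: Callable[[Hashable], float],
    exact_value: Callable[[Hashable], float],
) -> Tuple[Optional[Hashable], float, int]:
    """Return (best candidate, its exact value, number of exact evaluations).

    Assumption: upper_bound(c) >= exact_value(c) for every candidate c.
    """
    # heapq is a min-heap, so bounds are negated to pop the largest first.
    # The index i breaks ties so candidates are never compared directly.
    heap = [(-upper_bound(c), i, c) for i, c in enumerate(candidates)]
    heapq.heapify(heap)

    best, best_value, evaluated = None, float("-inf"), 0
    while heap:
        neg_ub, _, cand = heapq.heappop(heap)
        if -neg_ub <= best_value:
            # No remaining candidate can beat the incumbent: prune them all.
            break
        value = exact_value(cand)  # the expensive step we try to avoid
        evaluated += 1
        if value > best_value:
            best, best_value = cand, value
    return best, best_value, evaluated


# Toy data (illustrative only): four candidates with made-up values.
EXACT = {"a": 5.0, "b": 3.0, "c": 1.0, "d": 4.5}
UPPER = {"a": 6.0, "b": 4.0, "c": 7.0, "d": 4.8}

result = best_candidate(EXACT, UPPER.__getitem__, EXACT.__getitem__)
```

On this toy data only two of the four candidates are evaluated exactly: once the incumbent's value exceeds every remaining upper bound, the rest are discarded without evaluation, which is the saving that an exhaustive backup forgoes.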