Toward error-bounded algorithms for infinite-horizon DEC-POMDPs

  • Authors:
  • Jilles S. Dibangoye; Abdel-Illah Mouaddib; Brahim Chaib-draa

  • Affiliations:
  • Ecole des Mines de Douai, Douai, France; University of Caen, Caen, France; Laval University, Québec, QC, Canada

  • Venue:
  • The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
  • Year:
  • 2011

Abstract

Over the past few years, attempts to scale up infinite-horizon DEC-POMDPs have relied mainly on approximate algorithms, which lack the theoretical guarantees of their exact counterparts. In contrast, ε-optimal methods are of theoretical significance but are not efficient in practice. In this paper, we introduce an algorithmic framework (β-pi) that exploits the scalability of the former while preserving the theoretical properties of the latter. On top of β-pi, we build a family of approximate algorithms that can find provably error-bounded solutions in reasonable time. Within this family, h-pi uses a branch-and-bound search method that computes a near-optimal solution over distributions over the histories experienced by the agents. These distributions often lie near a structured, low-dimensional subspace embedded in the high-dimensional sufficient statistic. By planning only on this subspace, h-pi successfully solves all tested benchmarks, outperforming standard algorithms in both solution time and policy quality.