Solving POMDPs: RTDP-bel vs. point-based algorithms

  • Authors:
  • Blai Bonet;Héctor Geffner

  • Affiliations:
  • Departamento de Computación, Universidad Simón Bolívar, Caracas, Venezuela;Departamento de Tecnología, ICREA & Universitat Pompeu Fabra, Barcelona, Spain

  • Venue:
  • IJCAI'09 Proceedings of the 21st international jont conference on Artifical intelligence
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

Point-based algorithms and RTDP-Bel are approximate methods for solving POMDPs that replace the full updates of parallel value iteration by faster and more effective updates at selected beliefs. An important difference between the two methods is that the former adopt Sondik's representation of the value function, while the latter uses a tabular representation and a discretization function. The algorithms, however, have not been compared up to now, because they target different POMDPs: discounted POMDPs on the one hand, and Goal POMDPs on the other. In this paper, we bridge this representational gap, showing how to transform discounted POMDPs into Goal POMDPs, and use the transformation to compare RTDP-Bel with point-based algorithms over the existing discounted benchmarks. The results appear to contradict the conventional wisdom in the area showing that RTDP-Bel is competitive, and sometimes superior to point-based algorithms in both quality and time.