Value-directed sampling methods for monitoring POMDPs

  • Authors:
  • Pascal Poupart; Luis E. Ortiz; Craig Boutilier

  • Affiliations:
  • Department of Computer Science, University of Toronto, Toronto, ON; Computer Science Department, Brown University, Providence, RI; Department of Computer Science, University of Toronto, Toronto, ON

  • Venue:
  • UAI'01: Proceedings of the Seventeenth Conference on Uncertainty in Artificial Intelligence
  • Year:
  • 2001


Abstract

We consider the problem of approximate belief-state monitoring using particle filtering for the purpose of implementing a policy for a partially observable Markov decision process (POMDP). While particle filtering has become a widely used tool in AI for monitoring dynamical systems, rather scant attention has been paid to its use in the context of decision making. Assuming the existence of a value function, we derive error bounds on the decision quality associated with filtering using importance sampling. We also describe an adaptive procedure that can be used to dynamically determine the number of samples required to meet specific error bounds. Empirical evidence is offered supporting this technique as a profitable means of directing sampling effort where it is needed to distinguish policies.
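The abstract itself does not include the algorithm, but the importance-sampling belief update it builds on can be sketched as follows. This is a minimal illustration on a hypothetical two-state system with made-up transition and observation probabilities, not the authors' method or their adaptive sample-size procedure: each particle is propagated through the transition model, weighted by the observation likelihood, and the set is resampled in proportion to those weights.

```python
import random

# Hypothetical two-state system (illustrative only, not from the paper).
STATES = (0, 1)

def transition(s, a):
    # Assumed dynamics: any action flips the state with probability 0.8.
    return 1 - s if random.random() < 0.8 else s

def obs_prob(o, s):
    # Assumed noisy sensor: reports the true state with probability 0.9.
    return 0.9 if o == s else 0.1

def particle_filter_update(particles, a, o):
    """One step of importance-sampling belief monitoring:
    propagate particles, weight by observation likelihood, resample."""
    propagated = [transition(s, a) for s in particles]
    weights = [obs_prob(o, s) for s in propagated]
    total = sum(weights)
    if total == 0:  # degenerate case: no particle explains o; keep propagated set
        return propagated
    return random.choices(propagated, weights=weights, k=len(particles))

random.seed(0)
belief = [random.choice(STATES) for _ in range(200)]  # uniform prior
belief = particle_filter_update(belief, a=0, o=1)
print(sum(belief) / len(belief))  # estimated Pr(state = 1) under the particle belief
```

The paper's contribution concerns how many particles such an update needs: with a value function in hand, the error bounds let the sample size be chosen adaptively so that sampling effort is spent where it matters for distinguishing policies.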