The permutable POMDP: fast solutions to POMDPs for preference elicitation

  • Authors:
  • Finale Doshi; Nicholas Roy

  • Affiliations:
  • CSAIL MIT, Cambridge, MA; CSAIL MIT, Cambridge, MA

  • Venue:
  • Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS) - Volume 1
  • Year:
  • 2008

Abstract

The ability of an agent to reason under uncertainty is crucial for many planning applications, since an agent rarely has access to complete, error-free information about its environment. Partially Observable Markov Decision Processes (POMDPs) are a desirable framework in these planning domains because the resulting policies allow the agent to reason about its own uncertainty. In domains with hidden state and noisy observations, POMDPs optimally trade off actions that increase an agent's knowledge against actions that increase an agent's reward. Unfortunately, for many real-world problems, even approximating good POMDP solutions is computationally intractable without leveraging structure in the problem domain. We show that the structure of many preference elicitation problems---in which the agent must discover some hidden preference or desire from another (usually human) agent---allows the POMDP to be solved with exponentially fewer belief points than standard point-based approximations require, while retaining the quality of the solution.
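
To make the permutation idea concrete, the sketch below (in Python, not code from the paper) shows how two belief points arising from symmetric questions in a preference elicitation dialog are permutations of one another, and therefore collapse to a single canonical point once their probabilities are sorted. The likelihood vectors and the sorting-based canonicalization are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def belief_update(belief, obs_likelihood):
    """One Bayesian filter step: weight each candidate preference by the
    likelihood of the observed answer, then renormalize."""
    posterior = belief * obs_likelihood
    return posterior / posterior.sum()

def canonical(belief):
    """Map a belief to the representative of its permutation class by
    sorting its probabilities in descending order."""
    return np.sort(belief)[::-1]

# Three candidate preferences with a uniform prior.
b0 = np.ones(3) / 3.0

# Hypothetical likelihoods of a "yes" answer to two symmetric questions:
# question 1 probes preference 0, question 2 probes preference 2.
yes_q1 = np.array([0.9, 0.4, 0.1])
yes_q2 = np.array([0.1, 0.4, 0.9])

b1 = belief_update(b0, yes_q1)   # ~[0.64, 0.29, 0.07]
b2 = belief_update(b0, yes_q2)   # ~[0.07, 0.29, 0.64]

# Distinct belief points, but the same canonical (sorted) point:
print(canonical(b1))             # [0.64285714 0.28571429 0.07142857]
print(canonical(b2))             # identical to the line above
```

Because every permutation class is represented by a single sorted belief, a point-based solver that operates on canonical points covers all permutations of each point at once, which is the intuition behind the exponential reduction in belief points the abstract describes.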