The ability to reason under uncertainty is crucial for many planning applications, since an agent rarely has access to complete, error-free information about its environment. Partially Observable Markov Decision Processes (POMDPs) are an attractive framework for such planning domains because the resulting policies let the agent reason about its own uncertainty. In domains with hidden state and noisy observations, POMDP policies optimally trade off actions that increase the agent's knowledge against actions that increase its reward. Unfortunately, for many real-world problems, even approximating good POMDP solutions is computationally intractable without exploiting structure in the problem domain. We show that the structure of many preference elicitation problems, in which the agent must discover some hidden preference or desire of another (usually human) agent, allows the POMDP to be solved with exponentially fewer belief points than standard point-based approximations require, while retaining solution quality.