The ability to reason under uncertainty is crucial for many planning applications, since an agent rarely has access to complete, error-free information about its environment. Partially Observable Markov Decision Processes (POMDPs) are an attractive framework for such planning domains because the resulting policies let the agent reason about its own uncertainty. In domains with hidden state and noisy observations, POMDP policies optimally trade off actions that increase the agent's knowledge against actions that increase its reward. Unfortunately, for many real-world problems, even approximating good POMDP solutions is computationally intractable without exploiting structure in the problem domain. We show that the structure of many preference elicitation problems, in which the agent must discover some hidden preference or desire of another (usually human) agent, allows the POMDP to be solved with exponentially fewer belief points than standard point-based approximations require, while retaining solution quality.