Monte Carlo sampling methods for approximating interactive POMDPs

Authors:
Prashant Doshi;Piotr J. Gmytrasiewicz
Affiliations:
Department of Computer Science, University of Georgia, Athens, GA;Department of Computer Science, University of Illinois at Chicago, Chicago, IL
Venue:
Journal of Artificial Intelligence Research
Year:
2009

Abstract

Partially observable Markov decision processes (POMDPs) provide a principled framework for sequential planning in uncertain single-agent settings. An extension of POMDPs to multiagent settings, called interactive POMDPs (I-POMDPs), replaces POMDP belief spaces with interactive hierarchical belief systems, which represent an agent's belief about the physical world, about the beliefs of other agents, and about their beliefs about others' beliefs. This modification makes the difficulties of obtaining solutions, which stem from the complexity of the belief and policy spaces, even more acute. We describe a general method for obtaining approximate solutions of I-POMDPs based on particle filtering (PF). We introduce the interactive PF, which descends the levels of the interactive belief hierarchies and samples and propagates beliefs at each level. The interactive PF mitigates the belief space complexity, but it does not address the policy space complexity. To mitigate the policy space complexity, sometimes also called the curse of history, we use a complementary method based on sampling likely observations while building the look-ahead reachability tree. While this approach does not completely address the curse of history, it substantially reduces the curse's impact. We provide experimental results and chart future work.
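
As a rough illustration of the two sampling ideas named in the abstract, the following Python sketch shows a plain single-agent bootstrap particle filter step (propagate, weight, resample) and a routine that samples likely observations rather than enumerating every branch of a look-ahead tree. It is not the authors' interactive PF. The two-state world, the 0.85 sensor accuracy, the "listen" action, and every function name are hypothetical and chosen only to keep the example small.

```python
import random
from collections import Counter

STATES = ("left", "right")  # hypothetical two-state world (assumption)


def toy_transition(state, action, rng):
    """Assumed dynamics: the state does not change under any action."""
    return state


def toy_obs_likelihood(observation, state, action):
    """Assumed sensor: reports the true state with probability 0.85."""
    return 0.85 if observation == state else 0.15


def toy_sample_observation(state, action, rng):
    """Draw one observation from the assumed sensor model."""
    if rng.random() < 0.85:
        return state
    return "right" if state == "left" else "left"


def particle_filter_step(particles, action, observation, rng):
    """One propagate / weight / resample step of a bootstrap particle filter."""
    # Propagate: sample a successor state for every particle.
    propagated = [toy_transition(s, action, rng) for s in particles]
    # Weight: score each successor by the likelihood of the observation.
    weights = [toy_obs_likelihood(observation, s, action) for s in propagated]
    if sum(weights) == 0:
        # No particle explains the observation; fall back to uniform weights.
        weights = [1.0] * len(propagated)
    # Resample: draw an unweighted particle set in proportion to the weights.
    return rng.choices(propagated, weights=weights, k=len(particles))


def sample_observations(particles, action, n_samples, rng):
    """Sample likely observations for a belief instead of enumerating all."""
    counts = Counter()
    for _ in range(n_samples):
        state = rng.choice(particles)                     # state from the belief
        next_state = toy_transition(state, action, rng)   # simulate the action
        counts[toy_sample_observation(next_state, action, rng)] += 1
    return counts  # observations seen, i.e. the branches worth expanding


rng = random.Random(0)
particles = [rng.choice(STATES) for _ in range(1000)]  # roughly uniform prior
for _ in range(3):
    particles = particle_filter_step(particles, "listen", "left", rng)
belief_left = particles.count("left") / len(particles)
branches = sample_observations(particles, "listen", 20, rng)
```

In the interactive setting the abstract describes, each particle would also have to carry a sampled belief of the other agent, updated recursively at each level of the hierarchy; that nesting is omitted from this sketch.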