We present a bimodal method for online planning in partially observable multiagent settings, as formalized by the finitely-nested interactive partially observable Markov decision process (I-POMDP). An agent planning in an environment shared with another agent updates beliefs over both the physical state and the other agent's models. In problems where the other's action is not observed explicitly but must be inferred from its effect on the state, observations are more informative about the other agent when the belief over the state space has low uncertainty. For typical, uncertain initial beliefs, we therefore model the agent as if it were acting alone and use fast online planning for POMDPs. Subsequently, the agent switches to online planning in the multiagent setting. We maintain tight lower and upper bounds at each step and switch over when the difference between them falls below ε.
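The switching rule in the abstract can be sketched as follows. This is a minimal illustration, not the paper's algorithm: the bound functions, the permanent switch, and the ε value are all assumptions for the example, standing in for the I-POMDP planner's actual lower and upper value bounds.

```python
def bimodal_plan(belief_steps, lower_bound, upper_bound, epsilon=0.1):
    """Return, per step, which planning mode was used.

    lower_bound / upper_bound: callables giving value bounds at a step
    (illustrative stand-ins for the planner's maintained bounds).
    Switches permanently to multiagent (I-POMDP) planning once the
    bound gap drops below epsilon.
    """
    modes = []
    switched = False
    for t in belief_steps:
        if not switched and upper_bound(t) - lower_bound(t) < epsilon:
            # Uncertainty is low enough that observations are
            # informative about the other agent: model it explicitly.
            switched = True
        modes.append("I-POMDP" if switched else "POMDP")
    return modes

# Toy bounds that tighten over time (assumed purely for illustration).
ub = lambda t: 10.0 / (t + 1)   # shrinking upper bound
lb = lambda t: 0.0              # fixed lower bound

print(bimodal_plan(range(5), lb, ub, epsilon=3.0))
# → ['POMDP', 'POMDP', 'POMDP', 'I-POMDP', 'I-POMDP']
```

In the actual method the bounds come from the online planners themselves; here they are closed-form functions only so the switch point is easy to see.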