Distributed Partially Observable Markov Decision Problems (distributed POMDPs) have emerged as a popular approach for modeling multiagent systems, and many algorithms have been proposed to obtain locally or globally optimal policies. Unfortunately, most of these algorithms are either explicitly designed around, or experimentally evaluated assuming, knowledge of a starting belief point, an assumption that often does not hold in complex, uncertain domains. In such domains, agents must instead plan explicitly over continuous belief spaces. This paper provides a novel algorithm that computes finite-horizon policies over continuous belief spaces without restricting the space of policies. By marrying an efficient single-agent POMDP solver with a heuristic distributed POMDP policy-generation algorithm, it obtains a set of locally optimal joint policies, each of which dominates within a different part of the belief space. We provide heuristics that significantly improve the efficiency of the resulting algorithm, and we report detailed experimental results. To the best of our knowledge, these are the first run-time results for analytically generating policies over continuous belief spaces in distributed POMDPs.
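The idea of a set of policies, each dominating in a different part of the belief space, can be illustrated with a minimal sketch. This is not the paper's algorithm; it only assumes, as in standard POMDP theory, that each candidate joint policy has a value function that is linear in the belief (an alpha-vector), so the policy to execute at any belief point is the one whose vector attains the maximum there. All names and numbers below are hypothetical.

```python
# Hedged sketch: selecting the dominant policy at a belief point, given
# candidate policies whose value functions are linear over the belief
# simplex (alpha-vectors). The argmax induces a partition of continuous
# belief space into regions, one per dominating policy.

def policy_value(alpha, belief):
    """Expected value of a policy with linear value function:
    V(b) = sum_s alpha[s] * b[s]."""
    return sum(a * b for a, b in zip(alpha, belief))

def dominant_policy(alphas, belief):
    """Index of the candidate policy that dominates at this belief."""
    return max(range(len(alphas)),
               key=lambda i: policy_value(alphas[i], belief))

# Two hypothetical joint policies over a 2-state domain.
alphas = [
    [10.0, 0.0],  # policy 0: high value when state 0 is likely
    [2.0, 8.0],   # policy 1: high value when state 1 is likely
]

print(dominant_policy(alphas, [0.9, 0.1]))  # policy 0 dominates here
print(dominant_policy(alphas, [0.2, 0.8]))  # policy 1 dominates here
```

Solving for the belief where the two linear value functions intersect gives the boundary between the two dominance regions; planning over continuous belief space amounts to covering the whole simplex with such regions.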