An optimal best-first search algorithm for solving infinite horizon DEC-POMDPs

Authors:
Daniel Szer;François Charpillet
Affiliations:
MAIA Group, INRIA Lorraine – LORIA, Vandœuvre-lès-Nancy, France;MAIA Group, INRIA Lorraine – LORIA, Vandœuvre-lès-Nancy, France
Venue:
ECML'05 Proceedings of the 16th European conference on Machine Learning
Year:
2005

Citing 14
Cited 10

An improved policy iteration algorithm for partially observable MDPs

NIPS '97 Proceedings of the 1997 conference on Advances in neural information processing systems 10
Decentralized supply chains subject to information delays

Management Science
A heuristic approach for solving decentralized-POMDP: assessment on the pursuit problem

Proceedings of the 2002 ACM symposium on Applied computing
Markov Decision Processes: Discrete Stochastic Dynamic Programming

Markov Decision Processes: Discrete Stochastic Dynamic Programming
The Complexity of Decentralized Control of Markov Decision Processes

Mathematics of Operations Research
Dynamic programming for partially observable stochastic games

AAAI'04 Proceedings of the 19th national conference on Artifical intelligence
The communicative multiagent team decision problem: analyzing teamwork theories and models

Journal of Artificial Intelligence Research
Solving transition independent decentralized Markov decision processes

Journal of Artificial Intelligence Research
Sequential optimality and coordination in multiagent systems

IJCAI'99 Proceedings of the 16th international joint conference on Artifical intelligence - Volume 1
Taming decentralized POMDPs: towards efficient policy computation for multiagent settings

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Bounded policy iteration for decentralized POMDPs

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Planning and acting in partially observable stochastic domains

Artificial Intelligence
Solving POMDPs by searching the space of finite policies

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Learning to cooperate via policy search

UAI'00 Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence

Decentralized planning under uncertainty for teams of communicating agents

AAMAS '06 Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
Achieving goals in decentralized POMDPs

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Point-based dynamic programming for DEC-POMDPs

AAAI'06 proceedings of the 21st national conference on Artificial intelligence - Volume 2
Optimal and approximate Q-value functions for decentralized POMDPs

Journal of Artificial Intelligence Research
Policy iteration for decentralized control of Markov decision processes

Journal of Artificial Intelligence Research
Introducing Communication in Dis-POMDPs with Finite State Machines

WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs

Autonomous Agents and Multi-Agent Systems
Toward error-bounded algorithms for infinite-horizon DEC-POMDPs

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
Efficient planning for factored infinite-horizon DEC-POMDPs

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Volume One
Solving decentralized POMDP problems using genetic algorithms

Autonomous Agents and Multi-Agent Systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

In the domain of decentralized Markov decision processes, we develop the first complete and optimal algorithm that is able to extract deterministic policy vectors based on finite state controllers for a cooperative team of agents. Our algorithm applies to the discounted infinite horizon case and extends best-first search methods to the domain of decentralized control theory. We prove the optimality of our approach and give some first experimental results for two small test problems. We believe this to be an important step forward in learning and planning in stochastic multi-agent systems.