Decentralized planning in uncertain environments is a complex task, generally addressed with a decision-theoretic approach, most often through the framework of Decentralized Partially Observable Markov Decision Processes (DEC-POMDPs). Although DEC-POMDPs are a general and powerful modeling tool, solving them is overwhelmingly complex, with worst-case cost that can be doubly exponential. In this paper, we study an alternative formulation of DEC-POMDPs based on a sequence-form representation of policies. From this formulation, we show how to derive Mixed Integer Linear Programming (MILP) problems that, once solved, give exact optimal solutions to the DEC-POMDPs. We show that these MILPs can be derived either from combinatorial characteristics of the optimal solutions of the DEC-POMDPs or from concepts borrowed from game theory. Through an experimental validation on classical test problems from the DEC-POMDP literature, we compare our approach to existing algorithms. The results show that mathematical programming outperforms dynamic programming but is less efficient than forward search, except on some particular problems. The main contributions of this work are the use of mathematical programming for solving DEC-POMDPs and a better understanding of DEC-POMDPs and their solutions. We also argue that our alternative representation of DEC-POMDPs could be helpful for designing novel algorithms that look for approximate solutions to DEC-POMDPs.
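To give a flavor of the MILP machinery the abstract refers to, the toy model below is a minimal sketch (not the paper's actual formulation): a product of two binary policy-selection variables, z = x1 * x2, is linearized with the standard three inequalities so a linear solver can handle it. The reward numbers are made up for illustration, and the sketch assumes SciPy's generic `milp` solver rather than any solver used in the paper.

```python
import numpy as np
from scipy.optimize import milp, LinearConstraint, Bounds

# Variables: x1, x2 in {0,1} (each agent "selects" a policy),
# z in {0,1} standing in for the product x1 * x2 (joint selection).
# Objective: maximize 2*x1 + 3*x2 + 5*z; milp minimizes, so negate.
c = np.array([-2.0, -3.0, -5.0])

# Standard linearization of z = x1 * x2 for binaries:
#   z <= x1,  z <= x2,  z >= x1 + x2 - 1
A = np.array([[-1.0,  0.0, 1.0],   # z - x1 <= 0
              [ 0.0, -1.0, 1.0],   # z - x2 <= 0
              [ 1.0,  1.0, -1.0]]) # x1 + x2 - z <= 1
constraints = LinearConstraint(A, -np.inf, [0.0, 0.0, 1.0])

res = milp(c, constraints=constraints,
           integrality=np.ones(3),   # all three variables integer (binary)
           bounds=Bounds(0, 1))
print(res.x, -res.fun)  # optimum: x1 = x2 = z = 1, value 10
```

The same linearization pattern is what makes products of per-agent policy variables tractable for off-the-shelf MILP solvers, at the price of extra variables and constraints.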