Optimal and approximate Q-value functions for decentralized POMDPs

  • Authors:
  • Frans A. Oliehoek; Matthijs T. J. Spaan; Nikos Vlassis

  • Affiliations:
  • Intelligent Systems Lab Amsterdam, University of Amsterdam, Amsterdam, The Netherlands; Institute for Systems and Robotics, Instituto Superior Técnico, Lisbon, Portugal; Department of Production Engineering and Management, Technical University of Crete, Chania, Greece

  • Venue:
  • Journal of Artificial Intelligence Research
  • Year:
  • 2008

Abstract

Decision-theoretic planning is a popular approach to sequential decision making problems, because it treats uncertainty in sensing and acting in a principled way. In single-agent frameworks like MDPs and POMDPs, planning can be carried out by resorting to Q-value functions: an optimal Q-value function Q* is computed in a recursive manner by dynamic programming, and then an optimal policy is extracted from Q*. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), and how policies can be extracted from such value functions. We define two forms of the optimal Q-value function for Dec-POMDPs: one that gives a normative description as the Q-value function of an optimal pure joint policy and another one that is sequentially rational and thus gives a recipe for computation. This computation, however, is infeasible for all but the smallest problems. Therefore, we analyze various approximate Q-value functions that allow for efficient computation. We describe how they relate, and we prove that they all provide an upper bound to the optimal Q-value function Q*. Finally, unifying some previous approaches for solving Dec-POMDPs, we describe a family of algorithms for extracting policies from such Q-value functions, and perform an experimental evaluation on existing test problems, including a new firefighting benchmark problem.
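As a point of reference for the single-agent setting the abstract builds on, the sketch below illustrates how a finite-horizon Q-value function can be computed by backward dynamic programming and how a greedy policy is extracted from it. This is only a minimal illustration of the general Q-value idea, not the Dec-POMDP value functions or algorithms developed in the paper; the state space, transition model, rewards, and horizon are placeholder assumptions.

```python
import numpy as np

# Minimal finite-horizon MDP sketch (placeholder model, not from the paper):
# Q_t(s, a) = R(s, a) + sum_{s'} P(s' | s, a) * max_{a'} Q_{t+1}(s', a')
n_states, n_actions, horizon = 3, 2, 4
rng = np.random.default_rng(0)

# Placeholder transition probabilities P[a, s, s'] and rewards R[s, a].
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))
R = rng.uniform(size=(n_states, n_actions))

# Backward dynamic programming over the stages of the horizon.
Q = np.zeros((horizon, n_states, n_actions))
for t in reversed(range(horizon)):
    for s in range(n_states):
        for a in range(n_actions):
            future = 0.0
            if t + 1 < horizon:
                # Expected value of acting greedily at the next stage.
                future = P[a, s] @ Q[t + 1].max(axis=1)
            Q[t, s, a] = R[s, a] + future

# Extract a stage-dependent greedy policy from the computed Q-values.
policy = Q.argmax(axis=2)  # policy[t, s] = best action at stage t in state s
print(policy)
```

In the decentralized setting studied in the paper, this straightforward backward recursion over states is no longer directly available, which is precisely why the authors analyze approximate Q-value functions that upper-bound the optimal one.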