Q-value functions for decentralized POMDPs

Authors:
Frans A. Oliehoek; Nikos Vlassis
Affiliations:
University of Amsterdam, Kruislaan, Amsterdam, The Netherlands; University of Amsterdam, Kruislaan, Amsterdam, The Netherlands
Venue:
Proceedings of the 6th International Joint Conference on Autonomous Agents and Multiagent Systems
Year:
2007

Abstract

Planning in single-agent models such as MDPs and POMDPs can be carried out using Q-value functions: a (near-)optimal Q-value function is computed recursively by dynamic programming, and a policy is then extracted from this value function. In this paper we study whether similar Q-value functions can be defined for decentralized POMDP models (Dec-POMDPs), what the cost of computing such value functions is, and how policies can be extracted from them. Using the framework of Bayesian games, we argue that searching for the optimal Q-value function may be as costly as exhaustive policy search. We then analyze various approximate Q-value functions that allow efficient computation. Finally, we describe a family of algorithms for extracting policies from such Q-value functions.
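As an illustration of the single-agent case the abstract starts from, the following is a minimal sketch of computing a finite-horizon Q-value function by backward dynamic programming and then extracting a greedy policy from it. It covers only a fully observable MDP, not the Dec-POMDP methods of the paper. The function names, the dictionary layout of `transitions` and `rewards`, and the two-state example are assumptions made for this sketch.

```python
def q_value_iteration(transitions, rewards, horizon):
    """Finite-horizon Q-values for a small MDP via backward dynamic programming.

    Assumed (hypothetical) layout:
      transitions[s][a] -> list of (next_state, probability) pairs
      rewards[s][a]     -> immediate reward for taking a in s
    Returns the Q-values with `horizon` steps to go (horizon >= 1).
    """
    states = list(transitions)
    value = {s: 0.0 for s in states}  # value with zero steps to go
    q = {}
    for _ in range(horizon):
        # Recursive backup: Q(s, a) = R(s, a) + sum_s' P(s' | s, a) * V(s')
        q = {
            s: {
                a: rewards[s][a]
                + sum(p * value[s2] for s2, p in transitions[s][a])
                for a in transitions[s]
            }
            for s in states
        }
        # V(s) = max_a Q(s, a) feeds the next backup
        value = {s: max(q[s].values()) for s in states}
    return q


def greedy_policy(q):
    """Extract a policy by picking the maximizing action in every state."""
    return {s: max(q[s], key=q[s].get) for s in q}


# Hypothetical two-state, two-action MDP with deterministic transitions.
TRANSITIONS = {
    "s0": {"stay": [("s0", 1.0)], "go": [("s1", 1.0)]},
    "s1": {"stay": [("s1", 1.0)], "go": [("s0", 1.0)]},
}
REWARDS = {
    "s0": {"stay": 0.0, "go": 1.0},
    "s1": {"stay": 2.0, "go": 0.0},
}

Q = q_value_iteration(TRANSITIONS, REWARDS, horizon=3)
POLICY = greedy_policy(Q)
print(POLICY)
```

Note that the extracted policy applies to the first stage only, since finite-horizon policies generally depend on the number of steps remaining. In a Dec-POMDP each agent must choose actions from its own observation history, so this per-state maximization has no direct counterpart there.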