Taming decentralized POMDPs: towards efficient policy computation for multiagent settings

Authors:
R. Nair;M. Tambe;M. Yokoo;D. Pynadath;S. Marsella
Affiliations:
Computer Science Dept., University of Southern California, Los Angeles CA;Computer Science Dept., University of Southern California, Los Angeles CA;Coop. Computing Research Grp., NTT Comm. Sc. Labs, Kyoto, Japan;Information Sciences Institute, University of Southern California, Marina del Rey, CA;Information Sciences Institute, University of Southern California, Marina del Rey, CA
Venue:
IJCAI'03: Proceedings of the 18th International Joint Conference on Artificial Intelligence
Year:
2003

Abstract

The problem of deriving joint policies for a group of agents that maximize some joint reward function can be modeled as a decentralized partially observable Markov decision process (POMDP). Yet, despite the growing importance and applications of decentralized POMDP models in the multiagent arena, few algorithms have been developed for efficiently deriving joint policies for these models. This paper presents a new class of locally optimal algorithms called "Joint Equilibrium-based Search for Policies" (JESP). We first describe an exhaustive version of JESP and then a novel dynamic programming approach to JESP. Our complexity analysis reveals the potential for exponential speedups due to the dynamic programming approach. These theoretical results are verified via empirical comparisons of the two JESP versions with each other and with a globally optimal brute-force search algorithm. Finally, we prove piecewise linearity and convexity (PWLC) properties, thus taking steps towards developing algorithms for continuous belief states.
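
The idea of an equilibrium-based, locally optimal search can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes that each agent has a small finite set of candidate policies and that a joint policy can be scored by a given function. The names `jesp_exhaustive`, `policy_sets`, `joint_value` and the toy `payoff` table are hypothetical and chosen only for illustration. One agent's policy is varied at a time while the others stay fixed, and the search stops when no single agent can improve the joint value.

```python
from typing import Callable, Hashable, List, Sequence, Tuple

Policy = Hashable
Joint = Tuple[Policy, ...]


def jesp_exhaustive(
    policy_sets: Sequence[Sequence[Policy]],
    joint_value: Callable[[Joint], float],
    initial: Joint,
) -> Tuple[Joint, float]:
    """Alternating best-response search over finite policy sets.

    Assumption: joint_value returns the expected joint reward of a
    joint policy. In a real decentralized POMDP this evaluation is the
    expensive part; here it is abstracted away.
    """
    joint: List[Policy] = list(initial)
    best = joint_value(tuple(joint))
    improved = True
    while improved:
        improved = False
        # Free one agent at a time; all other agents' policies stay fixed.
        for i, candidates in enumerate(policy_sets):
            for cand in candidates:
                trial = joint[:i] + [cand] + joint[i + 1:]
                value = joint_value(tuple(trial))
                if value > best:  # strict improvement guarantees termination
                    joint, best, improved = trial, value, True
    return tuple(joint), best


# Toy two-agent example (hypothetical numbers): each "policy" is one action.
payoff = {("a", "a"): 5.0, ("a", "b"): 0.0, ("b", "a"): 0.0, ("b", "b"): 10.0}
sets = [["a", "b"], ["a", "b"]]

local = jesp_exhaustive(sets, payoff.__getitem__, ("a", "a"))
better = jesp_exhaustive(sets, payoff.__getitem__, ("a", "b"))
```

In this toy problem the search started at `("a", "a")` stays there, because no single agent can improve on it alone, even though `("b", "b")` has a higher joint value. This is the sense in which such a search is locally rather than globally optimal, and why the abstract compares against a brute-force search over all joint policies.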