Not all agents are equal: scaling up distributed POMDPs for agent networks

Authors:
Janusz Marecki;Tapana Gupta;Pradeep Varakantham;Milind Tambe;Makoto Yokoo
Affiliations:
University of Southern California, Los Angeles, CA;University of Southern California, Los Angeles, CA;Carnegie Mellon University, Pittsburgh, PA;University of Southern California, Los Angeles, CA;Kyushu University, Fukuoka, Japan
Venue:
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Year:
2008

Citing 11
Cited 10

Learning to Cooperate via Policy Search

UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
The Complexity of Decentralized Control of Markov Decision Processes

UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
Distributed Sensor Networks: A Multiagent Perspective

Distributed Sensor Networks: A Multiagent Perspective
Approximate Solutions for Partially Observable Stochastic Games with Common Payoffs

AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 1
Letting loose a SPIDER on a network of POMDPs: generating quality guaranteed policies

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Dynamic programming for partially observable stochastic games

AAAI'04 Proceedings of the 19th national conference on Artificial intelligence
Networked distributed POMDPs: a synthesis of distributed constraint optimization and POMDPs

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 1
Solving transition independent decentralized Markov decision processes

Journal of Artificial Intelligence Research
A framework for sequential planning in multi-agent settings

Journal of Artificial Intelligence Research
Taming decentralized POMDPs: towards efficient policy computation for multiagent settings

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Bounded policy iteration for decentralized POMDPs

IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence

Constraint-based dynamic programming for decentralized POMDPs with structured interactions

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Introducing Communication in Dis-POMDPs with Finite State Machines

WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Evaluation of learning algorithms for optimal policy representation in sensor-network based human health monitoring systems

ICICS'09 Proceedings of the 7th international conference on Information, communications and signal processing
Introducing communication in Dis-POMDPs with locality of interaction

Web Intelligence and Agent Systems
Optimizing fixed-size stochastic controllers for POMDPs and decentralized POMDPs

Autonomous Agents and Multi-Agent Systems
Multi-agent role allocation: issues, approaches, and multiple perspectives

Autonomous Agents and Multi-Agent Systems
Social Model Shaping for Solving Generic DEC-POMDPs

WI-IAT '11 Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02
Scalable multiagent planning using probabilistic inference

IJCAI'11 Proceedings of the Twenty-Second international joint conference on Artificial Intelligence - Volume Three
Approximate solutions for factored Dec-POMDPs with many agents

Proceedings of the 2013 international conference on Autonomous agents and multi-agent systems
Incremental clustering and expansion for faster optimal planning in decentralized POMDPs

Journal of Artificial Intelligence Research

Abstract

Many applications of agent networks, including mobile sensor networks, unmanned air vehicles, and autonomous underwater vehicles, involve hundreds of agents acting collaboratively under uncertainty. Distributed Partially Observable Markov Decision Problems (Distributed POMDPs) are well suited to such applications, but so far only limited scale-ups of up to five agents have been demonstrated. This paper pushes the scale-up further, presenting an algorithm called FANS that increases the number of agents in distributed POMDPs into double digits for the first time. FANS is founded on finite state machines (FSMs) for policy representation and exploits these FSMs to provide three key contributions: (i) not all agents within an agent network need the same expressivity of policy representation, so FANS introduces novel heuristics that automatically vary the FSM size across agents for scale-up; (ii) FANS illustrates efficient integration of its FSM-based policy search within algorithms that exploit agent network structure; (iii) FANS provides significant speedups in policy evaluation and heuristic computations within the network algorithms by exploiting the FSMs for dynamic programming. Experimental results show not only orders-of-magnitude improvements over the previous best known algorithms on smaller-scale domains (with similar solution quality), but also a scale-up into double digits in the number of agents.
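As a rough illustration of what evaluating an FSM policy by dynamic programming can look like, the sketch below computes the finite-horizon value of a deterministic finite state controller on a toy single-agent POMDP. It is not the FANS algorithm: the function name `evaluate_fsm`, the table layouts, and the two-state toy model are all assumptions made for this example, and the multi-agent network structure from the abstract is left out entirely.

```python
from itertools import product


def evaluate_fsm(actions, next_node, T, O, R, horizon):
    """Finite-horizon value V[q][s] of a deterministic FSM policy.

    All argument layouts are assumptions made for this sketch:
    actions[q]      -> action taken in controller node q
    next_node[q][o] -> controller node reached after observing o in node q
    T[s][a][s2]     -> probability of moving from state s to s2 under action a
    O[s2][a][o]     -> probability of observing o after landing in s2
    R[s][a]         -> immediate reward
    """
    n_nodes, n_states = len(actions), len(T)
    n_obs = len(next_node[0])
    V = [[0.0] * n_states for _ in range(n_nodes)]
    for _ in range(horizon):
        new_V = [[0.0] * n_states for _ in range(n_nodes)]
        for q, s in product(range(n_nodes), range(n_states)):
            a = actions[q]
            # Expected value of the remaining steps, reusing the table
            # computed for the shorter horizon (the dynamic-programming step).
            future = sum(
                T[s][a][s2] * O[s2][a][o] * V[next_node[q][o]][s2]
                for s2 in range(n_states)
                for o in range(n_obs)
            )
            new_V[q][s] = R[s][a] + future
        V = new_V
    return V


# Toy model (hypothetical): two states, action 0 = stay, action 1 = switch,
# observations reveal the state exactly, reward 1 for each step in state 1.
T = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.0, 1.0], [1.0, 0.0]]]
O = [[[1.0, 0.0], [1.0, 0.0]],
     [[0.0, 1.0], [0.0, 1.0]]]
R = [[0.0, 0.0], [1.0, 1.0]]

# Two-node controller: node 0 switches, node 1 stays; the observation
# selects the next node.
actions = [1, 0]
next_node = [[0, 1], [0, 1]]

values = evaluate_fsm(actions, next_node, T, O, R, horizon=3)
print(values)
```

Each pass costs time proportional to the number of controller nodes times the number of states, observations, and successor states, so a smaller FSM is cheaper to evaluate. That is one plausible reading of why varying the FSM size across agents could help with scale-up, though the abstract does not spell out the mechanism.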