A scalable, distributed algorithm for efficient task allocation. Proceedings of the First International Joint Conference on Autonomous Agents and Multiagent Systems, Part 3.
The Complexity of Decentralized Control of Markov Decision Processes. Mathematics of Operations Research.
Allocating tasks in extreme teams. Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems.
Exploiting belief bounds: practical POMDPs for personal assistant agents. Proceedings of the Fourth International Joint Conference on Autonomous Agents and Multiagent Systems.
Exploiting locality of interaction in factored Dec-POMDPs. Proceedings of the 7th International Joint Conference on Autonomous Agents and Multiagent Systems, Volume 1.
Networked distributed POMDPs: a synthesis of distributed constraint optimization and POMDPs. Proceedings of the 20th National Conference on Artificial Intelligence (AAAI'05), Volume 1.
Solving transition independent decentralized Markov decision processes. Journal of Artificial Intelligence Research.
Artificial Intelligence, Special Issue: Distributed Constraint Satisfaction.
Social Model Shaping for Solving Generic DEC-POMDPs. Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology (WI-IAT '11), Volume 02.
Heuristic search of multiagent influence space. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, Volume 2.
Prioritized shaping of models for solving DEC-POMDPs. Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems, Volume 3.
Lagrangian Relaxation for Large-Scale Multi-agent Planning. Proceedings of the 2012 IEEE/WIC/ACM International Joint Conferences on Web Intelligence and Intelligent Agent Technology (WI-IAT '12), Volume 02.
Organizational design principles and techniques for decision-theoretic agents. Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems.
Producing efficient error-bounded solutions for transition independent decentralized MDPs. Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems.
Approximate solutions for factored Dec-POMDPs with many agents. Proceedings of the 2013 International Conference on Autonomous Agents and Multi-agent Systems.
Incremental clustering and expansion for faster optimal planning in decentralized POMDPs. Journal of Artificial Intelligence Research.
Automated generation of interaction graphs for value-factored Dec-POMDPs. Proceedings of the Twenty-Third International Joint Conference on Artificial Intelligence (IJCAI'13).
Agent-based decentralised coordination for sensor networks using the max-sum algorithm. Autonomous Agents and Multi-Agent Systems.
The use of distributed POMDPs for cooperative teams has been severely limited by the enormous joint policy space that results from combining the policy spaces of the individual agents. However, much of the computational cost of exploring the entire joint policy space can be avoided by observing that, in many domains, important interactions between agents occur in a relatively small set of scenarios, previously defined as coordination locales (CLs) [11]. Moreover, even when numerous interactions are possible, a given set of individual policies typically produces few actual interactions. Exploiting this observation, and building on an existing model-shaping algorithm, this paper presents D-TREMOR, an algorithm in which cooperative agents iteratively generate individual policies, identify and communicate possible interactions between those policies, shape their models based on this information, and generate new policies. Three properties jointly distinguish D-TREMOR from previous DEC-POMDP work: (1) it is completely distributed; (2) it is scalable, allowing 100 agents to compute a "good" joint policy in under 6 hours; and (3) it has low communication overhead. D-TREMOR complements these traits with the following key contributions, which improve both scalability and solution quality: (a) techniques to ensure convergence; (b) faster approaches to detecting and evaluating CLs; (c) heuristics to capture dependencies between CLs; and (d) novel shaping heuristics to aggregate the effects of CLs. While the resulting policies are not globally optimal, empirical results show that agents obtain policies that effectively manage uncertainty, and that the joint policy is better than policies generated by independent solvers.
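The plan–detect–shape loop described in the abstract can be sketched as follows. This is a minimal illustrative stand-in, not the paper's actual algorithm: the 1-D corridor world, the greedy planner (in place of a POMDP solver), the same-cell-same-time definition of a CL, and the fixed-priority penalty shaping are all assumptions made for the sketch.

```python
import itertools

class Agent:
    """Toy agent that plans a path on a 1-D corridor (stand-in for a POMDP solver)."""
    def __init__(self, name, start, goal):
        self.name, self.start, self.goal = name, start, goal
        self.penalty = {}  # shaping terms: (cell, time) -> cost added to the local model

    def plan(self):
        # Greedy local best response: step toward the goal, but wait out any
        # (cell, time) pair that shaping has made costly.
        path, cell = [], self.start
        for t in itertools.count():
            path.append(cell)
            if cell == self.goal:
                return path
            nxt = cell + (1 if self.goal > cell else -1)
            cell = cell if self.penalty.get((nxt, t + 1), 0) > 0 else nxt

def coordination_locales(paths):
    """A CL in this sketch: two agents occupy the same cell at the same time step."""
    cls_ = set()
    for (_, pa), (_, pb) in itertools.combinations(paths.items(), 2):
        for t, (ca, cb) in enumerate(zip(pa, pb)):
            if ca == cb:
                cls_.add((t, ca))
    return cls_

def d_tremor_like(agents, rounds=5):
    """Iterate: plan independently, exchange detected CLs, shape models, replan."""
    for _ in range(rounds):
        paths = {a.name: a.plan() for a in agents}
        cls_ = coordination_locales(paths)
        if not cls_:
            return paths  # converged: no interactions under the current policies
        # Prioritized shaping (simplified): the lowest-priority agent absorbs
        # a penalty for each CL, so its next plan avoids the interaction.
        for (t, cell) in cls_:
            agents[-1].penalty[(cell, t)] = 1.0
    return {a.name: a.plan() for a in agents}
```

For example, two agents crossing a five-cell corridor collide at the middle cell; after one round of shaping, the lower-priority agent waits one step and the conflict disappears, mirroring how D-TREMOR resolves CLs by reshaping individual models rather than searching the joint policy space.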