Online planning for multi-agent systems with bounded communication

Authors:
Feng Wu;Shlomo Zilberstein;Xiaoping Chen
Affiliations:
School of Computer Science, University of Science and Technology of China, Jinzhai Road 96, Hefei, Anhui 230026, China and Department of Computer Science, University of Massachusetts at Amherst, 1 ...;Department of Computer Science, University of Massachusetts at Amherst, 140 Governors Drive, Amherst, MA 01003, USA;School of Computer Science, University of Science and Technology of China, Jinzhai Road 96, Hefei, Anhui 230026, China
Venue:
Artificial Intelligence
Year:
2011

Abstract

We propose an online algorithm for planning under uncertainty in multi-agent settings modeled as decentralized partially observable Markov decision processes (DEC-POMDPs). The algorithm helps overcome the high computational complexity of solving such problems offline. The key challenges in decentralized operation are to maintain coordinated behavior with little or no communication and, when communication is allowed, to optimize value with minimal communication. The algorithm addresses these challenges by generating identical conditional plans based on common knowledge and communicating only when history inconsistency is detected, allowing communication to be postponed when necessary. To be suitable for online operation, the algorithm computes good local policies using a new and fast local search method implemented using linear programming. Moreover, it bounds the amount of memory used at each step and can be applied to problems with arbitrary horizons. The experimental results confirm that the algorithm can solve problems that are too large for the best existing offline planning algorithms and that it outperforms the best online method, producing much higher value with much less communication in most cases. The algorithm also proves to be effective when the communication channel is imperfect (periodically unavailable). These results contribute to the scalability of decision-theoretic planning in multi-agent settings.
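
The communication rule described in the abstract (communicate only when a history inconsistency is detected, and postpone when the channel is unavailable) can be illustrated with a small sketch. This is not the authors' implementation: the names `History`, `bound_pool`, `needs_sync`, and `decide` are hypothetical, and the truncation in `bound_pool` is only a stand-in for whatever rule the algorithm actually uses to bound memory. The sketch assumes every agent holds the same bounded pool of candidate local histories as common knowledge.

```python
from typing import FrozenSet, Iterable, Tuple

# Hypothetical type: a local history is a tuple of observation labels.
History = Tuple[str, ...]


def bound_pool(candidates: Iterable[History], max_size: int) -> FrozenSet[History]:
    """Keep at most max_size histories.

    Assumption: sorting and truncating is a placeholder for a real
    merging rule; it only guarantees that every agent, given the same
    candidates, ends up with the same bounded pool.
    """
    return frozenset(sorted(candidates)[:max_size])


def needs_sync(local: History, pool: FrozenSet[History]) -> bool:
    """An agent's true history missing from the shared pool signals inconsistency."""
    return local not in pool


def decide(local: History, pool: FrozenSet[History], channel_up: bool) -> str:
    """Return 'act', 'communicate', or 'postpone' for the current step."""
    if not needs_sync(local, pool):
        return "act"  # history is consistent with common knowledge
    # Inconsistency detected: talk now if possible, otherwise defer.
    return "communicate" if channel_up else "postpone"


pool = bound_pool([("a", "b"), ("a", "c"), ("b", "b")], max_size=2)
print(decide(("a", "b"), pool, channel_up=True))
print(decide(("b", "b"), pool, channel_up=False))
```

Because each agent derives the pool deterministically from shared information, agents whose histories remain in the pool can act without exchanging messages; only an agent whose actual history was dropped has a reason to communicate.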