Adaptive, Distributed Control of Constrained Multi-Agent Systems
AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 3
Product Distribution (PD) theory is a new framework for controlling Multi-Agent Systems (MASs). First we review one motivation of PD theory: it is the information-theoretic extension of conventional full-rationality game theory to the case of bounded-rational agents. In this extension the equilibrium of the game is the optimizer of a Lagrangian of the probability distribution over the joint state of the agents. Accordingly we can consider a team game with a shared utility that is a performance measure of the behavior of the MAS. For such a scenario the game is at equilibrium, i.e., the Lagrangian is optimized, when the joint distribution of the agents optimizes the system's expected performance. One common way to find that equilibrium is to have each agent run a reinforcement learning algorithm. Here we investigate the alternative of exploiting PD theory to run gradient descent on the Lagrangian. We present computer experiments validating some of the predictions of PD theory for how best to do that gradient descent. We also demonstrate how PD theory can improve performance even when we are not allowed to rerun the MAS from different initial conditions, a requirement implicit in some previous work.
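The idea in the abstract can be sketched numerically. Below is a minimal toy illustration, not the authors' implementation: two agents with independent move distributions (a product distribution) jointly minimize a maxent Lagrangian L(q) = E_q[G] - T·S(q), where G is a shared team cost and S is the sum of per-agent entropies. The cost matrix, agent/move counts, temperature T, and step size eta are all hypothetical choices for the sketch; the exponentiated-gradient update is one simple way to descend the Lagrangian while keeping each agent's distribution on the probability simplex.

```python
import numpy as np

# Hypothetical toy instance: 2 agents, 3 moves each, shared team cost G[x1, x2].
rng = np.random.default_rng(0)
G = rng.uniform(size=(3, 3))
T = 0.2  # temperature: degree of bounded rationality (T -> 0 recovers full rationality)

# Product distribution: each agent holds an independent distribution over its moves.
q = [np.full(3, 1.0 / 3.0), np.full(3, 1.0 / 3.0)]

def lagrangian(q):
    """Maxent Lagrangian L(q) = E_q[G] - T * S(q), with S the sum of agent entropies."""
    joint = np.outer(q[0], q[1])          # independence: p(x1, x2) = q1(x1) * q2(x2)
    expected_cost = np.sum(joint * G)
    entropy = -sum(np.sum(p * np.log(p)) for p in q)
    return expected_cost - T * entropy

def grad(q, i):
    """dL/dq_i(x) = E[G | agent i plays x] + T * (log q_i(x) + 1)."""
    cond = G @ q[1] if i == 0 else G.T @ q[0]  # cost conditioned on agent i's move
    return cond + T * (np.log(q[i]) + 1.0)

eta = 0.5
for _ in range(500):
    for i in range(2):
        # Exponentiated-gradient (mirror-descent) step: multiplicative update
        # followed by renormalization keeps q_i strictly positive on the simplex.
        q[i] = q[i] * np.exp(-eta * grad(q, i))
        q[i] /= q[i].sum()
```

At a fixed point of this update each agent's distribution takes the Boltzmann form q_i(x) proportional to exp(-E[G | x]/T), which is the bounded-rational equilibrium the abstract refers to; running the loop drives the Lagrangian below its value at the uniform initialization.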