Decentralized Bayesian reinforcement learning for online agent collaboration

Authors:
W. T. L. Teacy;G. Chalkiadakis;A. Farinelli;A. Rogers;N. R. Jennings;S. McClean;G. Parr
Affiliations:
University of Southampton, UK;Technical University of Crete, Greece;University of Verona, Italy;Technical University of Crete, Greece;Technical University of Crete, Greece;University of Ulster, UK;University of Ulster, UK
Venue:
Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Year:
2012

Citing 14
Cited 1

Bayesian Q-learning

AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning
Sequential Optimality and Coordination in Multiagent Systems

IJCAI '99 Proceedings of the Sixteenth International Joint Conference on Artificial Intelligence
Coordinated Reinforcement Learning

ICML '02 Proceedings of the Nineteenth International Conference on Machine Learning
Collaborative Multiagent Reinforcement Learning by Payoff Propagation

The Journal of Machine Learning Research
Decentralised coordination of low-power embedded devices using the max-sum algorithm

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 2
Networked distributed POMDPs: a synthesis of distributed constraint optimization and POMDPs

AAAI'05 Proceedings of the 20th national conference on Artificial intelligence - Volume 1
Max-norm projections for factored MDPs

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Model based Bayesian exploration

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
The complexity of decentralized control of Markov decision processes

UAI'00 Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence
Sequentially optimal repeated coalition formation under uncertainty

Autonomous Agents and Multi-Agent Systems
The generalized distributive law

IEEE Transactions on Information Theory
Factor graphs and the sum-product algorithm

IEEE Transactions on Information Theory
On the optimality of solutions of the max-product belief-propagation algorithm in arbitrary graphs

IEEE Transactions on Information Theory

Agent-based decentralised coordination for sensor networks using the max-sum algorithm

Autonomous Agents and Multi-Agent Systems

Quantified Score

Hi-index	0.00

Visualization

Abstract

Solving complex but structured problems in a decentralized manner via multiagent collaboration has received much attention in recent years. This is natural, as on one hand, multiagent systems usually possess a structure that determines the allowable interactions among the agents; and on the other hand, the single most pressing need in a cooperative multiagent system is to coordinate the local policies of autonomous agents with restricted capabilities to serve a system-wide goal. The presence of uncertainty makes this even more challenging, as the agents face the additional need to learn the unknown environment parameters while forming (and following) local policies in an online fashion. In this paper, we provide the first Bayesian reinforcement learning (BRL) approach for distributed coordination and learning in a cooperative multiagent system by devising two solutions to this type of problem. More specifically, we show how the Value of Perfect Information (VPI) can be used to perform efficient decentralised exploration in both model-based and model-free BRL, and in the latter case, provide a closed form solution for VPI, correcting a decade old result by Dearden, Friedman and Russell. To evaluate these solutions, we present experimental results comparing their relative merits, and demonstrate empirically that both solutions outperform an existing multiagent learning method, representative of the state-of-the-art.