Reinforcement learning approaches to coordination in cooperative multi-agent systems

Authors:
Spiros Kapetanakis;Daniel Kudenko;Malcolm J. A. Strens
Affiliations:
Department of Computer Science, University of York, York, UK;Department of Computer Science, University of York, York, UK;Guidance and Imaging Solutions, QinetiQ, Hampshire, UK
Venue:
Adaptive agents and multi-agent systems
Year:
2003

Citing 7
Cited 0

Learning to coordinate without sharing information
AAAI '94 Proceedings of the twelfth national conference on Artificial intelligence (vol. 1)

The dynamics of reinforcement learning in cooperative multiagent systems
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence

Convergence Results for Single-Step On-Policy Reinforcement-Learning Algorithms
Machine Learning

An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning

Reinforcement learning: a survey
Journal of Artificial Intelligence Research

Learning to coordinate actions in multi-agent systems
IJCAI'93 Proceedings of the 13th international joint conference on Artificial intelligence - Volume 1

Sequential optimality and coordination in multiagent systems
IJCAI'99 Proceedings of the 16th international joint conference on Artificial intelligence - Volume 1

Abstract

We report on an investigation of reinforcement learning techniques for learning coordination in cooperative multi-agent systems. Specifically, we focus on two novel approaches: one is based on a new action selection strategy for Q-learning [10], and the other is based on model estimation with a shared action-selection protocol. The new techniques are applicable to scenarios where mutual observation of actions is not possible. To date, reinforcement learning approaches for such independent agents have not guaranteed convergence to the optimal joint action in scenarios with high miscoordination costs. We improve on previous results [2] by demonstrating empirically that our extension causes the agents to converge almost always to the optimal joint action even in these difficult cases.
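
The setting the abstract describes, independent learners that cannot observe each other's actions in a game with high miscoordination costs, can be illustrated with a minimal baseline sketch. It is not the action selection strategy or the model-estimation method proposed in the paper, neither of which is specified in the abstract. It assumes plain stateless Q-learning with Boltzmann action selection; the payoff matrix `PAYOFF`, the function names, and all parameter values are hypothetical and chosen only for illustration.

```python
import math
import random

# Hypothetical cooperative payoff matrix (an assumption, not from the paper).
# Both agents receive PAYOFF[a1][a2]; the optimal joint actions (0, 0) and
# (2, 2) sit next to heavily penalised miscoordinations.
PAYOFF = [
    [10, 0, -50],
    [0, 2, 0],
    [-50, 0, 10],
]


def boltzmann_probs(q_values, temperature):
    """Return Boltzmann (softmax) action probabilities for one agent."""
    top = max(q_values)  # subtract the maximum for numerical stability
    weights = [math.exp((q - top) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]


def select_action(q_values, temperature, rng):
    """Sample one action index from the Boltzmann distribution."""
    probs = boltzmann_probs(q_values, temperature)
    return rng.choices(range(len(q_values)), weights=probs, k=1)[0]


def run_independent_learners(payoff, episodes=2000, alpha=0.1,
                             t_start=50.0, t_decay=0.995, t_min=0.1, seed=0):
    """Train two independent Q-learners on a single-stage cooperative game.

    Each agent keeps Q-values over its own actions only and sees just the
    shared reward, never the other agent's action.
    """
    rng = random.Random(seed)
    q1 = [0.0] * len(payoff)
    q2 = [0.0] * len(payoff[0])
    temperature = t_start
    for _ in range(episodes):
        a1 = select_action(q1, temperature, rng)
        a2 = select_action(q2, temperature, rng)
        reward = payoff[a1][a2]
        # Stateless Q-learning update: move the estimate towards the reward.
        q1[a1] += alpha * (reward - q1[a1])
        q2[a2] += alpha * (reward - q2[a2])
        # Anneal the exploration temperature towards t_min.
        temperature = max(t_min, temperature * t_decay)
    return q1, q2


def greedy_joint_action(q1, q2):
    """Return the joint action formed by each agent's greedy choice."""
    return q1.index(max(q1)), q2.index(max(q2))
```

Calling `greedy_joint_action(*run_independent_learners(PAYOFF))` shows which joint action the two learners settle on for a given seed. With a payoff structure like this one, a baseline of this kind may drift towards the safe but suboptimal joint action (1, 1), which is the kind of difficulty the abstract refers to.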