A convergent multiagent reinforcement learning approach for a subclass of cooperative stochastic games

Authors:
Thomas Kemmerich;Hans Kleine Büning
Affiliations:
International Graduate School Dynamic Intelligent Systems, University of Paderborn, Paderborn, Germany;Department of Computer Science, University of Paderborn, Paderborn, Germany
Venue:
ALA'11 Proceedings of the 11th international conference on Adaptive and Learning Agents
Year:
2011

Citing 11
Cited 2

Technical Note: \cal Q-Learning

Machine Learning
The dynamics of reinforcement learning in cooperative multiagent systems

AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Machine Learning

Machine Learning
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning
The Complexity of Decentralized Control of Markov Decision Processes

Mathematics of Operations Research
An Algorithm for Distributed Reinforcement Learning in Cooperative Multi-Agent Systems

ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Reinforcement learning of coordination in cooperative multi-agent systems

Eighteenth national conference on Artificial intelligence
Cooperative Multi-Agent Learning: The State of the Art

Autonomous Agents and Multi-Agent Systems
An analysis of reinforcement learning with function approximation

Proceedings of the 25th international conference on Machine learning
Reinforcement learning: a survey

Journal of Artificial Intelligence Research
Planning and acting in partially observable stochastic domains

Artificial Intelligence

On the power of global reward signals in reinforcement learning

MATES'11 Proceedings of the 9th German conference on Multiagent system technologies
Coordination in Large Multiagent Reinforcement Learning Problems

WI-IAT '11 Proceedings of the 2011 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Volume 02

Quantified Score

Hi-index	0.00

Visualization

Abstract

We present a distributed Q-Learning approach for independently learning agents in a subclass of cooperative stochastic games called cooperative sequential stage games. In this subclass, several stage games are played one after the other. We also propose a transformation function for that class and prove that transformed and original games have the same set of optimal joint strategies. Under the condition that the played game is obtained through transformation, it will be proven that our approach converges to an optimal joint strategy for the last stage game of the transformed game and thus also for the original game. In addition, the ability to converge to ε-optimal joint strategies for each of the stage games is shown. The environment in our approach does not need to present a state signal to the agents. Instead, by the use of the aforementioned transformation function, the agents gain knowledge about state changes from an engineered reward. This allows agents to omit storing strategies for each single state, but to use only one strategy that is adapted to the currently played stage game. Thus, the algorithm has very low space requirements and its complexity is comparable to single agent Q-Learning. Besides theoretical analyses, we also underline the convergence properties with some experiments.