Learning in multi-agent systems (MAS) is a complex task. Learning theory for single-agent systems does not carry over to the multi-agent setting: in a MAS, the reinforcement an agent receives may depend on the actions taken by the other agents, so the Markov property no longer holds and single-agent convergence guarantees are lost. There is currently no general formal theory describing the conditions under which multi-agent learning (MAL) algorithms succeed. It is therefore important to fully understand the dynamics of multi-agent reinforcement learning, and to be able to analyze learning behavior in terms of the stability and resilience of equilibria. Recent work has used the replicator dynamics of evolutionary game theory for this purpose, and in this paper we contribute to that framework. More precisely, we formally derive the evolutionary dynamics of the polynomial weights regret-minimization learning algorithm as a system of differential equations. These equations make it easy to investigate parameter settings and to analyze the dynamics of multiple concurrently learning agents using regret minimization: they reveal why certain attractors are stable and potentially preferred over others, and what the basins of attraction look like. Furthermore, we show experimentally that the derived dynamics predict the actual learning behavior, and we also test the dynamics in non-self-play, comparing the polynomial weights algorithm against the previously derived dynamics of Q-learning and several Linear Reward algorithms on a set of benchmark normal-form games.
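To make the setting concrete, the following is a minimal illustrative sketch (not the paper's derivation): two agents using the standard polynomial weights update, w_i ← w_i(1 − η·loss_i), in self-play on a hypothetical 2×2 coordination game, alongside an Euler integration of the standard two-population replicator dynamics from the same starting point. The payoff matrix, learning rate η, step size, and iteration counts are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 2x2 coordination game; payoffs assumed to lie in [0, 1]
# so that loss = 1 - expected payoff.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])

def poly_weights_step(w, losses, eta=0.1):
    """One polynomial weights update: w_i <- w_i * (1 - eta * loss_i),
    followed by normalization to a mixed strategy."""
    w = w * (1.0 - eta * losses)
    return w / w.sum()

# --- Self-play: both agents run polynomial weights -----------------------
# Start slightly off the symmetric fixed point so play can break symmetry.
w1 = np.array([0.6, 0.4])
w2 = np.array([0.6, 0.4])
for _ in range(200):
    l1 = 1.0 - A.dot(w2)      # agent 1's expected losses against agent 2
    l2 = 1.0 - A.T.dot(w1)    # agent 2's expected losses against agent 1
    w1 = poly_weights_step(w1, l1)
    w2 = poly_weights_step(w2, l2)

# --- Replicator dynamics from the same start (Euler integration) ---------
# x_i' = x_i * ((A y)_i - x . A y), and symmetrically for y.
x = np.array([0.6, 0.4])
y = np.array([0.6, 0.4])
dt = 0.05
for _ in range(2000):
    fx, fy = A.dot(y), A.T.dot(x)
    x = x + dt * x * (fx - x.dot(fx))
    y = y + dt * y * (fy - y.dot(fy))

print("polynomial weights:", np.round(w1, 3), np.round(w2, 3))
print("replicator:        ", np.round(x, 3), np.round(y, 3))
```

Both trajectories drift toward the pure equilibrium favored by the initial condition, which is the kind of qualitative agreement between learner and dynamical model that the paper establishes formally for the derived equations.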