A selection-mutation model for q-learning in multi-agent systems

Authors:
Karl Tuyls;Katja Verbeeck;Tom Lenaerts
Affiliations:
Computational Modeling Lab, Brussels, Belgium;Computational Modeling Lab, Brussels, Belgium;Computational Modeling Lab, Brussels, Belgium
Venue:
AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
Year:
2003

Citing 2

Technical Note: Q-Learning

Machine Learning
Introduction to Reinforcement Learning

Introduction to Reinforcement Learning

Cited 20

Predicting agent strategy mix of evolving populations

Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
Cooperative Multi-Agent Learning: The State of the Art

Autonomous Agents and Multi-Agent Systems
An Evolutionary Dynamical Analysis of Multi-Agent Learning in Iterated Games

Autonomous Agents and Multi-Agent Systems
What evolutionary game theory tells us about multiagent learning

Artificial Intelligence
Theoretical advantages of lenient Q-learners: an evolutionary game theoretic perspective

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Theoretical Advantages of Lenient Learners: An Evolutionary Game Theoretic Perspective

The Journal of Machine Learning Research
Switching dynamics of multi-agent learning

Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Formalizing Multi-state Learning Dynamics

WI-IAT '08 Proceedings of the 2008 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
Dynamic analysis of multiagent Q-learning with ε-greedy exploration

ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
State-coupled replicator dynamics

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Modelling the dynamics of multiagent Q-learning with ε-greedy exploration

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 2
Globally Optimal Multi-agent Reinforcement Learning Parameters in Distributed Task Assignment

WI-IAT '09 Proceedings of the 2009 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology - Volume 02
On a dynamical analysis of reinforcement learning in games: emergence of Occam's Razor

CEEMAS'03 Proceedings of the 3rd Central and Eastern European conference on Multi-agent systems
Frequency adjusted multi-agent Q-learning

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Evolutionary dynamics of regret minimization

ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part II
Evaluating Q-learning policies for multi-objective foraging task in a multi-agent environment

ICIRA'10 Proceedings of the Third international conference on Intelligent robotics and applications - Volume Part II
Empirical and theoretical support for lenient learning

The 10th International Conference on Autonomous Agents and Multiagent Systems - Volume 3
An overview of cooperative and competitive multiagent learning

LAMAS'05 Proceedings of the First international conference on Learning and Adaption in Multi-Agent Systems
Multi-agent relational reinforcement learning

LAMAS'05 Proceedings of the First international conference on Learning and Adaption in Multi-Agent Systems
Continuous strategy replicator dynamics for multi-agent Q-learning

Autonomous Agents and Multi-Agent Systems

Abstract

Although well understood in the single-agent framework, the use of traditional reinforcement learning (RL) algorithms in multi-agent systems (MAS) is not always justified. The feedback an agent experiences in a MAS is usually influenced by the other agents present in the system. Multi-agent environments are therefore non-stationary, and the convergence and optimality guarantees of RL algorithms are lost. To better understand the dynamics of traditional RL algorithms, we analyze the learning process in terms of evolutionary dynamics. More specifically, we show how the Replicator Dynamics (RD) can be used as a model for Q-learning in games. The dynamical equations of Q-learning are derived and illustrated by some well-chosen experiments. Both reveal an interesting connection between the exploitation-exploration scheme from RL and the selection-mutation mechanisms from evolutionary game theory.
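
The abstract does not reproduce the derived equations, so the following is only a minimal sketch of what a selection-mutation form of the replicator dynamics could look like for a two-player matrix game. It assumes a standard replicator (selection) term scaled by a learning rate over a Boltzmann temperature, plus an entropy-like (mutation) term that pulls the strategy toward the uniform mix. The names `replicator_step` and `simulate`, the parameters `alpha`, `tau`, and `dt`, and the Prisoner's Dilemma payoffs are all illustrative assumptions, not taken from the paper.

```python
import math


def replicator_step(x, y, A, alpha=0.1, tau=0.1, dt=0.01):
    """One Euler step of an assumed selection-mutation dynamic.

    x: this player's mixed strategy, y: the opponent's mixed strategy,
    A[i][j]: this player's payoff for action i against opponent action j.
    alpha (learning rate) and tau (temperature) are hypothetical parameters.
    """
    n = len(x)
    # Expected payoff of each pure action against the opponent's mix.
    payoffs = [sum(A[i][j] * y[j] for j in range(len(y))) for i in range(n)]
    average = sum(x[i] * payoffs[i] for i in range(n))
    new_x = []
    for i in range(n):
        # Selection: actions that beat the average payoff grow (exploitation).
        selection = (alpha / tau) * x[i] * (payoffs[i] - average)
        # Mutation: entropy-like term favouring rarely played actions (exploration).
        mutation = alpha * x[i] * sum(
            x[j] * math.log(x[j] / x[i])
            for j in range(n)
            if x[j] > 0 and x[i] > 0
        )
        new_x.append(x[i] + dt * (selection + mutation))
    # Both terms sum to zero over actions; renormalise to absorb rounding drift.
    total = sum(new_x)
    return [v / total for v in new_x]


def simulate(A, B, x, y, steps=5000, **params):
    """Update both players simultaneously for a fixed number of steps."""
    for _ in range(steps):
        x, y = replicator_step(x, y, A, **params), replicator_step(y, x, B, **params)
    return x, y


# Illustrative Prisoner's Dilemma payoffs (action 0 = cooperate, 1 = defect).
PD = [[3, 0], [5, 1]]
x_final, y_final = simulate(PD, PD, [0.5, 0.5], [0.5, 0.5])
print(x_final, y_final)
```

In this sketch a low temperature makes the selection term dominate, so both players drift toward the dominant action, while a high temperature lets the mutation term keep the strategies close to uniform. This mirrors the exploitation-exploration trade-off the abstract mentions.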