This paper analyzes the dynamics of multiple reinforcement learning agents from an evolutionary game-theoretic (EGT) perspective. We provide a replicator dynamics model for traditional multiagent Q-learning, and we extend these differential equations to account for lenient learners: agents that forgive possible mistakes by their teammates that resulted in lower rewards. We use this extended formal model to visualize the basins of attraction of both traditional and lenient multiagent Q-learners in two benchmark coordination problems. The results indicate that lenience gives learners more accurate estimates of the utility of their actions, resulting in a higher likelihood of convergence to the globally optimal solution. In addition, our research supports the strength of EGT as a backbone for multiagent reinforcement learning.
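To make the idea of a basin of attraction concrete, the following is a minimal sketch of the standard textbook two-population replicator dynamics, integrated with Euler steps. It is not the paper's Q-learning model or its lenient extension, neither of which is given in this abstract. The payoff matrix, the step size, and the names `replicator_step` and `simulate` are all assumptions chosen for illustration.

```python
def replicator_step(x, y, A, B, dt=0.01):
    """One Euler step of the standard two-population replicator dynamics.

    x, y: mixed strategies (probability lists) of the row and column agent.
    A, B: payoff matrices of the row and column agent, indexed [row][col].
    """
    n, m = len(x), len(y)
    # Expected payoff of each pure action against the other agent's mix.
    fx = [sum(A[i][j] * y[j] for j in range(m)) for i in range(n)]
    fy = [sum(B[i][j] * x[i] for i in range(n)) for j in range(m)]
    # Average payoff of each agent's current mixed strategy.
    avg_x = sum(x[i] * fx[i] for i in range(n))
    avg_y = sum(y[j] * fy[j] for j in range(m))
    # Actions that beat the average grow; the rest shrink.
    new_x = [x[i] + dt * x[i] * (fx[i] - avg_x) for i in range(n)]
    new_y = [y[j] + dt * y[j] * (fy[j] - avg_y) for j in range(m)]
    return new_x, new_y


def simulate(x, y, A, B, steps=5000, dt=0.01):
    """Iterate replicator_step and return the final strategy pair."""
    for _ in range(steps):
        x, y = replicator_step(x, y, A, B, dt)
    return x, y


# Hypothetical common-payoff coordination game (not one of the paper's
# benchmarks): joint action (0, 0) is globally optimal, (1, 1) is a
# suboptimal equilibrium.
GAME = [[2.0, 0.0],
        [0.0, 1.0]]

if __name__ == "__main__":
    for start in ([0.6, 0.4], [0.1, 0.9]):
        final_x, final_y = simulate(start, start, GAME, GAME)
        print(start, "->", [round(p, 3) for p in final_x])
```

In this assumed game, whether the agents reach the optimal joint action depends on where they start, which is the kind of dependence a basin-of-attraction plot visualizes.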