Learning in multi-agent systems (MAS) is a complex task. Learning theory for single-agent systems does not carry over to the multi-agent setting: in a MAS, the reinforcement an agent receives may depend on the actions taken by the other agents, so the Markov property no longer holds and single-agent convergence guarantees are lost. There is currently no general formal theory describing the conditions under which multi-agent learning (MAL) algorithms succeed. It is therefore important to fully understand the dynamics of multi-agent reinforcement learning, and to be able to analyze learning behavior in terms of the stability and resilience of equilibria. Recent work has used the replicator dynamics of evolutionary game theory for this purpose, and in this paper we contribute to that framework. More precisely, we formally derive the evolutionary dynamics of the polynomial weights regret-minimization learning algorithm as a system of differential equations. These equations make it easy to investigate parameter settings and to analyze the dynamics of multiple concurrently learning agents using regret minimization: they reveal why certain attractors are stable and potentially preferred over others, and what the basins of attraction look like. Furthermore, we show experimentally that the derived dynamics predict the actual learning behavior, and we also test the dynamics in non-self-play, comparing the polynomial weights algorithm against the previously derived dynamics of Q-learning and several Linear Reward algorithms on a set of benchmark normal-form games.
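To make the setting concrete, the following is a minimal illustrative sketch (not the paper's derivation): two agents using the standard polynomial weights update, w_i ← w_i(1 − η·loss_i), in self-play on a hypothetical 2×2 coordination game, alongside an Euler integration of the standard two-population replicator dynamics from the same starting point. The payoff matrix, learning rate η, step size, and iteration counts are all assumptions chosen for illustration.

```python
import numpy as np

# Hypothetical 2x2 coordination game; payoffs assumed to lie in [0, 1]
# so that loss = 1 - expected payoff.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])

def poly_weights_step(w, losses, eta=0.1):
    """One polynomial weights update: w_i <- w_i * (1 - eta * loss_i),
    followed by normalization to a mixed strategy."""
    w = w * (1.0 - eta * losses)
    return w / w.sum()

# --- Self-play: both agents run polynomial weights -----------------------
# Start slightly off the symmetric fixed point so play can break symmetry.
w1 = np.array([0.6, 0.4])
w2 = np.array([0.6, 0.4])
for _ in range(200):
    l1 = 1.0 - A.dot(w2)      # agent 1's expected losses against agent 2
    l2 = 1.0 - A.T.dot(w1)    # agent 2's expected losses against agent 1
    w1 = poly_weights_step(w1, l1)
    w2 = poly_weights_step(w2, l2)

# --- Replicator dynamics from the same start (Euler integration) ---------
# x_i' = x_i * ((A y)_i - x . A y), and symmetrically for y.
x = np.array([0.6, 0.4])
y = np.array([0.6, 0.4])
dt = 0.05
for _ in range(2000):
    fx, fy = A.dot(y), A.T.dot(x)
    x = x + dt * x * (fx - x.dot(fx))
    y = y + dt * y * (fy - y.dot(fy))

print("polynomial weights:", np.round(w1, 3), np.round(w2, 3))
print("replicator:        ", np.round(x, 3), np.round(y, 3))
```

Both trajectories drift toward the pure equilibrium favored by the initial condition, which is the kind of qualitative agreement between learner and dynamical model that the paper establishes formally for the derived equations.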