Adaptive policy gradient in multiagent learning
AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
In this work we examine recent results on policy gradient learning in general-sum games, in the form of two algorithms, IGA and WoLF-IGA. We address the drawbacks in the convergence properties of these algorithms and propose a more accurate version of WoLF-IGA that is guaranteed to converge to Nash equilibrium policies in self-play (or against an IGA learner). We also present a control-theoretic interpretation of the variable learning rate, which not only justifies WoLF-IGA but also shows that it achieves the fastest convergence under some constraints. Finally, we derive optimal learning rates for fastest convergence in practical simulations.
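To make the contrast between a fixed and a variable learning rate concrete, the following is a minimal sketch of gradient ascent in a two-player, two-action game. It is not the refined algorithm proposed in this work. It assumes the standard IGA update (each player follows the gradient of its own expected payoff, with the policy clipped to [0, 1]) and the original "win or learn fast" rule, in which a player uses the smaller rate when its current policy does better than its equilibrium policy would against the opponent's current policy. The game (matching pennies), the starting point, the step count, the rates `lr_min` and `lr_max`, and all function names are illustrative assumptions, and the sketch assumes the equilibrium policy is known to each player.

```python
import math

# Matching pennies (assumed example game): R is the row player's payoff, C the column player's.
R = [[1.0, -1.0], [-1.0, 1.0]]
C = [[-1.0, 1.0], [1.0, -1.0]]


def value(payoff, a, b):
    """Expected payoff when row plays action 0 w.p. a and column plays action 0 w.p. b."""
    return (payoff[0][0] * a * b + payoff[0][1] * a * (1 - b)
            + payoff[1][0] * (1 - a) * b + payoff[1][1] * (1 - a) * (1 - b))


def gradients(R, C, a, b):
    """Partial derivatives of each player's expected payoff w.r.t. its own policy."""
    u_r = R[0][0] + R[1][1] - R[0][1] - R[1][0]
    u_c = C[0][0] + C[1][1] - C[0][1] - C[1][0]
    dV_da = b * u_r - (R[1][1] - R[0][1])
    dV_db = a * u_c - (C[1][1] - C[1][0])
    return dV_da, dV_db


def simulate(R, C, a, b, eq, steps, lr_min, lr_max, wolf):
    """Run projected gradient ascent; wolf=False uses the fixed rate lr_max for both players."""
    a_eq, b_eq = eq
    for _ in range(steps):
        g_a, g_b = gradients(R, C, a, b)
        if wolf:
            # "Winning" = doing better than the equilibrium policy would right now.
            lr_a = lr_min if value(R, a, b) > value(R, a_eq, b) else lr_max
            lr_b = lr_min if value(C, a, b) > value(C, a, b_eq) else lr_max
        else:
            lr_a = lr_b = lr_max
        # Simultaneous update, clipped so each policy stays a valid probability.
        a = min(1.0, max(0.0, a + lr_a * g_a))
        b = min(1.0, max(0.0, b + lr_b * g_b))
    return a, b


def distance_to_eq(point, eq):
    """Euclidean distance between a joint policy and the equilibrium."""
    return math.hypot(point[0] - eq[0], point[1] - eq[1])


if __name__ == "__main__":
    eq = (0.5, 0.5)  # mixed Nash equilibrium of matching pennies
    for wolf in (False, True):
        end = simulate(R, C, 0.8, 0.3, eq, 5000, 0.001, 0.004, wolf)
        print("WoLF" if wolf else "fixed", end, distance_to_eq(end, eq))
```

With these assumed settings, the fixed-rate run keeps circling the equilibrium, while the variable-rate run should end much closer to it. This is only an illustration of the qualitative behavior the abstract refers to, not a reproduction of the paper's simulations or of its derived optimal rates.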