Note on learning rate schedules for stochastic optimization
Advances in Neural Information Processing Systems 3 (NIPS 1990)
Technical Note: Q-Learning
Machine Learning
The dynamics of reinforcement learning in cooperative multiagent systems
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Multiagent learning using a variable learning rate
Artificial Intelligence
Introduction to Reinforcement Learning
Nash Convergence of Gradient Dynamics in General-Sum Games
UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
Nash Q-Learning for General-Sum Stochastic Games
The Journal of Machine Learning Research
Run the GAMUT: A Comprehensive Approach to Evaluating Game-Theoretic Algorithms
AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 2
Multiagent learning in the presence of agents with limitations
Learning to compete, compromise, and cooperate in repeated general-sum games
ICML '05 Proceedings of the 22nd international conference on Machine learning
If multi-agent learning is the answer, what is the question?
Artificial Intelligence
Generalized multiagent learning with performance bound
Autonomous Agents and Multi-Agent Systems
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Learning against opponents with bounded memory
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Adaptive learning algorithms (ALAs) are an important class of agents that learn the utilities of their strategies while maintaining beliefs about their counterparts' future actions. In this paper, we propose an approach to learning in the presence of adaptive counterparts. Our Q-learning-based algorithm, called Adaptive Dynamics Learner (ADL), assigns Q-values to fixed-length interaction histories, which makes it capable of exploiting the strategy-update dynamics of adaptive learners. By doing so, ADL usually obtains higher utilities than equilibrium solutions yield. We tested our algorithm on a substantial set of well-known, representative matrix games and observed that ADL is highly effective against such ALAs as Adaptive Play Q-learning, Infinitesimal Gradient Ascent, Policy Hill-Climbing, and Fictitious Play Q-learning. Furthermore, in self-play ADL usually converges to a Pareto-efficient average utility.
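The core idea of the abstract — ordinary Q-learning where the "state" is the last few joint actions, so the learner can exploit a counterpart whose strategy depends on recent play — can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name, parameters, and the opponent interface are all assumptions made for the example.

```python
import random
from collections import defaultdict

def adl_sketch(payoff, opponent, history_len=2, episodes=5000,
               alpha=0.1, gamma=0.9, epsilon=0.1, n_actions=2, seed=0):
    """Illustrative history-based Q-learning for a repeated matrix game.

    The state is the tuple of the last `history_len` joint actions, so the
    Q-function can capture how an adaptive opponent reacts to recent play.
    `payoff[a][b]` is the learner's reward when it plays a and the
    opponent plays b; `opponent(history)` returns the opponent's action.
    (All names here are hypothetical, chosen for this sketch.)
    """
    rng = random.Random(seed)
    q = defaultdict(float)            # Q[(history, action)] -> value
    history = tuple()                 # most recent joint actions, newest last
    for _ in range(episodes):
        # Epsilon-greedy action selection over the current history state.
        if rng.random() < epsilon:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda x: q[(history, x)])
        b = opponent(history)         # adaptive counterpart's move
        r = payoff[a][b]
        # Slide the history window and do a standard Q-learning backup.
        nxt = (history + ((a, b),))[-history_len:]
        best_next = max(q[(nxt, x)] for x in range(n_actions))
        q[(history, a)] += alpha * (r + gamma * best_next - q[(history, a)])
        history = nxt
    return q, history
```

For instance, against a (hypothetical) opponent that simply copies the learner's previous action in a matching game, the history state lets the learner discover that repeating its own action guarantees a match every round, something a history-free Q-learner cannot represent.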