Note on learning rate schedules for stochastic optimization
Advances in Neural Information Processing Systems 3 (NIPS 1990)
Technical Note: Q-Learning
Machine Learning
The dynamics of reinforcement learning in cooperative multiagent systems
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence
Multiagent learning using a variable learning rate
Artificial Intelligence
Introduction to Reinforcement Learning
Nash Convergence of Gradient Dynamics in General-Sum Games
UAI '00 Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence
Nash Q-Learning for General-Sum Stochastic Games
The Journal of Machine Learning Research
Run the GAMUT: A Comprehensive Approach to Evaluating Game-Theoretic Algorithms
AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 2
Multiagent learning in the presence of agents with limitations
Learning to compete, compromise, and cooperate in repeated general-sum games
ICML '05 Proceedings of the 22nd international conference on Machine learning
If multi-agent learning is the answer, what is the question?
Artificial Intelligence
Generalized multiagent learning with performance bound
Autonomous Agents and Multi-Agent Systems
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Learning against opponents with bounded memory
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Adaptive learning algorithms (ALAs) are an important class of agents that learn the utilities of their strategies while maintaining beliefs about their counterparts' future actions. In this paper, we propose an approach to learning in the presence of adaptive counterparts. Our Q-learning-based algorithm, called Adaptive Dynamics Learner (ADL), assigns Q-values to fixed-length interaction histories, which makes it capable of exploiting the strategy-update dynamics of adaptive learners. By doing so, ADL usually obtains higher utilities than equilibrium solutions yield. We tested our algorithm on a substantial set of well-known, representative matrix games and observed that ADL is highly effective against such ALAs as Adaptive Play Q-learning, Infinitesimal Gradient Ascent, Policy Hill-Climbing, and Fictitious Play Q-learning. Furthermore, in self-play ADL usually converges to a Pareto-efficient average utility.
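The core idea of the abstract — ordinary Q-learning where the "state" is the last few joint actions, so the learner can exploit a counterpart whose strategy depends on recent play — can be sketched as follows. This is a minimal illustration, not the paper's implementation; the function name, parameters, and the opponent interface are all assumptions made for the example.

```python
import random
from collections import defaultdict

def adl_sketch(payoff, opponent, history_len=2, episodes=5000,
               alpha=0.1, gamma=0.9, epsilon=0.1, n_actions=2, seed=0):
    """Illustrative history-based Q-learning for a repeated matrix game.

    The state is the tuple of the last `history_len` joint actions, so the
    Q-function can capture how an adaptive opponent reacts to recent play.
    `payoff[a][b]` is the learner's reward when it plays a and the
    opponent plays b; `opponent(history)` returns the opponent's action.
    (All names here are hypothetical, chosen for this sketch.)
    """
    rng = random.Random(seed)
    q = defaultdict(float)            # Q[(history, action)] -> value
    history = tuple()                 # most recent joint actions, newest last
    for _ in range(episodes):
        # Epsilon-greedy action selection over the current history state.
        if rng.random() < epsilon:
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda x: q[(history, x)])
        b = opponent(history)         # adaptive counterpart's move
        r = payoff[a][b]
        # Slide the history window and do a standard Q-learning backup.
        nxt = (history + ((a, b),))[-history_len:]
        best_next = max(q[(nxt, x)] for x in range(n_actions))
        q[(history, a)] += alpha * (r + gamma * best_next - q[(history, a)])
        history = nxt
    return q, history
```

For instance, against a (hypothetical) opponent that simply copies the learner's previous action in a matching game, the history state lets the learner discover that repeating its own action guarantees a match every round, something a history-free Q-learner cannot represent.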