Learning algorithms often obtain relatively low average payoffs when playing repeated general-sum games against other learning agents, because they focus on myopic best responses and one-shot Nash equilibrium (NE) strategies. A less myopic approach focuses instead on NEs of the repeated game, which suggests that a learning agent should, at a minimum, possess two properties. First, an agent should never learn to play a strategy that yields an average payoff below the minimax value of the game. Second, an agent should learn to cooperate/compromise when doing so is beneficial. No learning algorithm from the literature is known to possess both of these properties. We present a reinforcement learning algorithm (M-Qubed) that provably satisfies the first property and empirically exhibits the second property in self-play across a wide range of games.
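The minimax value referenced by the first property is the row player's security level: the best average payoff the agent can guarantee regardless of what the opponent does. As a rough illustration (not part of M-Qubed itself; the function name and grid-search approach are our own), the sketch below approximates this security value for a two-action row player by searching over mixed strategies. For two actions the worst-case payoff is piecewise linear in the mixing probability, so a fine grid recovers the maximin value closely.

```python
def security_value_2xN(A, steps=10001):
    """Approximate the maximin (security) value for a row player
    with 2 actions.

    A is a 2 x N list of row-player payoffs (rows = own actions,
    columns = opponent actions). Searches mixed strategies (p, 1-p)
    on a grid; the worst-case payoff is piecewise linear in p, so a
    fine grid comes very close to the true maximin value.
    """
    best_value = float("-inf")
    best_p = 0.0
    for k in range(steps):
        p = k / (steps - 1)  # probability of playing action 0
        # Opponent picks the column that minimizes our expected payoff.
        worst = min(p * a0 + (1 - p) * a1 for a0, a1 in zip(A[0], A[1]))
        if worst > best_value:
            best_value, best_p = worst, p
    return best_value, best_p
```

For the prisoner's dilemma with row payoffs [[3, 0], [5, 1]] (cooperate, defect), the security strategy is pure defection with value 1: any learner satisfying the first property must average at least 1 per round. In matching pennies ([[1, -1], [-1, 1]]) the security value is 0, achieved by mixing uniformly.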