Learning to compete, compromise, and cooperate in repeated general-sum games

  • Authors:
  • Jacob W. Crandall; Michael A. Goodrich

  • Affiliations:
  • Brigham Young University, Provo, UT; Brigham Young University, Provo, UT

  • Venue:
  • ICML '05 Proceedings of the 22nd international conference on Machine learning
  • Year:
  • 2005

Abstract

Learning algorithms often obtain relatively low average payoffs in repeated general-sum games against other learning agents because they focus on myopic best responses and one-shot Nash equilibrium (NE) strategies. A less myopic approach focuses instead on NEs of the repeated game, which suggests that (at the least) a learning agent should possess two properties. First, an agent should never learn to play a strategy that yields average payoffs below the minimax value of the game. Second, an agent should learn to cooperate/compromise when doing so is beneficial. No learning algorithm in the literature is known to possess both of these properties. We present a reinforcement learning algorithm (M-Qubed) that provably satisfies the first property and empirically displays the second property (in self play) across a wide range of games.
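
The first property refers to the security (maximin) value of the stage game: the payoff a player can guarantee itself regardless of the opponent's behavior. The sketch below is not from the paper; it is a minimal illustration of how that value can be computed for a matrix game with a standard linear program, using `numpy` and `scipy.optimize.linprog`. The prisoner's dilemma payoffs at the end are an assumed example showing why the security bound alone is not enough: defection guarantees 1 per round, while mutual cooperation in the repeated game pays 3.

```python
# Illustrative sketch (not the paper's algorithm): the row player's maximin
# ("security") value of a one-shot matrix game, via linear programming.
import numpy as np
from scipy.optimize import linprog

def maximin_value(payoffs):
    """Return (value, mixed strategy) the row player can guarantee
    no matter which column the opponent plays."""
    A = np.asarray(payoffs, dtype=float)
    m, n = A.shape
    # Decision variables: m strategy probabilities followed by the value v.
    c = np.zeros(m + 1)
    c[-1] = -1.0                          # maximize v  <=>  minimize -v
    # For every opponent column j:  v - sum_i x_i * A[i, j] <= 0
    A_ub = np.hstack([-A.T, np.ones((n, 1))])
    b_ub = np.zeros(n)
    # Strategy probabilities sum to one.
    A_eq = np.append(np.ones(m), 0.0).reshape(1, -1)
    b_eq = np.array([1.0])
    bounds = [(0.0, 1.0)] * m + [(None, None)]   # v is unbounded
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    return res.x[-1], res.x[:m]

# Assumed prisoner's dilemma row payoffs (rows/columns: cooperate, defect).
pd = [[3, 0],
      [5, 1]]
value, strategy = maximin_value(pd)
print(value)      # 1.0: defection secures at least 1 per round
# Mutual cooperation pays 3 per round, so a learner that only meets the
# security bound still leaves the gains from compromise on the table.
```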