Towards a pareto-optimal solution in general-sum games

Authors:
Sandip Sen;Stephane Airiau;Rajatish Mukherjee
Affiliations:
The University of Tulsa;The University of Tulsa;The University of Tulsa
Venue:
AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
Year:
2003

Citing 7

Technical Note: Q-Learning
Machine Learning

The dynamics of reinforcement learning in cooperative multiagent systems
AAAI '98/IAAI '98 Proceedings of the fifteenth national/tenth conference on Artificial intelligence/Innovative applications of artificial intelligence

Friend-or-Foe Q-learning in General-Sum Games
ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning

Multiagent Reinforcement Learning: Theoretical Framework and an Algorithm
ICML '98 Proceedings of the Fifteenth International Conference on Machine Learning

Implicit Negotiation in Repeated Games
ATAL '01 Revised Papers from the 8th International Workshop on Intelligent Agents VIII

Satisficing and learning cooperation in the prisoner's dilemma
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1

Rational and convergent learning in stochastic games
IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 2

Cited 7

Learning to commit in repeated games
AAMAS '06 Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems

Reaching pareto-optimality in prisoner's dilemma using conditional joint action learning
Autonomous Agents and Multi-Agent Systems

Utility based Q-learning to facilitate cooperation in Prisoner's Dilemma games
Web Intelligence and Agent Systems

Learning pareto-optimal solutions in 2x2 conflict games
LAMAS'05 Proceedings of the First international conference on Learning and Adaption in Multi-Agent Systems

Multi-agent relational reinforcement learning
LAMAS'05 Proceedings of the First international conference on Learning and Adaption in Multi-Agent Systems

Comparative evaluation of MAL algorithms in a diverse set of ad hoc team problems
Proceedings of the 11th International Conference on Autonomous Agents and Multiagent Systems - Volume 1

Learning to achieve socially optimal solutions in general-sum games
PRICAI'12 Proceedings of the 12th Pacific Rim international conference on Trends in Artificial Intelligence

Abstract

The multiagent learning literature has investigated iterated two-player games to develop mechanisms that allow agents to learn to converge on Nash equilibrium strategy profiles. Such an equilibrium configuration implies that neither player has an incentive to change its strategy if the other does not. In general-sum games, however, both players can often obtain a higher payoff if one chooses not to respond optimally to the other. By developing mutual trust, agents can avoid iterated best responses that lead to a lower-payoff Nash equilibrium. In this paper we work with agents who select actions based on expected utility calculations that incorporate the observed frequencies of the actions of the opponent(s). We augment these stochastically greedy agents with an interesting action revelation strategy, in which an agent strategically reveals its action to avoid worst-case, pessimistic moves. We argue that in certain situations such apparently risky revealing can indeed produce a better payoff than a non-revealing approach. In particular, it is possible to obtain Pareto-optimal solutions that dominate the Nash equilibrium. We present results over a large number of randomly generated payoff matrices of varying sizes and compare the payoffs of strategically revealing learners to the payoffs at Nash equilibrium.
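
The two ideas the abstract relies on, expected utility computed from observed opponent frequencies and a Pareto-optimal outcome that dominates a Nash equilibrium, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the Prisoner's Dilemma payoff values, the function names, and the purely greedy action choice are hypothetical and are not taken from the paper, which uses stochastically greedy agents and an action revelation strategy that this sketch does not attempt to model.

```python
from collections import Counter

# Hypothetical Prisoner's Dilemma payoffs: (row payoff, column payoff).
# Actions: 0 = cooperate, 1 = defect. Values are illustrative only.
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
ACTIONS = (0, 1)


def expected_utilities(opponent_history):
    """Expected row payoff of each action against the opponent's empirical frequencies."""
    counts = Counter(opponent_history)
    total = len(opponent_history)
    if total == 0:
        # No observations yet: assume a uniform prior over opponent actions.
        freqs = {a: 1 / len(ACTIONS) for a in ACTIONS}
    else:
        freqs = {a: counts[a] / total for a in ACTIONS}
    return {
        mine: sum(freqs[theirs] * PAYOFFS[(mine, theirs)][0] for theirs in ACTIONS)
        for mine in ACTIONS
    }


def greedy_action(opponent_history):
    """Pick the action with the highest expected utility (no exploration)."""
    utilities = expected_utilities(opponent_history)
    return max(ACTIONS, key=lambda a: utilities[a])


def is_nash(profile):
    """True if neither player gains by deviating unilaterally (pure strategies)."""
    row, col = profile
    row_ok = all(PAYOFFS[(row, col)][0] >= PAYOFFS[(alt, col)][0] for alt in ACTIONS)
    col_ok = all(PAYOFFS[(row, col)][1] >= PAYOFFS[(row, alt)][1] for alt in ACTIONS)
    return row_ok and col_ok


def pareto_dominates(a, b):
    """True if profile a is at least as good as b for both players and better for one."""
    pa, pb = PAYOFFS[a], PAYOFFS[b]
    return all(x >= y for x, y in zip(pa, pb)) and any(x > y for x, y in zip(pa, pb))


if __name__ == "__main__":
    history = [0, 0, 1, 0]  # opponent cooperated three times out of four
    print(expected_utilities(history))
    print(greedy_action(history))
    print(is_nash((1, 1)), pareto_dominates((0, 0), (1, 1)))
```

With these assumed payoffs, the greedy learner defects even against a mostly cooperative opponent, and mutual defection is the only pure Nash equilibrium although mutual cooperation Pareto-dominates it. This is the kind of gap between best-response play and Pareto-optimal outcomes that the abstract describes.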