Achieving Socially Optimal Outcomes in Multiagent Systems with Reinforcement Social Learning
ACM Transactions on Autonomous and Adaptive Systems (TAAS)
During multi-agent interactions, robust strategies are needed to help agents coordinate their actions on efficient outcomes. A large body of previous work focuses on designing strategies that converge to a Nash equilibrium under self-play, which can be extremely inefficient in many situations. On the other hand, apart from performing well under self-play, a good strategy should also respond well against opponents that adopt different strategies. In this paper, we consider a particular class of opponents whose strategies are based on a best-response policy, and we target the goal of social optimality. We propose a novel learning strategy, TaFSO, which exploits the characteristics of best-response learners to effectively steer the opponent's behavior towards socially optimal outcomes. Extensive simulations show that TaFSO achieves better performance than previous work both under self-play and against the class of best-response learners.
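The gap between Nash equilibrium and social optimality that motivates this work can be made concrete in the Prisoner's Dilemma, a standard benchmark in this literature: the unique pure Nash equilibrium is mutual defection, while the socially optimal outcome is mutual cooperation. The sketch below is illustrative only (the payoff values are the conventional ones, not taken from the paper, and the code is not part of TaFSO itself):

```python
from itertools import product

# Prisoner's Dilemma bimatrix: PAYOFFS[(i, j)] = (row payoff, column payoff)
# for actions 0 = Cooperate, 1 = Defect. Conventional payoff values.
PAYOFFS = {
    (0, 0): (3, 3),
    (0, 1): (0, 5),
    (1, 0): (5, 0),
    (1, 1): (1, 1),
}
ACTIONS = (0, 1)

def pure_nash_equilibria(payoffs):
    """Joint actions where neither player gains by unilaterally deviating."""
    eqs = []
    for (i, j) in product(ACTIONS, repeat=2):
        row, col = payoffs[(i, j)]
        row_ok = all(payoffs[(i2, j)][0] <= row for i2 in ACTIONS)
        col_ok = all(payoffs[(i, j2)][1] <= col for j2 in ACTIONS)
        if row_ok and col_ok:
            eqs.append((i, j))
    return eqs

def socially_optimal(payoffs):
    """Joint action maximizing the sum of both players' payoffs."""
    return max(payoffs, key=lambda ij: sum(payoffs[ij]))

print(pure_nash_equilibria(PAYOFFS))  # [(1, 1)] - mutual defection, total payoff 2
print(socially_optimal(PAYOFFS))      # (0, 0)  - mutual cooperation, total payoff 6
```

Here the Nash outcome yields a total payoff of 2 versus 6 at the social optimum, which is the inefficiency gap a strategy like TaFSO aims to close by steering a best-response opponent toward the cooperative outcome.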