An adaptive policy gradient in learning Nash equilibria

Authors:
Huaxiang Zhang; Ying Fan
Affiliations:
Department of Computer Science, Shandong Normal University, Jinan, 250014 Shandong, China; Department of Computer Science, Shandong Normal University, Jinan, 250014 Shandong, China
Venue:
Neurocomputing
Year:
2008

Abstract

This paper presents a novel algorithm for learning Nash equilibria (NE) in finite strategic games. Assuming that each player tries to maximize his own payoff, the algorithm explores the players' policies in the policy profile space so as to increase the payoffs in each learning iteration. The paper investigates the effectiveness of the algorithm and shows experimentally that it accelerates the policy learning process. The algorithm learns faster than other proposed intelligent learning approaches and can learn almost all of the existing NE of a finite strategic game.
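
As a rough illustration of what payoff-increasing policy updates look like, the sketch below runs plain projected gradient ascent on a 2x2 game and checks the result against the NE condition. It is not the adaptive algorithm proposed in the paper, whose details the abstract does not give. The function names (`payoff_gradients`, `learn`, `nash_gap`), the fixed step size `lr`, and the example coordination game are all assumptions made for illustration.

```python
def payoff_gradients(A, B, p, q):
    """Partial derivatives of each player's expected payoff in a 2x2 game.

    A, B: payoff matrices of the row and column player (hypothetical inputs).
    p: probability that the row player picks action 0.
    q: probability that the column player picks action 0.
    """
    dp = q * (A[0][0] - A[1][0]) + (1 - q) * (A[0][1] - A[1][1])
    dq = p * (B[0][0] - B[0][1]) + (1 - p) * (B[1][0] - B[1][1])
    return dp, dq


def learn(A, B, p, q, lr=0.1, steps=200):
    """Simultaneous gradient ascent with clipping to the interval [0, 1]."""
    for _ in range(steps):
        dp, dq = payoff_gradients(A, B, p, q)
        # Each player moves its own policy in the direction that raises
        # its own payoff; clipping keeps the probabilities valid.
        p = min(1.0, max(0.0, p + lr * dp))
        q = min(1.0, max(0.0, q + lr * dq))
    return p, q


def nash_gap(A, B, p, q):
    """Largest gain any player could obtain by switching to a pure action.

    The gap is zero exactly when (p, q) is a Nash equilibrium.
    """
    x, y = [p, 1 - p], [q, 1 - q]
    u1 = sum(x[i] * A[i][j] * y[j] for i in range(2) for j in range(2))
    u2 = sum(x[i] * B[i][j] * y[j] for i in range(2) for j in range(2))
    best1 = max(sum(A[i][j] * y[j] for j in range(2)) for i in range(2))
    best2 = max(sum(x[i] * B[i][j] for i in range(2)) for j in range(2))
    return max(best1 - u1, best2 - u2)


# Example coordination game with two pure-strategy equilibria (an assumption).
A = [[2, 0], [0, 1]]
B = [[2, 0], [0, 1]]

high = learn(A, B, 0.6, 0.6)  # start near the (action 0, action 0) profile
low = learn(A, B, 0.2, 0.2)   # start near the (action 1, action 1) profile
```

In this toy game, the two starting points lead to different pure-strategy equilibria, which hints at why the choice of starting policy matters when the goal is to find more than one NE of a game.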