A multiagent reinforcement learning algorithm with non-linear dynamics
Journal of Artificial Intelligence Research
Several multiagent reinforcement learning (MARL) algorithms have been proposed to optimize agents' decisions. Only a subset of these algorithms neither requires agents to know the underlying environment nor restricts them to deterministic policies: they can learn a stochastic policy, one that chooses actions according to a probability distribution. The Weighted Policy Learner (WPL) is a MARL algorithm that belongs to this subset and was shown, experimentally in previous work, to converge and to outperform earlier algorithms in the same subset. The main contribution of this paper is an analysis of the dynamics of WPL that exposes the effect of its non-linear nature, in contrast to previous MARL algorithms, whose dynamics are linear. We first represent the WPL algorithm as a set of differential equations. We then solve these equations and show that the resulting dynamics are consistent with the experimental results reported in previous work. Finally, we compare the dynamics of WPL with those of earlier MARL algorithms and discuss the interesting differences and similarities we have discovered.
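To make the non-linearity concrete, the following is a minimal sketch of a single WPL policy-update step, based on the commonly described form of the rule: each action's policy gradient is scaled by π(a) when the gradient is negative and by (1 − π(a)) when it is positive, so updates shrink as the policy approaches the simplex boundary. The function name, the learning-rate value, and the clip-and-renormalize projection are illustrative assumptions, not the paper's exact procedure.

```python
import numpy as np

def wpl_update(policy, gradient, lr=0.01, eps=1e-6):
    """One sketched Weighted Policy Learner (WPL) step.

    Weighting the gradient by pi(a) (when decreasing) or 1 - pi(a)
    (when increasing) is what makes the update, and hence the
    corresponding differential equations, non-linear in the policy.
    The projection below (clip, then renormalize) is a simplification
    assumed for this sketch.
    """
    policy = np.asarray(policy, dtype=float)
    gradient = np.asarray(gradient, dtype=float)
    # weight each component of the gradient by the current policy
    weights = np.where(gradient < 0, policy, 1.0 - policy)
    updated = policy + lr * gradient * weights
    # project back onto the probability simplex (simplified)
    updated = np.clip(updated, eps, 1.0)
    return updated / updated.sum()
```

For example, starting from the uniform policy `[0.5, 0.5]` with gradient `[1, -1]`, the update shifts probability toward the first action while keeping the policy a valid distribution; repeated near a boundary, the π-dependent weights slow the update down, which is the dynamic the paper's differential-equation analysis captures.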