Exploration strategies in n-Person general-sum multiagent reinforcement learning with sequential action selection

  • Authors:
  • Ali Akramizadeh; Ahmad Afshar; Mohammad B. Menhaj

  • Affiliations:
  • Department of Electrical Engineering, Amir Kabir University of Technology, Tehran, Iran (all authors)

  • Venue:
  • Intelligent Data Analysis
  • Year:
  • 2011

Abstract

In this paper, two novel exploration strategies are proposed for n-person general-sum multiagent reinforcement learning with sequential action selection. The underlying learning process, called an extensive Markov game, is modeled as a sequence of extensive-form games with perfect information. We introduce an estimated value of taking an action with respect to the other agents' preferences, called the associative Q-value; these values are used to select actions probabilistically according to a Boltzmann distribution. Simulation results demonstrate the effectiveness of the proposed exploration strategies when used in our previously introduced extensive-Q learning methods. Given the complexity of existing methods for computing Nash equilibrium points, extensive-Q learning is more convenient for dynamic-task multiagent systems with more than two agents, provided sequential action selection among agents can be assumed.
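The Boltzmann (softmax) action selection mentioned in the abstract can be sketched as follows. This is a generic illustration of temperature-controlled exploration, not the paper's associative Q-value computation; the Q-values and temperature below are placeholder assumptions.

```python
import numpy as np

def boltzmann_select(q_values, temperature, rng=None):
    """Sample an action index with probability proportional to
    exp(Q(a) / temperature); higher temperature means more exploration."""
    rng = rng or np.random.default_rng(0)
    q = np.asarray(q_values, dtype=float)
    # Subtract the max before exponentiating for numerical stability.
    logits = (q - q.max()) / temperature
    probs = np.exp(logits)
    probs /= probs.sum()
    return rng.choice(len(q), p=probs)

# Illustrative Q-values for three actions (placeholder numbers).
q = [1.0, 2.0, 0.5]
# At low temperature, selection concentrates on the greedy action (index 1).
greedy_count = sum(boltzmann_select(q, temperature=0.05) == 1
                   for _ in range(1000))
```

As the temperature is annealed toward zero, the distribution sharpens toward greedy action selection, which is the usual way such a strategy trades exploration for exploitation over the course of learning.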