Review: Independent Reinforcement Learners in Cooperative Markov Games: A Survey Regarding Coordination Problems

  • Authors:
  • Laetitia Matignon; Guillaume J. Laurent; Nadine Le Fort-Piat

  • Affiliations:
  • FEMTO-ST Institute, UMR CNRS 6174, UFC/ENSMM/UTBM, 24 rue Alain Savary, 25000 Besançon, France (all authors)

  • Venue:
  • The Knowledge Engineering Review
  • Year:
  • 2012

Abstract

In the framework of fully cooperative multi-agent systems, independent (non-communicative) agents that learn by reinforcement must overcome several difficulties in order to coordinate. This paper identifies several challenges responsible for the non-coordination of independent agents: Pareto-selection, non-stationarity, stochasticity, alter-exploration and shadowed equilibria. A selection of multi-agent domains is classified according to these challenges: matrix games, Boutilier's coordination game, predators pursuit domains and a special multi-state game. Moreover, the performance of a range of algorithms for independent reinforcement learners is evaluated empirically. These algorithms are Q-learning variants: decentralized Q-learning, distributed Q-learning, hysteretic Q-learning, recursive frequency maximum Q-value and win-or-learn fast policy hill climbing. An overview of the learning algorithms' strengths and weaknesses against each challenge concludes the paper and can serve as a basis for choosing the appropriate algorithm for a new domain. Furthermore, the distilled challenges may assist in the design of new learning algorithms that overcome these problems and achieve higher performance in multi-agent applications.
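To give a flavor of the surveyed Q-learning variants, the following is a minimal sketch of the hysteretic Q-learning update rule, one of the independent-learner algorithms evaluated in the paper. All function and parameter names here are illustrative, not taken from the paper's own code: the idea is that an agent learns faster from positive temporal-difference errors (rate `alpha`) than from negative ones (rate `beta < alpha`), which makes it optimistic and more robust to teammates' exploratory actions.

```python
import numpy as np


def hysteretic_q_update(Q, state, action, reward, next_state,
                        alpha=0.1, beta=0.01, gamma=0.9):
    """Apply one hysteretic Q-learning update in place.

    Q          : 2-D array of shape (n_states, n_actions)
    alpha      : learning rate for positive TD errors (optimistic)
    beta       : smaller learning rate for negative TD errors
    gamma      : discount factor
    """
    # Standard temporal-difference error, as in Q-learning.
    delta = reward + gamma * np.max(Q[next_state]) - Q[state, action]
    # Hysteresis: increase estimates quickly, decrease them slowly.
    lr = alpha if delta >= 0 else beta
    Q[state, action] += lr * delta
    return Q


# Illustrative use: a good outcome moves the estimate up by alpha * delta,
# a later bad outcome moves it down only by beta * delta.
Q = np.zeros((2, 2))
hysteretic_q_update(Q, state=0, action=0, reward=1.0, next_state=1)
hysteretic_q_update(Q, state=0, action=0, reward=-1.0, next_state=1)
```

With `beta = alpha`, this reduces to decentralized (ordinary) Q-learning; with `beta = 0`, it behaves like the purely optimistic distributed Q-learning, so hysteretic Q-learning can be seen as interpolating between the two.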