Reducing the complexity of multiagent reinforcement learning

  • Authors:
  • Andriy Burkov; Brahim Chaib-draa

  • Affiliations:
  • Laval University, Quebec, Canada; Laval University, Quebec, Canada

  • Venue:
  • Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
  • Year:
  • 2007

Abstract

It is known that the complexity of reinforcement learning algorithms, such as Q-learning, may be exponential in the number of environment states. It has been shown, however, that the learning complexity for goal-directed problems can be substantially reduced by initializing the Q-values with a "good" approximate function. In the multiagent case, such a good approximation exists for a large class of problems, namely goal-directed stochastic games. These games can model, for example, coordination and common-interest problems in cooperative robotics. The approximate function for these games is simply the solution of the relaxed, single-agent problem, which each agent can easily compute individually. In this article, we show that (1) an optimal single-agent solution is a "good" approximation for goal-directed stochastic games with an action-penalty representation, and (2) the learning complexity is reduced when learning is initialized with this approximate function, as compared to the uninformed case.
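
The core idea, initializing Q-learning with the optimal value function of a relaxed single-agent problem rather than with zeros, can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal single-agent illustration assuming a hypothetical deterministic grid world with action-penalty rewards (-1 per step, 0 at the goal), where the "informed" initialization is obtained by value iteration on the same relaxed problem.

    # Minimal sketch (illustrative, not the paper's code): Q-learning on a
    # goal-directed grid world with action-penalty rewards, initialized
    # either uninformed (zeros) or with the optimal single-agent Q-values
    # computed by value iteration on the relaxed problem.
    import numpy as np

    N = 5                                            # grid is N x N
    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]     # up, down, left, right
    GOAL = (N - 1, N - 1)

    def step(state, a):
        """Deterministic transition with action-penalty reward."""
        if state == GOAL:
            return state, 0.0
        r, c = state
        dr, dc = ACTIONS[a]
        nr = min(max(r + dr, 0), N - 1)
        nc = min(max(c + dc, 0), N - 1)
        return (nr, nc), -1.0

    def single_agent_q(gamma=1.0, iters=200):
        """Value iteration on the relaxed single-agent problem."""
        Q = np.zeros((N, N, len(ACTIONS)))
        for _ in range(iters):
            V = Q.max(axis=2)
            for r in range(N):
                for c in range(N):
                    for a in range(len(ACTIONS)):
                        (nr, nc), rew = step((r, c), a)
                        Q[r, c, a] = rew + gamma * V[nr, nc]
            Q[GOAL[0], GOAL[1], :] = 0.0             # goal is absorbing
        return Q

    def q_learning(Q_init, episodes=200, alpha=0.5, gamma=1.0, eps=0.1, seed=0):
        """Tabular Q-learning; returns total steps over all episodes."""
        rng = np.random.default_rng(seed)
        Q = Q_init.copy()
        total_steps = 0
        for _ in range(episodes):
            s = (0, 0)
            while s != GOAL:
                if rng.random() < eps:
                    a = int(rng.integers(len(ACTIONS)))
                else:
                    a = int(Q[s[0], s[1]].argmax())
                s2, rew = step(s, a)
                target = rew + gamma * Q[s2[0], s2[1]].max()
                Q[s[0], s[1], a] += alpha * (target - Q[s[0], s[1], a])
                s = s2
                total_steps += 1
        return total_steps

    print("steps (uninformed init): ",
          q_learning(np.zeros((N, N, len(ACTIONS)))))
    print("steps (single-agent init):",
          q_learning(single_agent_q()))

Under these assumptions, the informed run typically needs far fewer environment steps to reach the goal across episodes, which mirrors the paper's claim that a single-agent solution is a useful Q-value initialization for the multiagent, goal-directed case.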