Cooperative/Competitive Behavior Acquisition Based on State Value Estimation of Others

  • Authors:
  • Kentaro Noma, Yasutake Takahashi, Minoru Asada

  • Affiliations:
  • Dept. of Adaptive Machine Systems, Graduate School of Engineering, Osaka University, Japan (all authors); the third author is also with the JST ERATO Asada Synergistic Intelligence Project, Osaka, Japan 565-0871

  • Venue:
  • RoboCup 2007: Robot Soccer World Cup XI
  • Year:
  • 2008


Abstract

Existing reinforcement learning approaches suffer from the curse of dimensionality when applied to multiagent dynamic environments. A typical example is RoboCup competition, since other agents and their behaviors easily cause an explosion of the state and action spaces. This paper presents a method of hierarchical modular learning in a multiagent environment by which the learning agent can acquire cooperative behaviors with its teammates and competitive ones against its opponents. The key ideas are as follows. First, a two-layer hierarchical system with multiple learning modules is adopted to reduce the size of the state and action spaces: the state space of the top layer consists of the state values from the lower layer, and macro actions keep the action space small. Second, how close another agent is to its own goal is estimated by observation and used as a state value in the top-layer state space, which realizes the cooperative/competitive behaviors. The method is applied to a 4 (defence team) on 5 (offence team) game task, and the learning agent successfully acquired teamwork plays (passing and shooting) in a much shorter learning time (30 times quicker than in the earlier work).
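
The two-layer idea in the abstract can be sketched in a few lines of code. The following Python fragment is a minimal, hypothetical illustration, not the authors' implementation: lower modules learn state-value functions for sub-tasks via TD(0), and the top layer treats the discretized module values, including values estimated for other agents by observation, as its state and runs one-step Q-learning over macro actions. All class names, bin counts, and learning parameters here are illustrative assumptions.

```python
import numpy as np


class LowerModule:
    """Lower-layer module: learns a state-value function V(s) for one
    sub-task (e.g. 'reach the ball') with a TD(0) update. The same kind
    of module could be driven by observations of another agent to
    estimate how close that agent is to its own goal."""

    def __init__(self, n_states, alpha=0.1, gamma=0.9):
        self.V = np.zeros(n_states)       # tabular state values
        self.alpha = alpha                # learning rate (assumed)
        self.gamma = gamma                # discount factor (assumed)

    def update(self, s, r, s_next):
        # TD(0): V(s) <- V(s) + alpha * (r + gamma * V(s') - V(s))
        td_error = r + self.gamma * self.V[s_next] - self.V[s]
        self.V[s] += self.alpha * td_error


class TopLayer:
    """Top layer: its state is the tuple of discretized state values
    reported by the lower modules (own modules plus observation-based
    estimates for teammates/opponents); its actions are macro actions
    such as 'pass' or 'shoot'."""

    def __init__(self, n_modules, n_macro_actions,
                 n_bins=4, alpha=0.1, gamma=0.9, eps=0.1):
        self.n_bins = n_bins
        # One Q-table axis per module value, plus one for macro actions.
        self.Q = np.zeros((n_bins,) * n_modules + (n_macro_actions,))
        self.alpha, self.gamma, self.eps = alpha, gamma, eps

    def abstract_state(self, module_values):
        # Compress each module's value estimate (assumed in [0, 1)) into
        # a few bins, so the top-layer state stays small regardless of
        # the size of the raw sensor space.
        clipped = np.clip(module_values, 0.0, 1.0 - 1e-9)
        return tuple(int(v * self.n_bins) for v in clipped)

    def select_action(self, state):
        # Epsilon-greedy choice among macro actions.
        if np.random.rand() < self.eps:
            return int(np.random.randint(self.Q.shape[-1]))
        return int(np.argmax(self.Q[state]))

    def update(self, state, action, reward, next_state):
        # One-step Q-learning over macro actions.
        target = reward + self.gamma * self.Q[next_state].max()
        self.Q[state + (action,)] += self.alpha * (
            target - self.Q[state + (action,)])


if __name__ == "__main__":
    # e.g. two own modules plus one observation-based estimate of a
    # teammate's state value (all counts are hypothetical).
    top = TopLayer(n_modules=3, n_macro_actions=2)  # 0 = pass, 1 = shoot
    s = top.abstract_state([0.2, 0.7, 0.9])
    a = top.select_action(s)
    print("abstract state:", s, "macro action:", a)
```

The design point this sketch tries to capture is the one the abstract emphasizes: the top layer never sees raw sensor data, only a handful of binned value estimates, so adding another observed agent grows the top-layer state by one small axis rather than multiplying the raw state space.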