Acceleration of game learning with prediction-based reinforcement learning: toward the emergence of planning behavior

  • Authors:
  • Yu Ohigashi; Takashi Omori; Koji Morikawa; Natsuki Oka

  • Affiliations:
  • Yu Ohigashi, Takashi Omori: Graduate School of Engineering, Hokkaido University, Sapporo, Hokkaido, Japan; Koji Morikawa, Natsuki Oka: Humanware Technology Research Laboratory, Matsushita Electric Industrial Co., Ltd., Kyoto, Japan

  • Venue:
  • ICANN/ICONIP '03: Proceedings of the 2003 Joint International Conference on Artificial Neural Networks and Neural Information Processing
  • Year:
  • 2003


Abstract

When humans solve a problem, it is unlikely that they use only the current state of the problem to decide upon an action. The human action decision strategy is difficult to explain with the state-to-action model that underlies conventional reinforcement learning (RL). Rather, humans appear to predict a future state from past experience and to decide upon an action based on that predicted state. In this paper, we propose a prediction-based RL model (PRL model). In the PRL model, a state prediction module and an action memory module are added to an actor-critic type RL system; the system predicts a future state from the current one and evaluates it using an expected value table, and then chooses the point of action decision so that the appropriate action is performed. To evaluate the proposed model, we perform a computer simulation using a simple ping-pong game. We also discuss the possibility that the PRL model represents an evolutionary extension of conventional RL as well as a step toward modeling human planning behavior, because state prediction and its evaluation are the basic elements of planning in symbolic AI.
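
To make the core idea concrete, the following Python sketch selects actions by evaluating the critic's value of a one-step *predicted* next state rather than of the current state. This is only a minimal illustration: the tabular critic, the deterministic one-step predictor, the toy action set, and all hyperparameters are assumptions, and the paper's action memory module and its mechanism for choosing the point of action decision are omitted.

```python
# Minimal sketch: prediction-based action selection over an actor-critic
# style learner. All names and parameters here are illustrative
# assumptions, not the paper's actual modules.
import random
from collections import defaultdict

GAMMA = 0.9            # discount factor (assumed)
ALPHA = 0.1            # critic learning rate (assumed)
ACTIONS = [-1, 0, +1]  # e.g. paddle moves in a toy ping-pong task (assumed)

value = defaultdict(float)  # critic: state -> estimated expected return
predictor = {}              # learned one-step model: (state, action) -> next state

def predict(state, action):
    """Return the predicted next state; fall back to the current state
    if this transition has not been observed yet."""
    return predictor.get((state, action), state)

def choose_action(state, epsilon=0.1):
    """Pick the action whose *predicted* next state the critic values most,
    with epsilon-greedy exploration."""
    if random.random() < epsilon:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: value[predict(state, a)])

def update(state, action, reward, next_state):
    """TD(0) critic update plus a one-step model update."""
    td_error = reward + GAMMA * value[next_state] - value[state]
    value[state] += ALPHA * td_error
    predictor[(state, action)] = next_state  # remember the observed transition
```

The design point the sketch captures is that once a state predictor is available, the same critic that drives ordinary actor-critic learning can score imagined future states, giving the agent a one-step lookahead; chaining such predictions and evaluations is exactly the elementary operation of planning that the abstract refers to.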