Extraction of reward-related feature space using correlation-based and reward-based learning methods

  • Authors:
  • Poramate Manoonpong; Florentin Wörgötter; Jun Morimoto

  • Affiliations:
  • Poramate Manoonpong: ATR Computational Neuroscience Laboratories, Kyoto, Japan, and Bernstein Center for Computational Neuroscience, III. Institute of Physics, University of Göttingen, Göttingen, Germany
  • Florentin Wörgötter: Bernstein Center for Computational Neuroscience, III. Institute of Physics, University of Göttingen, Göttingen, Germany
  • Jun Morimoto: ATR Computational Neuroscience Laboratories, Kyoto, Japan

  • Venue:
  • ICONIP'10: Proceedings of the 17th International Conference on Neural Information Processing: Theory and Algorithms - Volume Part I
  • Year:
  • 2010


Abstract

This article presents a novel learning paradigm that extracts a reward-related low-dimensional state space by combining correlation-based learning, namely Input Correlation Learning (ICO learning), with reward-based learning, namely Reinforcement Learning (RL). Since ICO learning can quickly find a correlation between a state and an unwanted condition (e.g., failure), we use it to extract a low-dimensional feature space in which a failure-avoidance policy can be found. The extracted feature space is then used as a prior for RL. If a proper feature space can be extracted for a given task, the policy model can be kept simple and the policy can be improved easily. The performance of this learning paradigm is evaluated through simulation of a cart-pole system. The results show that the proposed method enhances the feature extraction process, finding a feature space suited to a pole-balancing policy. That is, it allows a policy to stabilize the pole effectively over a larger domain of initial conditions than using ICO learning alone or RL alone without any prior knowledge.
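To fix ideas about the correlation-based component, below is a minimal Python sketch of the standard ICO learning rule, in which the weights of predictive inputs u_j change in proportion to the correlation between u_j and the temporal derivative of a "reflex" signal x0 (here, an unwanted condition such as an imminent failure). This is an illustrative sketch only, not the authors' implementation: the function name ico_step, the learning rate, and the dummy signals are all hypothetical.

    import numpy as np

    # Sketch of Input Correlation (ICO) learning, assuming the standard
    # update dw_j/dt = mu * u_j(t) * d(x0)/dt, where x0 is the reflex
    # signal (e.g., a late failure signal such as a large pole angle)
    # and u_j are earlier, predictive inputs.

    def ico_step(w, u, x0, x0_prev, mu=0.01):
        """One ICO learning step.

        w       : weights of the predictive inputs (updated in place)
        u       : current predictive input vector, shape (n,)
        x0      : current reflex signal (scalar)
        x0_prev : reflex signal at the previous time step
        mu      : learning rate
        Returns the output v = x0 + w . u (reflex weight fixed at 1).
        """
        dx0 = x0 - x0_prev      # discrete derivative of the reflex signal
        w += mu * u * dx0       # correlate predictors with reflex change
        return x0 + w @ u       # reflex plus learned anticipatory term

    # Toy usage: two predictive inputs, weights start at zero.
    w = np.zeros(2)
    x0_prev = 0.0
    for t in range(100):
        u = np.array([np.sin(0.1 * t), np.cos(0.1 * t)])  # dummy predictors
        x0 = max(0.0, np.sin(0.1 * (t + 5)))              # dummy delayed reflex
        v = ico_step(w, u, x0, x0_prev)
        x0_prev = x0

Because the weights only grow while the reflex signal is changing, learning stops once the anticipatory output suppresses the unwanted condition, which is why ICO learning can quickly single out the few state components that predict failure.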