Two Dimensional Evaluation Reinforcement Learning

  • Authors:
  • Hiroyuki Okada; Hiroshi Yamakawa; Takashi Omori

  • Venue:
  • IWANN '01 Proceedings of the 6th International Work-Conference on Artificial and Natural Neural Networks: Connectionist Models of Neurons, Learning Processes and Artificial Intelligence-Part I
  • Year:
  • 2001

Abstract

To address the tradeoff between exploration and exploitation actions in reinforcement learning, the authors have proposed two-dimensional evaluation reinforcement learning, which keeps separate evaluation forecasts for reward and for punishment. The proposed method uses the difference between the reward evaluation and the punishment evaluation as the factor for determining the action, and their sum as the parameter for determining the ratio of exploration to exploitation. In this paper we describe an experiment in which a mobile robot searches for a path, and the conflict that arises between exploration and exploitation actions. The results of the experiment show that reinforcement learning using the two dimensions of reward and punishment can generate a better path than the conventional reinforcement learning method.
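
The action-selection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the softmax rule, the function names `action_probabilities` and `choose_action`, the `base_temperature` parameter, and in particular the assumption that a larger reward-plus-punishment sum means more exploration are all hypothetical choices made for illustration.

```python
import math
import random


def action_probabilities(reward_est, punish_est, base_temperature=1.0):
    """Softmax action probabilities for one state.

    reward_est[a] and punish_est[a] are non-negative forecasts of reward
    and punishment for action a (hypothetical representation).
    """
    # Difference (reward - punishment) decides which action is preferred.
    preference = [r - p for r, p in zip(reward_est, punish_est)]

    # Sum (reward + punishment) sets the exploration level.
    # ASSUMPTION: a larger mean sum raises the softmax temperature,
    # i.e. more exploration. The abstract does not state the direction.
    mean_sum = sum(r + p for r, p in zip(reward_est, punish_est)) / len(reward_est)
    temperature = base_temperature * (1.0 + mean_sum)

    # Numerically stable softmax over the scaled preferences.
    scaled = [x / temperature for x in preference]
    top = max(scaled)
    exps = [math.exp(s - top) for s in scaled]
    norm = sum(exps)
    return [e / norm for e in exps]


def choose_action(reward_est, punish_est, base_temperature=1.0):
    """Sample an action index from the probabilities above."""
    probs = action_probabilities(reward_est, punish_est, base_temperature)
    return random.choices(range(len(probs)), weights=probs, k=1)[0]


if __name__ == "__main__":
    # Same difference (1 vs 0) in both cases, but a larger sum in the second.
    print(action_probabilities([1.0, 0.0], [0.0, 0.0]))
    print(action_probabilities([3.0, 2.0], [2.0, 2.0]))
```

Under these assumptions, two states with the same reward-minus-punishment differences but different sums favor the same action, while the state with the larger sum yields a flatter distribution and therefore more exploratory choices. A single-valued evaluation could not distinguish the two states.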