Two Dimensional Evaluation Reinforcement Learning

Authors:
Hiroyuki Okada;Hiroshi Yamakawa;Takashi Omori
Venue:
IWANN '01 Proceedings of the 6th International Work-Conference on Artificial and Natural Neural Networks: Connectionist Models of Neurons, Learning Processes and Artificial Intelligence-Part I
Year:
2001



Abstract

To address the tradeoff between exploration and exploitation actions in reinforcement learning, the authors have proposed two-dimensional evaluation reinforcement learning, which forecasts reward evaluation and punishment evaluation separately. The proposed method uses the difference between the reward evaluation and the punishment evaluation as the factor that determines the action, and their sum as the parameter that determines the ratio of exploration to exploitation. In this paper we describe an experiment in which a mobile robot searches for a path and faces the resulting conflict between exploration and exploitation actions. The results of the experiment prove that the proposed method, which uses the two dimensions of reward and punishment, can generate a better path than the conventional reinforcement learning method.
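
The action-selection idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract only says that the difference drives the choice of action and the sum sets the exploration ratio, so everything else here is an assumption: the function name choose_action, the scale parameter, taking the sum over all actions in the current state, and the mapping epsilon = 1 / (1 + scale * sum), under which exploration becomes less likely as the sum grows.

```python
import random


def choose_action(reward_eval, punish_eval, rng=random, scale=1.0):
    """Return (action_index, epsilon) for one state.

    reward_eval[a] and punish_eval[a] are assumed to be non-negative
    forecasts of reward and punishment for action a.
    """
    # Difference: net preference, used to rank the actions.
    diff = [r - p for r, p in zip(reward_eval, punish_eval)]

    # Sum: total evaluation magnitude, used only to set the exploration rate.
    total = sum(r + p for r, p in zip(reward_eval, punish_eval))

    # Hypothetical mapping (not from the paper): explore less as the sum grows.
    epsilon = 1.0 / (1.0 + scale * total)

    if rng.random() < epsilon:
        # Explore: pick any action uniformly at random.
        return rng.randrange(len(diff)), epsilon

    # Exploit: pick the action with the largest reward-minus-punishment value.
    best = max(range(len(diff)), key=diff.__getitem__)
    return best, epsilon
```

Keeping the two forecasts separate lets a state with large reward and large punishment be told apart from an unvisited state with neither, even though both have a difference near zero. A single scalar value cannot make that distinction.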