A study of Q-learning considering negative rewards

  • Authors:
  • Takayasu Fuchida, Kathy Thi Aung, Atsushi Sakuragi

  • Affiliation (all authors):
  • Graduate School of Science and Engineering, Kagoshima University, Kagoshima, Japan 890-0065

  • Venue:
  • Artificial Life and Robotics
  • Year:
  • 2010

Abstract

In a reinforcement learning system, the agent obtains a positive reward, such as 1, when it achieves its goal. The positive reward propagates through the states around the goal, and the agent gradually learns to reach it. If we want the agent to avoid certain situations, such as dangerous places or poison, we can give it a negative reward. In conventional Q-learning, however, a negative reward does not propagate beyond a single state. In this article, we propose a new way to propagate negative rewards, a very simple and efficient technique for Q-learning. Finally, we present computer simulations that demonstrate the effectiveness of the proposed method.
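The limitation the abstract describes can be reproduced with a minimal sketch of standard tabular Q-learning (not the paper's proposed method). In the chain world below, the environment, reward values, and parameters are illustrative assumptions: because the one-step backup takes the maximum over actions, and the Q-values of actions leading toward the goal are nonnegative, the -1 penalty shows up only in the Q-value of the action that steps directly into the penalized state, while the +1 goal reward propagates through every state.

```python
import random

# Illustrative chain world (not from the paper): states 0..5,
# state 0 is "poison" (reward -1), state 5 is the goal (reward +1).
N = 6
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.2
ACTIONS = (-1, +1)  # move left or right

Q = {(s, a): 0.0 for s in range(N) for a in ACTIONS}

def step(s, a):
    """Return (next_state, reward, done) for the chain world."""
    s2 = min(max(s + a, 0), N - 1)
    if s2 == 0:
        return s2, -1.0, True   # negative reward: poison state
    if s2 == N - 1:
        return s2, +1.0, True   # positive reward: goal state
    return s2, 0.0, False

random.seed(0)
for _ in range(2000):
    s = random.randint(1, N - 2)  # start in a non-terminal state
    done = False
    while not done:
        # epsilon-greedy action selection
        if random.random() < EPS:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda x: Q[(s, x)])
        s2, r, done = step(s, a)
        # standard Q-learning backup: max over next-state actions
        target = r if done else r + GAMMA * max(Q[(s2, b)] for b in ACTIONS)
        Q[(s, a)] += ALPHA * (target - Q[(s, a)])
        s = s2

for s in range(1, N - 1):
    print(s, round(Q[(s, -1)], 3), round(Q[(s, +1)], 3))
```

After training, Q[(1, -1)] is close to -1, but Q[(2, -1)] is positive: the max backup at state 1 picks the goal-directed action, so the penalty never reaches state 2. Addressing this asymmetry is the motivation for the propagation scheme the paper proposes.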