Learning to Perceive and Act by Trial and Error
Machine Learning
Technical Note: Q-Learning
Machine Learning
Introduction to Reinforcement Learning
Temporal credit assignment in reinforcement learning
Reinforcement learning: a survey
Journal of Artificial Intelligence Research
Watkins' Q-learning is the most popular and an effective model-free reinforcement learning method. However, compared with model-based approaches, Q-learning with its various exploration strategies requires a large number of trial-and-error interactions to find an optimal policy. To overcome this drawback, we propose a new model-based learning method that extends Q-learning. The method maintains two separate functions, EI and ER, for learning an exploitation-based model and an exploration-based model, respectively. The EI function, based on statistics, indicates the best action. The ER function, based on the information of exploration, guides the learner toward poorly known regions of the global state space by performing a backup at each step. We also introduce a new criterion that quantifies this information of exploration. By combining the two functions, we can pursue the exploitation and exploration strategies effectively and select actions that take both strategies into account simultaneously.
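The dual-function idea in the abstract can be sketched in tabular form. This is a minimal illustration, not the authors' exact algorithm: the names `EI`, `ER`, and `BETA` are assumptions, and a count-based novelty bonus stands in for the paper's unspecified exploration criterion. Both tables are backed up with the same Q-learning-style rule, and action selection combines them so that exploitation and exploration are weighed simultaneously.

```python
from collections import defaultdict

# Hyperparameters (illustrative values): learning rate, discount,
# and the weight that trades exploration off against exploitation.
ALPHA, GAMMA, BETA = 0.5, 0.9, 1.0

EI = defaultdict(float)    # exploitation values, updated like standard Q-learning
ER = defaultdict(float)    # exploration values, backed up by the same rule
visits = defaultdict(int)  # visit counts used as the "information of exploration"

def select_action(state, actions):
    # Combine both functions so a single action reflects both strategies.
    return max(actions, key=lambda a: EI[(state, a)] + BETA * ER[(state, a)])

def update(state, action, reward, next_state, actions):
    visits[(state, action)] += 1
    # Exploitation backup: ordinary Q-learning target on the real reward.
    best_next = max(EI[(next_state, a)] for a in actions)
    EI[(state, action)] += ALPHA * (reward + GAMMA * best_next - EI[(state, action)])
    # Exploration backup: a count-based bonus (an assumption standing in for
    # the paper's criterion) is propagated step by step, so the ER values
    # point the learner toward poorly known regions of the state space.
    bonus = 1.0 / (1 + visits[(state, action)])
    best_next_er = max(ER[(next_state, a)] for a in actions)
    ER[(state, action)] += ALPHA * (bonus + GAMMA * best_next_er - ER[(state, action)])
```

Backing up ER at every step, rather than only adding a bonus at selection time, is what lets exploration information flow backward through the state space the same way value estimates do.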