Combining Exploitation-Based and Exploration-Based Approach in Reinforcement Learning

  • Authors:
  • Kazunori Iwata, Nobuhiro Ito, Koichiro Yamauchi, Naohiro Ishii

  • Venue:
  • IDEAL '00 Proceedings of the Second International Conference on Intelligent Data Engineering and Automated Learning, Data Mining, Financial Engineering, and Intelligent Agents
  • Year:
  • 2000

Abstract

Watkins' Q-learning is one of the most popular and effective model-free methods. However, compared with model-based approaches, Q-learning with various exploration strategies requires a large number of trial-and-error interactions to find an optimal policy. To overcome this drawback, we propose a new model-based learning method that extends Q-learning. The method maintains separate EI and ER functions for learning an exploitation-based model and an exploration-based model, respectively. The EI function, based on statistics, indicates the best action. The ER function, based on exploration information, leads the learner toward poorly known regions of the global state space by performing a backup at each step. We also introduce a new criterion that serves as the measure of exploration information. By combining these functions, we can effectively pursue both the exploitation and exploration strategies and select an action that takes both strategies into account simultaneously.
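
To make the combined action-selection idea concrete, the following Python sketch keeps two tabular value functions: an exploitation table (EI) updated with a standard Q-learning backup, and an exploration table (ER) backed up at every step, with actions chosen to maximize their weighted sum. The visit-count novelty bonus, the weight `beta`, and the class name `CombinedQLearner` are illustrative assumptions; the abstract does not specify the authors' actual exploration criterion or update rules.

```python
from collections import defaultdict

class CombinedQLearner:
    """Tabular learner with separate exploitation (EI) and exploration (ER) values.

    Illustrative sketch only: the concrete updates below are assumptions, not the
    paper's exact formulation.
    """

    def __init__(self, actions, alpha=0.1, gamma=0.95, beta=1.0):
        self.actions = actions
        self.alpha = alpha             # learning rate for the EI (Q-like) backup
        self.gamma = gamma             # discount factor
        self.beta = beta               # weight on the exploration term (assumed)
        self.ei = defaultdict(float)   # exploitation value per (state, action)
        self.er = defaultdict(float)   # exploration value per (state, action)
        self.visits = defaultdict(int)

    def select_action(self, state):
        # Pick the action maximizing the combined exploitation/exploration score,
        # so both strategies are considered simultaneously.
        return max(self.actions,
                   key=lambda a: self.ei[(state, a)] + self.beta * self.er[(state, a)])

    def update(self, state, action, reward, next_state):
        self.visits[(state, action)] += 1

        # EI: ordinary Q-learning backup toward reward plus discounted best next value.
        best_next_ei = max(self.ei[(next_state, a)] for a in self.actions)
        td_error = reward + self.gamma * best_next_ei - self.ei[(state, action)]
        self.ei[(state, action)] += self.alpha * td_error

        # ER: a visit-count novelty bonus backed up at every step, so exploration
        # value propagates toward poorly known regions (assumed criterion).
        novelty = 1.0 / self.visits[(state, action)]
        best_next_er = max(self.er[(next_state, a)] for a in self.actions)
        self.er[(state, action)] = novelty + self.gamma * best_next_er
```

In this sketch, larger values of `beta` push the learner toward unexplored regions early on; as visit counts grow, the novelty bonus shrinks and the EI term comes to dominate action selection.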