Approximate Q-Learning: An Introduction

  • Authors:
  • Deepshikha Pandey; Punit Pandey

  • Affiliations:
  • -;-

  • Venue:
  • ICMLC '10 Proceedings of the 2010 Second International Conference on Machine Learning and Computing
  • Year:
  • 2010

Abstract

This paper introduces an approach to the Q-learning algorithm based on rough set theory, introduced by Zdzislaw Pawlak in 1981. During Q-learning, an agent selects actions in an effort to maximize a reward signal obtained from the environment. Based on the reward, the agent adjusts its policy for future actions. The problem considered in this paper is the overestimation of the expected value of cumulative future discounted rewards. This discounted reward is used to evaluate the agent's actions and policy during reinforcement learning; because it is overestimated, action evaluation and policy changes are inaccurate. The solution to this problem is a form of Q-learning that combines approximation spaces with Q-learning to estimate the expected value of returns on actions. This is made possible by considering the behavior patterns of an agent within the scope of approximation spaces. The framework provided by an approximation space makes it possible to measure the degree to which agent behaviors are a part of ("covered by") a set of accepted agent behaviors that serve as a behavior evaluation norm.
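For readers unfamiliar with the update being modified, the Python sketch below shows a standard tabular Q-learning step and, purely as an illustration, one place where a rough-coverage measure could temper the bootstrapped return. The coverage() helper, parameter values, and data structures are assumptions made for illustration, not the authors' implementation.

```python
import random
from collections import defaultdict

# Standard tabular Q-learning update; the paper's rough-set variant replaces
# the raw bootstrapped return with an estimate tempered by how well the
# agent's recent behavior is "covered by" a set of accepted behaviors.
# Everything below (state/action types, coverage(), parameter values) is
# illustrative only.

ALPHA, GAMMA = 0.1, 0.9          # learning rate and discount factor (assumed values)
Q = defaultdict(float)           # Q[(state, action)] -> estimated return

def coverage(behavior_pattern, accepted_behaviors):
    """Hypothetical rough-coverage measure in [0, 1]: the fraction of the
    observed behavior pattern that falls inside the accepted-behavior set."""
    if not behavior_pattern:
        return 0.0
    return sum(b in accepted_behaviors for b in behavior_pattern) / len(behavior_pattern)

def q_update(state, action, reward, next_state, actions,
             behavior_pattern, accepted_behaviors):
    # Bootstrapped estimate of the return; the plain max() is the source of
    # the overestimation the paper targets.
    best_next = max(Q[(next_state, a)] for a in actions)
    target = reward + GAMMA * best_next
    # Temper the target by the degree to which recent behavior is covered by
    # the evaluation norm (a schematic stand-in for the approximation-space step).
    target *= coverage(behavior_pattern, accepted_behaviors)
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
```

In this sketch the coverage factor simply scales the bootstrapped target, which is one schematic way a behavior-evaluation norm could rein in inflated return estimates; the paper's actual construction over approximation spaces may differ.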