A comparison of learning performance in two-dimensional Q-learning by the difference of Q-values alignment

  • Authors:
  • Kathy Thi Aung;Takayasu Fuchida

  • Affiliations:
  • Department of Information and Computer Science, Graduate School of Science and Engineering, Kagoshima University, Kagoshima, Japan 890-0065;Department of Information and Computer Science, Graduate School of Science and Engineering, Kagoshima University, Kagoshima, Japan 890-0065

  • Venue:
  • Artificial Life and Robotics
  • Year:
  • 2012

Abstract

In this article, we examine the learning performance of various strategies under different conditions using reward-based Voronoi Q-value elements (VQEs) in a single-agent environment, where the agent must decide how to act in a given state. To test our hypotheses, we performed computational experiments under several settings: VQEs arranged in a lattice structure and rotated by various angles, an agent with four actions whose action directions are rotated by various angles, and a random arrangement of VQEs. These settings were used to examine whether the optimal Q-values for state-action pairs are evaluated correctly when dealing with continuous-valued inputs. The results show that learning performance changes when the angle of the VQEs and the angle of the actions are changed to specific relative positions.
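
The setup described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the 5 x 5 lattice size, the unit spacing and step length, and the values of `alpha` and `gamma` are all assumptions chosen for illustration, and the update shown is the standard one-step Q-learning rule. The sketch places VQE sites on a rotated square lattice, maps a continuous two-dimensional position to its nearest site (the site whose Voronoi cell contains it), and rotates a set of four actions by a given angle.

```python
import math

def rotate(x, y, angle_deg):
    """Rotate the point (x, y) about the origin by angle_deg degrees."""
    a = math.radians(angle_deg)
    return (x * math.cos(a) - y * math.sin(a),
            x * math.sin(a) + y * math.cos(a))

def lattice_vqes(n, spacing, angle_deg):
    """Place n*n sites on a square lattice centred on the origin, then rotate it."""
    offset = (n - 1) * spacing / 2.0
    return [rotate(i * spacing - offset, j * spacing - offset, angle_deg)
            for i in range(n) for j in range(n)]

def action_vectors(angle_deg, step=1.0):
    """Four actions 90 degrees apart, rotated as a set by angle_deg."""
    return [rotate(step, 0.0, angle_deg + 90 * k) for k in range(4)]

def nearest_vqe(sites, pos):
    """Index of the nearest site, i.e. the Voronoi cell that contains pos."""
    return min(range(len(sites)),
               key=lambda i: (sites[i][0] - pos[0]) ** 2
                             + (sites[i][1] - pos[1]) ** 2)

def q_update(q, s, a, reward, s_next, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update on the table q[site][action]."""
    target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])

# Hypothetical configuration: unrotated lattice, actions rotated by 45 degrees.
sites = lattice_vqes(n=5, spacing=1.0, angle_deg=0.0)
actions = action_vectors(angle_deg=45.0)
q = [[0.0] * 4 for _ in sites]

# One illustrative transition from a continuous position.
pos = (0.1, -0.2)
s = nearest_vqe(sites, pos)
a = 0
next_pos = (pos[0] + actions[a][0], pos[1] + actions[a][1])
s_next = nearest_vqe(sites, next_pos)
q_update(q, s, a, reward=1.0, s_next=s_next)
```

Varying the `angle_deg` arguments of `lattice_vqes` and `action_vectors` independently changes the relative orientation of the VQE lattice and the action set, which is the quantity the abstract reports as affecting learning performance. A random arrangement could be sketched by replacing `lattice_vqes` with uniformly sampled site coordinates.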