A state-cluster based Q-learning

  • Authors:
  • Zhao Jin; WeiYi Liu; Jian Jin

  • Affiliations:
  • School of Information Science and Engineering, Yunnan University, Kunming, P.R. China; School of Information Science and Engineering, Yunnan University, Kunming, P.R. China; Hongta Group Tobacco Limited Corporation, Yuxi, P.R. China

  • Venue:
  • ICNC'09 Proceedings of the 5th international conference on Natural computation
  • Year:
  • 2009

Abstract

When Q-learning is applied to complex real-world problems, the learning process can be too long for the method to be practical. The major cause is that Q-learning requires the agent to visit every state-action transition infinitely often for the Q values to converge. We propose a State-Cluster based Q-learning method to accelerate convergence and shorten the learning process. This method builds a State-Cluster for each state the agent reaches, according to the state trajectory the agent has wandered. Under our algorithm, the State-Cluster of a state holds the acyclic shortest state paths from other states to that state. When a state's Q value is refined in one step of the agent, the refined value is immediately propagated back to all the states in its State-Cluster along the state paths between them, instead of requiring the agent to visit those states again. With the State-Cluster, more Q values can be refined in a single step of the agent, which speeds up the convergence of the Q values. Experiments comparing against standard Q-learning demonstrate that this method is substantially more effective. The method is aimed at Q-learning, but it is also applicable to most other reinforcement learning methods based on value-function iteration.
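The core idea of the abstract can be sketched in code. The paper's exact State-Cluster construction is not reproduced here; the sketch below keeps a hypothetical per-state predecessor map as a stand-in for the acyclic shortest state paths, and after each one-step backup it propagates the refined Q value backward through the recorded predecessors, so earlier states are updated without the agent revisiting them. All class and variable names are illustrative, not from the paper.

```python
import random
from collections import defaultdict

ALPHA, GAMMA = 0.5, 0.9  # learning rate and discount factor (assumed values)

class StateClusterQ:
    """Sketch of State-Cluster based Q-learning: a standard one-step
    backup, followed by backward propagation of the refined Q value
    along recorded acyclic state paths (a simplified stand-in for the
    paper's State-Cluster structure)."""

    def __init__(self, actions):
        self.actions = actions
        self.Q = defaultdict(float)     # (state, action) -> value
        self.preds = defaultdict(dict)  # state -> {predecessor: (action, reward)}

    def _best(self, s):
        # greedy value of state s under the current Q table
        return max(self.Q[(s, a)] for a in self.actions)

    def update(self, s, a, r, s2):
        # standard one-step Q-learning backup for the visited transition
        self.Q[(s, a)] += ALPHA * (r + GAMMA * self._best(s2) - self.Q[(s, a)])
        # remember s as a predecessor of s2 (first entry kept, so paths stay acyclic)
        if s != s2 and s not in self.preds[s2]:
            self.preds[s2][s] = (a, r)
        # propagate the refined value of s2 back through its recorded predecessors
        self._propagate(s2, visited={s2})

    def _propagate(self, state, visited):
        for p, (pa, pr) in self.preds[state].items():
            if p in visited:
                continue  # skip cycles so propagation stays acyclic
            target = pr + GAMMA * self._best(state)
            self.Q[(p, pa)] += ALPHA * (target - self.Q[(p, pa)])
            self._propagate(p, visited | {p})

# usage: a 4-state chain 0 -> 1 -> 2 -> 3, reward 1.0 on reaching state 3;
# with propagation, Q(0, 'right') becomes nonzero after the very first episode
agent = StateClusterQ(actions=["right"])
for episode in range(3):
    for s in range(3):
        r = 1.0 if s + 1 == 3 else 0.0
        agent.update(s, "right", r, s + 1)
```

In plain Q-learning the reward at state 3 would need one episode per chain position to reach state 0; here a single episode already pushes it all the way back, which is the convergence speedup the abstract describes.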