Finding hidden hierarchy in reinforcement learning

  • Authors:
  • Geoff Poulton, Ying Guo, Wen Lu

  • Affiliations:
  • Geoff Poulton and Ying Guo: Autonomous Systems, Information and Communication Technology Centre, CSIRO, Epping, Australia; Wen Lu: University of NSW, Australia

  • Venue:
  • KES'05: Proceedings of the 9th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, Part III
  • Year:
  • 2005

Abstract

HEXQ is a reinforcement learning algorithm that decomposes a problem into subtasks and constructs a hierarchy from the state variables. The maximum depth of the hierarchy is limited by the number of variables representing a state. In HEXQ, the values learned for a subtask can be reused in different contexts only if the subtasks are identical; non-identical subtasks must be trained separately. This paper introduces a method that addresses both restrictions. Experimental results show that the method reduces training time dramatically.
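
The abstract does not spell out HEXQ's internals, but the kind of value reuse it describes can be sketched. The Python below is a purely illustrative, hypothetical example, not the authors' method or code: a small "reach the exit" subtask is solved once with tabular Q-learning, and the resulting value table is then shared by every identical room of a larger task instead of being retrained per room. The toy chain MDP, the function train_subtask_q, and all parameter values are assumptions made for illustration.

    import random

    def train_subtask_q(n_states, exit_state, episodes=500,
                        alpha=0.1, gamma=0.9, eps=0.1):
        # Tabular Q-learning for one subtask: reach exit_state in a
        # small chain MDP. Actions: 0 = step left, 1 = step right.
        # Reward is -1 per step, 0 on reaching the exit.
        q = [[0.0, 0.0] for _ in range(n_states)]
        for _ in range(episodes):
            s = random.randrange(n_states)
            while s != exit_state:
                if random.random() < eps:
                    a = random.randrange(2)                 # explore
                else:
                    a = 0 if q[s][0] >= q[s][1] else 1      # exploit
                s2 = max(0, s - 1) if a == 0 else min(n_states - 1, s + 1)
                r = 0.0 if s2 == exit_state else -1.0
                q[s][a] += alpha * (r + gamma * max(q[s2]) - q[s][a])
                s = s2
        return q

    # Train the "walk to the right-hand exit of a 5-cell room" subtask once...
    room_q = train_subtask_q(n_states=5, exit_state=4)

    # ...then share the same value table across every identical room of a
    # larger task rather than relearning it per room. This is the reuse HEXQ
    # permits only when the subtasks are identical; relaxing that identity
    # requirement is the restriction the paper targets.
    subtask_values = {room_id: room_q for room_id in range(10)}

In HEXQ terms, each room is a different context for the same subtask; the dictionary shares one learned table rather than storing a separately trained copy per room, which is where the training-time saving comes from.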