Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning

  • Authors:
  • Ishai Menache;Shie Mannor;Nahum Shimkin

  • Affiliations:
  • -;-;-

  • Venue:
  • ECML '02 Proceedings of the 13th European Conference on Machine Learning
  • Year:
  • 2002

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present the Q-Cut algorithm, a graph theoretic approach for automatic detection of sub-goals in a dynamic environment, which is used for acceleration of the Q-Learning algorithm. The learning agent creates an on-line map of the process history, and uses an efficient Max-Flow/Min-Cut algorithm for identifying bottlenecks. The policies for reaching bottlenecks are separately learned and added to the model in a form of options (macro-actions). We then extend the basic Q-Cut algorithm to the Segmented Q-Cut algorithm, which uses previously identified bottlenecks for state space partitioning, necessary for finding additional bottlenecks in complex environments. Experiments showsign ificant performance improvements, particulary in the initial learning phase.