Hierarchical reinforcement learning (HRL) has found a wide range of applications in recent years. Developing mechanisms for the autonomous acquisition of skills has been a central research topic in this area. Although many methods have been proposed toward this goal, few succeed both in learning performance and in the time complexity of the algorithm. In this paper, a linear-time algorithm is proposed that finds subgoal states of the environment in the early episodes of learning. Making subgoals available in the early phases of a learning task allows skills to be built that dramatically increase the convergence rate of the learning process.
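The abstract does not detail the proposed algorithm, but the idea of discovering subgoals from early episodes can be illustrated with a simple visit-frequency (bottleneck) heuristic. The sketch below is not the paper's method; it is a minimal stand-in that, like the proposed algorithm, runs in time linear in the total length of the observed trajectories. All names (`candidate_subgoals`, the gridworld states) are hypothetical.

```python
from collections import Counter

def candidate_subgoals(trajectories, threshold=0.5, terminal=()):
    """Flag states appearing in at least `threshold` of the trajectories
    as candidate subgoals (a simple bottleneck heuristic, NOT the
    algorithm proposed in the paper). Runs in time linear in the total
    length of the trajectories."""
    counts = Counter()
    for traj in trajectories:
        # Count each state at most once per trajectory.
        for s in set(traj):
            counts[s] += 1
    n = len(trajectories)
    return {s for s, c in counts.items()
            if c / n >= threshold and s not in terminal}

# Example: three early-episode trajectories in a two-room gridworld,
# where the state "door" is the bottleneck between the rooms.
trajs = [
    ["a1", "a2", "door", "b1", "goal"],
    ["a3", "a2", "door", "b2", "goal"],
    ["a1", "door", "b1", "b2", "goal"],
]
print(sorted(candidate_subgoals(trajs, threshold=1.0, terminal={"goal"})))
# → ['door']
```

A subgoal found this way would then seed a skill (an option whose termination condition is reaching the subgoal), which is what accelerates convergence in HRL.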