Divide and conquer is an effective strategy for solving large, complex problems. We propose an approach that enables an agent to autonomously discover subgoals for task decomposition, thereby accelerating reinforcement learning. We remove the state loops from the state trajectories to obtain each state's shortest distance from the goal state; the states in the resulting acyclic trajectories are then arranged into layers according to that distance. Reaching these state layers, at successively smaller distances from the goal, can thus serve as a sequence of subgoals on the agent's way to eventually reaching the goal state. Compared with other methods, autonomy and robustness are the major advantages of our approach. Experiments on the Grid-World problem show the applicability, effectiveness, and robustness of our approach.
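The loop-removal and layering procedure described in the abstract can be sketched as follows. This is a minimal illustration under assumed conventions (trajectories as lists of hashable states ending at the goal; function names are invented for the sketch), not the authors' implementation.

```python
from collections import defaultdict

def remove_loops(trajectory):
    """Remove state loops: whenever a state reappears, cut out the
    cycle between its two occurrences, leaving an acyclic trajectory."""
    acyclic = []
    seen = {}  # state -> index of its occurrence in `acyclic`
    for s in trajectory:
        if s in seen:
            # discard everything after the first occurrence of s
            acyclic = acyclic[:seen[s] + 1]
            seen = {st: i for i, st in enumerate(acyclic)}
        else:
            seen[s] = len(acyclic)
            acyclic.append(s)
    return acyclic

def layer_states(trajectories):
    """Assign each visited state its shortest observed distance to the
    goal (the last state of each trajectory), then group states into
    layers by that distance; each layer can serve as a subgoal."""
    dist = {}
    for traj in trajectories:
        acyclic = remove_loops(traj)
        n = len(acyclic)
        for i, s in enumerate(acyclic):
            d = n - 1 - i  # steps remaining to the goal
            dist[s] = min(dist.get(s, d), d)
    layers = defaultdict(set)
    for s, d in dist.items():
        layers[d].add(s)
    return dict(layers)
```

For example, the trajectory `['A', 'B', 'C', 'B', 'D', 'G']` contains the loop `B -> C -> B`; after loop removal it becomes `['A', 'B', 'D', 'G']`, placing `G` in layer 0, `D` in layer 1, `B` in layer 2, and `A` in layer 3.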