Learning to solve problems by searching for macro-operators.
Reinforcement learning with hierarchies of machines. NIPS '97: Proceedings of the 1997 Conference on Advances in Neural Information Processing Systems 10.
Between MDPs and semi-MDPs: a framework for temporal abstraction in reinforcement learning. Artificial Intelligence.
Markov Decision Processes: Discrete Stochastic Dynamic Programming.
Learning Search Control Knowledge: An Explanation-Based Approach.
Introduction to Reinforcement Learning.
A Heuristic Approach to the Discovery of Macro-Operators. Machine Learning.
Chunking in Soar: The Anatomy of a General Learning Mechanism. Machine Learning.
Automatic Discovery of Subgoals in Reinforcement Learning using Diverse Density. ICML '01: Proceedings of the Eighteenth International Conference on Machine Learning.
The MAXQ Method for Hierarchical Reinforcement Learning. ICML '98: Proceedings of the Fifteenth International Conference on Machine Learning.
Hierarchical control and learning for Markov decision processes.
Temporal abstraction in reinforcement learning.
Autonomous discovery of temporal abstractions from interaction with an environment.
Human Problem Solving.
Temporally extended actions (e.g., macro actions) have proven very useful for speeding up learning, ensuring robustness, and building prior knowledge into AI systems. The options framework (Precup, 2000; Sutton, Precup & Singh, 1999) provides a natural way of incorporating such actions into reinforcement learning systems, but leaves open the issue of how good options might be identified. In this paper, we empirically explore a simple approach to creating options. The underlying assumption is that the agent will be asked to perform different goal-achievement tasks in an environment that is otherwise the same over time. Our approach is based on the intuition that states that are frequently visited on system trajectories could prove to be useful subgoals (e.g., McGovern & Barto, 2001; Iba, 1989). We propose a greedy algorithm for identifying subgoals based on state visitation counts. We present empirical studies of this approach in two gridworld navigation tasks. One of the environments we explored contains bottleneck states, and the algorithm indeed finds these states, as expected. The second environment is an empty gridworld with no obstacles. Although this environment does not contain any obvious subgoals, our approach still finds useful options, which essentially allow the agent to explore the environment more quickly.
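A minimal sketch, not taken from the paper, of the visitation-count idea the abstract describes: count how often each state appears across stored trajectories and greedily pick the most frequently visited states as candidate subgoals. The function names, the trajectory representation, and the exclusion of start and goal states are illustrative assumptions rather than the authors' algorithm as published.

```python
# Sketch: greedy subgoal identification from state visitation counts.
# Assumes each trajectory is a sequence of hashable states, e.g. (x, y) grid cells.
from collections import Counter


def count_visits(trajectories):
    """Count how often each state appears across a list of trajectories."""
    counts = Counter()
    for trajectory in trajectories:
        counts.update(trajectory)
    return counts


def pick_subgoals(trajectories, n_subgoals=2, exclude=()):
    """Greedily select the most frequently visited states as candidate subgoals,
    skipping excluded states (e.g., the start and goal of every trajectory)."""
    counts = count_visits(trajectories)
    ranked = [state for state, _ in counts.most_common() if state not in exclude]
    return ranked[:n_subgoals]


if __name__ == "__main__":
    # Toy example: three trajectories in a gridworld where (2, 3) acts
    # as a bottleneck that every path passes through.
    trajs = [
        [(0, 0), (1, 0), (2, 1), (2, 3), (4, 4)],
        [(0, 0), (0, 1), (1, 2), (2, 3), (3, 4), (4, 4)],
        [(0, 0), (1, 1), (2, 2), (2, 3), (4, 3), (4, 4)],
    ]
    print(pick_subgoals(trajs, n_subgoals=1, exclude={(0, 0), (4, 4)}))
    # -> [(2, 3)]
```

In the options framework, each selected state would then serve as the terminal subgoal of a new option, whose internal policy can be learned with any standard reinforcement learning method that treats reaching the subgoal as the reward.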