Learning Options in Reinforcement Learning

  • Authors:
  • Martin Stolle; Doina Precup

  • Venue:
  • Proceedings of the 5th International Symposium on Abstraction, Reformulation and Approximation
  • Year:
  • 2002

Abstract

Temporally extended actions (e.g., macro actions) have proven very useful for speeding up learning, ensuring robustness and building prior knowledge into AI systems. The options framework (Precup, 2000; Sutton, Precup & Singh, 1999) provides a natural way of incorporating such actions into reinforcement learning systems, but leaves open the issue of how good options might be identified. In this paper, we empirically explore a simple approach to creating options. The underlying assumption is that the agent will be asked to perform different goal-achievement tasks in an environment that is otherwise the same over time. Our approach is based on the intuition that states that are frequently visited on system trajectories could prove to be useful subgoals (e.g., McGovern & Barto, 2001; Iba, 1989). We propose a greedy algorithm for identifying subgoals based on state visitation counts. We present empirical studies of this approach in two gridworld navigation tasks. One of the environments we explored contains bottleneck states, and the algorithm indeed finds these states, as expected. The second environment is an empty gridworld with no obstacles. Although the environment does not contain any obvious subgoals, our approach still finds useful options, which essentially allow the agent to explore the environment more quickly.
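
To make the idea of the abstract concrete, the following is a minimal sketch, not the paper's exact algorithm: it assumes a tabular gridworld whose logged trajectories are lists of state tuples, selects the most frequently visited states as candidate subgoals, and wraps each in an option-style record (initiation set, termination condition, and a placeholder for the inner policy, which in the paper would be learned, e.g., toward a pseudo-reward at the subgoal). All function and variable names here are illustrative.

```python
# Sketch only: frequency-based subgoal discovery and option construction,
# under the assumptions stated above (not the authors' published code).
from collections import Counter

def find_subgoals(trajectories, num_subgoals=2, exclude=()):
    """Greedily select the most frequently visited states as subgoals.

    trajectories: iterable of state sequences (e.g., lists of (row, col) cells)
    exclude: states never treated as subgoals (e.g., start or goal states)
    """
    counts = Counter(s for traj in trajectories for s in traj if s not in exclude)
    return [state for state, _ in counts.most_common(num_subgoals)]

def make_option(subgoal, initiation_set):
    """Return a minimal option record; the inner policy is a placeholder
    that would normally be learned separately for each subgoal."""
    return {
        "initiation_set": set(initiation_set),   # states where the option may be invoked
        "termination": lambda s: s == subgoal,   # terminate on reaching the subgoal
        "policy": None,                          # placeholder for the learned sub-policy
    }

# Usage with two hypothetical gridworld trajectories.
trajs = [
    [(0, 0), (0, 1), (1, 1), (2, 1), (2, 2)],
    [(0, 0), (1, 0), (1, 1), (2, 1), (2, 0)],
]
subgoals = find_subgoals(trajs, num_subgoals=1, exclude={(0, 0)})
options = [make_option(g, initiation_set={s for t in trajs for s in t}) for g in subgoals]
print(subgoals)  # e.g., [(1, 1)], the state shared by both trajectories
```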