Learning, planning, and representing knowledge in large state spaces at multiple levels of temporal abstraction are key, long-standing challenges for building flexible autonomous agents. The options framework provides a formal mechanism for specifying and learning temporally extended skills. Although past work has demonstrated the benefit of acting according to options in continuous state spaces, one of the central advantages of temporal abstraction, the ability to plan using a temporally abstract model, remains a challenging problem when the number of environment states is large or infinite. In this work, we develop a knowledge construct, the linear option, which is capable of modeling temporally abstract dynamics in continuous state spaces. We show that planning with a linear expectation model of an option's dynamics converges to a fixed point with low Temporal Difference (TD) error. Next, building on recent work on linear feature selection, we show conditions under which a linear feature set is sufficient for accurately representing the value function of an option policy. We extend this result to show conditions under which multiple options may be repeatedly composed to create new options with accurate linear models. Finally, we demonstrate linear option learning and planning algorithms in a simulated robot environment.
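The abstract's claim that planning with a linear expectation model converges to a low-TD-error fixed point can be illustrated with a minimal sketch. The names below (`F`, `b`, `theta`) are illustrative, not the paper's notation: we assume a feature map phi(s), a learned matrix `F` predicting the expected (discounted) feature vector at option termination, and a vector `b` whose inner product with phi(s) gives the expected cumulative reward during the option. Under a contraction assumption (spectral radius of `F` below 1), iterating the model-based Bellman update converges to a unique fixed point.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5  # number of features (illustrative)

# Assumed linear option model (learned elsewhere, e.g. by least squares):
#   F @ phi(s)  ~ expected discounted feature vector at option termination
#   b @ phi(s)  ~ expected cumulative reward while the option executes
F = rng.uniform(-1, 1, (n, n))
F *= 0.9 / np.max(np.abs(np.linalg.eigvals(F)))  # enforce spectral radius < 1
b = rng.uniform(-1, 1, n)

# Planning: iterate the model-based Bellman update theta <- b + F^T theta,
# where the approximate value function is v(s) = theta @ phi(s).
theta = np.zeros(n)
for _ in range(1000):
    theta = b + F.T @ theta

# With spectral radius < 1, the iteration converges to the unique fixed
# point theta* = (I - F^T)^{-1} b.
theta_star = np.linalg.solve(np.eye(n) - F.T, b)
assert np.allclose(theta, theta_star, atol=1e-6)
```

This is only a sketch of the planning step under stated assumptions; the paper additionally analyzes when a feature set suffices to represent an option policy's value function and when composed options retain accurate linear models.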