Predictive state representations with options

Authors: Britton Wolfe; Satinder Singh
Affiliations: University of Michigan, Ann Arbor, MI; University of Michigan, Ann Arbor, MI
Venue: ICML '06 Proceedings of the 23rd international conference on Machine learning
Year: 2006

Abstract

Recent work on predictive state representation (PSR) models has focused on using predictions of the outcomes of open-loop action sequences as state. These predictions answer questions of the form "What is the probability of seeing observation sequence o1, o2, ..., oN if the agent takes action sequence a1, a2, ..., aN from some given history?" We would like to ask more expressive questions in our representation of state, such as "If I behave according to some policy until I terminate, what will be my last observation?" We extend the linear PSR framework to answer questions like these about options -- temporally extended, closed-loop courses of action -- bounding the size of the linear PSR needed to model questions about a certain class of options. We introduce a hierarchical PSR (HPSR) that can make predictions about both options and primitive action sequences and show empirical results from learning HPSRs in simple domains.
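To make the two kinds of questions in the abstract concrete, the sketch below computes their answers by brute force in a small, hypothetical environment. It is not the paper's linear PSR or HPSR: the four-state corridor, its slip probability, and the names `open_loop_test` and `option_last_observation` are all assumptions made for illustration. It only shows which quantities such a model would be asked to predict: the probability of an observation sequence under an open-loop action sequence, and the distribution of the last observation when a closed-loop option runs until it terminates.

```python
# Hypothetical 4-state corridor; no name or number here comes from the paper.
# Each state emits one fixed observation, so the two middle states look alike.
OBS = {0: "wall_left", 1: "hall", 2: "hall", 3: "wall_right"}
SUCCESS = 0.8  # assumed chance that a move goes in the intended direction


def step_distribution(state, action):
    """Return {next_state: probability} for a primitive action."""
    intended = 1 if action == "right" else -1
    dist = {}
    for delta, prob in ((intended, SUCCESS), (-intended, 1.0 - SUCCESS)):
        nxt = min(max(state + delta, 0), 3)  # the walls clamp movement
        dist[nxt] = dist.get(nxt, 0.0) + prob
    return dist


def open_loop_test(belief, actions, observations):
    """P(o1..oN | a1..aN) starting from a belief {state: probability}."""
    weights = dict(belief)
    for action, obs in zip(actions, observations):
        nxt_weights = {}
        for s, w in weights.items():
            for s2, p in step_distribution(s, action).items():
                if OBS[s2] == obs:  # keep only mass consistent with obs
                    nxt_weights[s2] = nxt_weights.get(s2, 0.0) + w * p
        weights = nxt_weights
    return sum(weights.values())


def option_last_observation(belief, policy, terminates, max_steps=200):
    """Distribution over the observation seen when the option terminates.

    `policy` maps the latest observation to an action (closed loop) and
    `terminates` says whether an observation ends the option. Simplifying
    assumption: the agent has already observed OBS[s] in its start state.
    Any mass still running after `max_steps` is dropped.
    """
    weights = dict(belief)
    finished = {}
    for _ in range(max_steps):
        nxt_weights = {}
        for s, w in weights.items():
            action = policy(OBS[s])
            for s2, p in step_distribution(s, action).items():
                obs = OBS[s2]
                if terminates(obs):
                    finished[obs] = finished.get(obs, 0.0) + w * p
                else:
                    nxt_weights[s2] = nxt_weights.get(s2, 0.0) + w * p
        weights = nxt_weights
        if not weights:
            break
    return finished


start = {1: 1.0}  # the agent is known to be in the left hall cell
p_test = open_loop_test(start, ["right", "right"], ["hall", "wall_right"])
last_obs = option_last_observation(
    start,
    policy=lambda obs: "right",            # always head right
    terminates=lambda obs: obs != "hall",  # stop on reaching either wall
)
```

In this toy corridor the open-loop question has answer 0.8 × 0.8 = 0.64, while the option question yields roughly 16/21 for `wall_right` and 5/21 for `wall_left`. The second quantity sums over runs of every length, which is the kind of prediction a fixed-length open-loop test cannot express directly.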