Learning to make predictions in partially observable environments without a generative model

Authors:
Erik Talvitie; Satinder Singh
Affiliations:
Mathematics and Computer Science, Franklin and Marshall College, Lancaster, PA; Computer Science and Engineering, University of Michigan, Ann Arbor, MI
Venue:
Journal of Artificial Intelligence Research
Year:
2011

Abstract

When faced with the problem of learning a model of a high-dimensional environment, a common approach is to limit the model to make only a restricted set of predictions, thereby simplifying the learning problem. These partial models may be directly useful for making decisions or may be combined to form a more complete, structured model. However, in partially observable (non-Markov) environments, standard model-learning methods learn generative models (such as POMDPs), i.e., models that provide a probability distribution over all possible futures. It is not straightforward to restrict such models to make only certain predictions, and doing so does not always simplify the learning problem. In this paper we present prediction profile models: non-generative partial models for partially observable systems that make only a given set of predictions and are therefore, in some cases, far simpler than generative models. We formalize the problem of learning a prediction profile model as a transformation of the original model-learning problem, and show empirically that one can learn prediction profile models that make a small set of important predictions even in systems that are too complex for standard generative models.