Learning predictive state representations in dynamical systems without reset

Authors:
Britton Wolfe; Michael R. James; Satinder Singh
Affiliations:
University of Michigan, Ann Arbor, MI; University of Michigan, Ann Arbor, MI; University of Michigan, Ann Arbor, MI
Venue:
ICML '05: Proceedings of the 22nd International Conference on Machine Learning
Year:
2005

Abstract

Predictive state representations (PSRs) are a recently developed way to model discrete-time, controlled dynamical systems. We present two algorithms for learning a PSR model: a Monte Carlo algorithm and a temporal-difference (TD) algorithm. Both algorithms can learn models for systems without a reset action, which the previously available general PSR-model learning algorithm required. We present empirical results that compare our two algorithms with each other and with existing algorithms, including an EM algorithm for learning POMDP models.
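
To make concrete what a PSR model represents, the following minimal sketch shows the standard state update of a linear PSR, in which the state is a vector of predictions for a set of core tests. It does not implement the Monte Carlo or TD learning algorithms from the paper. The toy system, the choice of core tests, the weight values, and the names `psr_update`, `track`, and `PARAMS` are all assumptions made for illustration; a learning algorithm would have to estimate such weights from data, whereas here they are written by hand.

```python
# Illustrative sketch only: a linear PSR state update on a hypothetical toy system.
# The system, the core tests, and every weight below are assumptions for this
# example; none of them are taken from the paper.


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))


def psr_update(state, m_ao, M_ao):
    """Update the prediction vector after taking action a and observing o.

    state : predictions p(q_i | h) for each core test q_i given history h
    m_ao  : weights such that p(ao | h) = state . m_ao
    M_ao  : one weight vector per core test, p(a o q_i | h) = state . M_ao[i]
    """
    prob_ao = dot(state, m_ao)
    if prob_ao == 0:
        raise ValueError("observation has zero probability under the model")
    # p(q_i | h a o) = p(a o q_i | h) / p(a o | h)
    return [dot(state, w) / prob_ao for w in M_ao]


# Toy system (assumed): a single action "a"; the observation alternates
# deterministically between 1 and 0. Core tests: q1 = "a1", q2 = "a0".
# Keys are observations; "m" and "M" are the hand-written weights.
PARAMS = {
    1: {"m": [1.0, 0.0], "M": [[0.0, 0.0], [1.0, 0.0]]},
    0: {"m": [0.0, 1.0], "M": [[0.0, 1.0], [0.0, 0.0]]},
}


def track(state, observations):
    """Apply the update once per observation and return the final state."""
    for o in observations:
        state = psr_update(state, PARAMS[o]["m"], PARAMS[o]["M"])
    return state


# Start in the history where the next observation is certain to be 1.
final_state = track([1.0, 0.0], [1, 0, 1])
```

Note that no reset is involved in tracking: the state is carried forward along a single trajectory, one action-observation pair at a time.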