A survey of algorithmic methods for partially observed Markov decision processes
Annals of Operations Research
Diversity-based inference of finite automata
Journal of the ACM (JACM)
Algorithms for Sequential Decision Making
Observable Operator Models for Discrete Stochastic Time Series
Neural Computation
Learning topological maps with weak local odometric information
IJCAI'97 Proceedings of the Fifteenth international joint conference on Artificial intelligence - Volume 2
Predictive state representations: a new theory for modeling dynamical systems
UAI '04 Proceedings of the 20th conference on Uncertainty in artificial intelligence
Learning predictive representations from a history
ICML '05 Proceedings of the 22nd international conference on Machine learning
Learning predictive state representations in dynamical systems without reset
ICML '05 Proceedings of the 22nd international conference on Machine learning
Learning predictive state representations using non-blind policies
ICML '06 Proceedings of the 23rd international conference on Machine learning
Predictive state representations with options
ICML '06 Proceedings of the 23rd international conference on Machine learning
Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
On-line discovery of temporal-difference networks
Proceedings of the 25th international conference on Machine learning
Approximate predictive state representations
Proceedings of the 7th international joint conference on Autonomous agents and multiagent systems - Volume 1
Proto-predictive representation of states with simple recurrent temporal-difference networks
ICML '09 Proceedings of the 26th Annual International Conference on Machine Learning
Learning partially observable deterministic action models
Journal of Artificial Intelligence Research
Relational knowledge with predictive state representations
IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Combining memory and landmarks with predictive state representations
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
Learning subjective representations for planning
IJCAI'05 Proceedings of the 19th international joint conference on Artificial intelligence
An investigation into mathematical programming for finite horizon decentralized POMDPs
Journal of Artificial Intelligence Research
A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems
Neural Processing Letters
Using learned PSR model for planning under uncertainty
AI'10 Proceedings of the 23rd Canadian conference on Advances in Artificial Intelligence
Learning to make predictions in partially observable environments without a generative model
Journal of Artificial Intelligence Research
The duality of state and observation in probabilistic transition systems
TbiLLC'11 Proceedings of the 9th international conference on Logic, Language, and Computation
Predictive state representations (PSRs) are a recently proposed way of modeling controlled dynamical systems. PSR-based models use predictions of the observable outcomes of tests that could be performed on the system as their state representation, and their model parameters define how that predictive state changes over time as actions are taken and observations are noted. Learning a PSR-based model requires solving two subproblems: 1) discovery of the tests whose predictions constitute state, and 2) learning the model parameters that define the dynamics. So far, no results have been available on the discovery subproblem, while for the learning subproblem an approximate-gradient algorithm has been proposed (Singh et al., 2003) with mixed results: it works on some domains and not on others. In this paper, we provide the first discovery algorithm and a new learning algorithm for linear PSRs for the special class of controlled dynamical systems that have a reset operation. We provide experimental verification of our algorithms. Finally, we distinguish our work from prior work by Jaeger (2000) on observable operator models (OOMs).
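
To make the idea of "predictions of tests as state" concrete, the following is a minimal sketch of a state update in a linear PSR. It is not the discovery or learning algorithm of this paper. The update rule is the usual linear PSR form from the PSR literature, assumed here rather than stated in the abstract: the prediction of each core test after taking action a and seeing observation o is p(a o q | h) / p(a o | h), with both quantities linear in the current prediction vector. The toy system (one action, two observations, an observation that reveals the hidden state), the weight values, and all names such as `M_ONE_STEP`, `M_EXT`, and `update` are hypothetical and chosen only for illustration.

```python
# Toy linear PSR (hypothetical example, not taken from the paper).
# One action "a", observations 0 and 1; the observation reveals the hidden state.
# Core tests: q1 = (a, 0) and q2 = (a, 1), so the state is [p(a0 | h), p(a1 | h)].

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))

# Weights m_ao with p(a o | h) = state . m_ao (assumed values).
M_ONE_STEP = {
    ("a", 0): [1.0, 0.0],
    ("a", 1): [0.0, 1.0],
}

# Weights for the extended tests: M_EXT[(a, o)][i] gives p(a o q_i | h) = state . weights.
# After observing 0 the system predicts [0.9, 0.1]; after observing 1 it predicts [0.5, 0.5].
M_EXT = {
    ("a", 0): [[0.9, 0.0], [0.1, 0.0]],
    ("a", 1): [[0.0, 0.5], [0.0, 0.5]],
}

def update(state, action, obs):
    """Return the new prediction vector after taking `action` and seeing `obs`."""
    key = (action, obs)
    denom = dot(state, M_ONE_STEP[key])  # p(a o | h)
    if denom == 0.0:
        raise ValueError("observation has zero probability under the model")
    # p(q_i | h a o) = p(a o q_i | h) / p(a o | h)
    return [dot(state, w) / denom for w in M_EXT[key]]

state = [0.7, 0.3]  # initial predictions for the two core tests (assumed)
for obs in (1, 0):
    state = update(state, "a", obs)
    print(obs, state)
```

In this sketch the state never refers to a hidden variable: it consists only of predictions about observable outcomes, and the weight vectors play the role of the model parameters that the learning subproblem would have to estimate. The choice of core tests corresponds to the discovery subproblem.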