Models of dynamical systems based on predictive state representations (PSRs) use predictions of future observations as their representation of state. Their main departure from traditional models such as partially observable Markov decision processes (POMDPs) is that the PSR state is composed entirely of observable quantities. PSRs have recently been extended to a class of models called memory-PSRs (mPSRs), which use both memory of past observations and predictions of future observations in their state representation. mPSRs thus preserve the PSR property that the state is composed of observable quantities, while potentially revealing structure in the dynamical system that PSRs do not exploit. In this paper, we demonstrate that the structure captured by mPSRs can be exploited naturally for stochastic planning based on value-iteration algorithms. In particular, we adapt the incremental-pruning (IP) algorithm, defined for planning in POMDPs, to mPSRs. Our empirical results show that the modified IP on mPSRs outperforms, in most cases, IP on both PSRs and POMDPs.
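To make the idea of a state built from predictions concrete, the following is a minimal sketch of the standard linear-PSR state update, not the planning algorithm of this paper. The state is the vector of predictions p(Q|h) for a set of core tests Q, and after taking action a and seeing observation o, each entry becomes p(Q|h)·m_aoq / p(Q|h)·m_ao. The toy system, the matrix `T`, and the helper names `dot`, `psr_update`, and `params` are all assumptions introduced for illustration: there is a single action, two observations, and the next observation depends only on the last one, so the core tests can be taken to be the two one-step tests.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(u, v))


def psr_update(state, m_ao, m_aoq):
    """Return p(Q | h a o) given state = p(Q | h) for a linear PSR.

    m_ao  : weight vector with p(ao | h) = state . m_ao
    m_aoq : one weight vector per core test q, with p(aoq | h) = state . m_aoq[q]
    """
    denom = dot(state, m_ao)
    if denom <= 0.0:
        raise ValueError("observation has zero predicted probability")
    return [dot(state, m) / denom for m in m_aoq]


# Hypothetical toy system: T[j][i] is the probability that the next
# observation is i given that the last observation was j.
T = [[0.9, 0.1],
     [0.4, 0.6]]


def params(j):
    """Update parameters for observing j (single action) in the toy system."""
    e_j = [1.0 if k == j else 0.0 for k in range(2)]
    m_ao = e_j                                   # p(o_j | h) is the j-th prediction
    m_aoq = [[T[j][i] * x for x in e_j] for i in range(2)]
    return m_ao, m_aoq


state = [0.5, 0.5]            # assumed initial predictions for the two core tests
for obs in [1, 0]:
    state = psr_update(state, *params(obs))

print(state)                  # approximately [0.9, 0.1]
```

Every quantity in `state` is a probability of a future observation, so nothing hidden is stored. A memory-based variant in the spirit of mPSRs would additionally keep a memory of past observations next to the prediction vector; how the parameters depend on that memory is not specified here.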