Modeling dynamical systems is a common problem in science and engineering. Once a system has been modeled, its behavior can be predicted and controlled. Predictive state representations (PSRs) are a recently proposed method for modeling controlled dynamical systems. A central problem in the PSR literature is the discovery and learning of PSRs. This paper presents a new algorithm for discovering and learning PSRs that uses only a continuous trace of actions and observations as training data. The algorithm identifies the history at each time step of the training data, from which the predictions of tests at each history, and hence the PSR model of the system, can be obtained. We empirically evaluate our algorithm on a standard set of POMDP test problems and compare it with the suffix-history algorithm; the empirical results show that our algorithm is competitive and outperforms the suffix-history algorithm.
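To make "the prediction of a test at a history" concrete, the following is a minimal sketch of a generic counting estimate computed from one continuous trace. It is not the algorithm proposed in the paper: it matches histories as plain suffixes of the trace, which is only an illustrative assumption, and the names `Step` and `estimate_prediction` are hypothetical.

```python
from typing import Optional, Sequence, Tuple

# One time step of the trace: (action, observation). Hypothetical type alias.
Step = Tuple[str, str]


def estimate_prediction(
    trace: Sequence[Step],
    history: Sequence[Step],
    test: Sequence[Step],
) -> Optional[float]:
    """Estimate p(test | history) by counting over a single trace.

    Assumption (not from the paper): a time step is treated as being
    "at" the given history when the most recent steps of the trace
    equal that history exactly (suffix matching).
    """
    h, t = len(history), len(test)
    history, test = list(history), list(test)
    test_actions = [a for a, _ in test]
    attempts = 0   # history matched and the test's actions were then executed
    successes = 0  # ... and the test's observations were also seen
    for i in range(h, len(trace) - t + 1):
        if list(trace[i - h:i]) != history:
            continue
        window = list(trace[i:i + t])
        if [a for a, _ in window] != test_actions:
            continue
        attempts += 1
        if window == test:
            successes += 1
    # No usable data for this (history, test) pair: return None, not a guess.
    return successes / attempts if attempts else None


trace = [("a", "1"), ("a", "0"), ("a", "1"), ("a", "1"), ("a", "1"), ("a", "0")]
p = estimate_prediction(trace, history=[("a", "1")], test=[("a", "0")])
```

In this toy trace the history `a1` is followed by action `a` four times, and observation `0` follows in two of them, so the estimate `p` is 0.5. Returning `None` when the pair never occurs keeps missing data distinguishable from an estimated probability of zero.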