This paper introduces a novel spectral framework for solving Markov decision processes (MDPs) by jointly learning representations and optimal policies. The major components of the framework are: (i) a general scheme for constructing representations, or basis functions, by diagonalizing symmetric diffusion operators; (ii) a specific instantiation of this scheme in which global basis functions, called proto-value functions (PVFs), are formed from the eigenvectors of the graph Laplacian on an undirected graph built from state transitions induced by the MDP; (iii) a three-phase procedure called representation policy iteration (RPI), comprising a sample-collection phase, a representation-learning phase that constructs basis functions from the samples, and a final parameter-estimation phase that determines an (approximately) optimal policy within the linear subspace spanned by the current basis functions; (iv) a specific instantiation of RPI that uses least-squares policy iteration (LSPI) as the parameter-estimation method; (v) several strategies for scaling the approach to large discrete and continuous state spaces, including the Nyström extension for out-of-sample interpolation of eigenfunctions and a Kronecker sum factorization that yields compact eigenfunctions in product spaces such as factored MDPs; and (vi) a series of discrete and continuous control tasks that both illustrate the concepts and provide a benchmark for evaluating the approach. Many challenges remain in scaling the framework to large MDPs, and several elaborations of the framework are briefly summarized at the end.
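To make components (i) and (ii) concrete, the sketch below builds a PVF basis for a small discrete MDP: it estimates an undirected state graph from sampled transitions, forms the combinatorial graph Laplacian, and keeps its smoothest eigenvectors as global basis functions. This is a minimal NumPy sketch under simplifying assumptions (unit edge weights, combinatorial rather than normalized Laplacian); the name build_pvf_basis is illustrative, not from the paper.

```python
import numpy as np

def build_pvf_basis(transitions, n_states, k):
    """Proto-value functions (sketch): the k smoothest eigenvectors of the
    combinatorial graph Laplacian of an undirected state graph estimated
    from samples.  `transitions` is an iterable of (s, s_next) index pairs."""
    # Adjacency matrix of the undirected graph induced by observed transitions.
    W = np.zeros((n_states, n_states))
    for s, s_next in transitions:
        if s != s_next:
            W[s, s_next] = 1.0
            W[s_next, s] = 1.0          # symmetrize: edges are undirected
    D = np.diag(W.sum(axis=1))          # degree matrix
    L = D - W                           # combinatorial graph Laplacian
    # Eigenvectors with the smallest eigenvalues vary most smoothly over the
    # graph; these serve as global basis functions (PVFs).
    _, eigvecs = np.linalg.eigh(L)      # eigh returns ascending eigenvalues
    return eigvecs[:, :k]               # each column is one basis function


# Example: PVFs for a 20-state chain explored by a random walk.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    s, samples = 0, []
    for _ in range(2000):
        s_next = min(max(s + rng.choice([-1, 1]), 0), 19)
        samples.append((s, s_next))
        s = s_next
    phi = build_pvf_basis(samples, n_states=20, k=5)   # 20 x 5 feature matrix
```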
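Component (iv) plugs LSPI into the parameter-estimation phase of RPI; the inner step of LSPI is an LSTDQ solve over state-action features built from the state basis. The following is a hedged sketch of that step; the helper names (lstdq, feat, policy) and the small ridge regularizer are assumptions for illustration, not the paper's exact formulation.

```python
import numpy as np

def lstdq(samples, phi, n_actions, policy, gamma=0.95, reg=1e-6):
    """One LSTDQ solve (sketch): a least-squares fixed-point estimate of Q^pi
    in the span of the state features `phi` (e.g. rows of a PVF basis).
    `samples` holds (s, a, r, s_next) tuples; `policy(s)` returns the action
    of the policy being evaluated."""
    d = phi.shape[1]
    k = d * n_actions
    A = reg * np.eye(k)                 # small ridge term keeps A invertible
    b = np.zeros(k)

    def feat(s, a):
        # State-action features: copy the state features into action a's block.
        f = np.zeros(k)
        f[a * d:(a + 1) * d] = phi[s]
        return f

    for s, a, r, s_next in samples:
        f = feat(s, a)
        f_next = feat(s_next, policy(s_next))
        A += np.outer(f, f - gamma * f_next)
        b += r * f
    return np.linalg.solve(A, b)        # weights of the approximate Q-function
```

In a full RPI loop one would alternate: collect samples, (re)build the basis from them, solve for weights with a step like this, and act greedily with respect to the resulting Q estimate until the policy stabilizes.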
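For continuous state spaces, component (v) uses the Nyström extension to evaluate eigenfunctions at states that were not in the sample used to build the graph. The sketch below shows the generic Nyström formula for eigenvectors of a symmetric weight matrix, which the paper adapts to Laplacian eigenfunctions; the function name and the exact normalization here are assumptions.

```python
import numpy as np

def nystrom_extend(x_new, X_sampled, eigvecs, eigvals, weight_fn):
    """Generic Nystrom extension (sketch): interpolate eigenvectors of a
    symmetric weight matrix, computed on the subsampled states X_sampled,
    to a novel state x_new.  `weight_fn(x, y)` is the graph weight (e.g. a
    Gaussian kernel); eigvecs[:, i] has eigenvalue eigvals[i] on the sample."""
    w = np.array([weight_fn(x_new, x) for x in X_sampled])
    # u_i(x) ~= (1 / lambda_i) * sum_j w(x, x_j) * u_i(x_j)
    return (w @ eigvecs) / eigvals
```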
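Also under component (v), product state spaces such as factored MDPs admit compact eigenfunctions because the Laplacian of a Cartesian product graph is the Kronecker sum of the factor Laplacians. The sketch below illustrates that linear-algebra fact; product_space_pvfs is an illustrative name, and handling real factored MDPs would require additional bookkeeping.

```python
import numpy as np

def product_space_pvfs(L1, L2, k):
    """PVFs for a product state space from factor Laplacians (sketch).
    The Laplacian of the Cartesian product of two graphs is the Kronecker sum
    kron(L1, I) + kron(I, L2), so its eigenvectors are Kronecker products of
    factor eigenvectors and its eigenvalues are sums of factor eigenvalues."""
    w1, V1 = np.linalg.eigh(L1)
    w2, V2 = np.linalg.eigh(L2)
    # Enumerate eigenvalue pairs and keep the k smoothest combined functions.
    pairs = sorted(((w1[i] + w2[j], i, j)
                    for i in range(len(w1))
                    for j in range(len(w2))))[:k]
    # Only k Kronecker products of small factor eigenvectors are ever formed,
    # avoiding an eigendecomposition of the full product-space Laplacian.
    return np.column_stack([np.kron(V1[:, i], V2[:, j]) for _, i, j in pairs])
```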