This paper describes a novel machine learning framework for solving sequential decision problems modeled as Markov decision processes (MDPs) by iteratively computing low-dimensional representations and approximately optimal policies. A unified mathematical framework for learning representation and optimal control in MDPs is presented, based on a class of singular operators called Laplacians, whose matrix representations have nonpositive off-diagonal elements and zero row sums. Exact solutions of discounted and average-reward MDPs are expressed in terms of a generalized spectral inverse of the Laplacian called the Drazin inverse. A generic algorithm called representation policy iteration (RPI) is presented, which interleaves the computation of low-dimensional representations with the computation of approximately optimal policies. Two approaches to dimensionality reduction of MDPs are described, based on geometric and reward-sensitive regularization, whereby low-dimensional representations are formed by diagonalization or dilation of Laplacian operators. Model-based and model-free variants of the RPI algorithm are presented and compared experimentally on discrete and continuous MDPs. Directions for future work are outlined in closing.
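As a minimal illustrative sketch (not the paper's reference implementation), the following Python/numpy snippet works through two of the abstract's claims on a hypothetical 20-state chain MDP under a fixed policy: the smoothest eigenvectors of the Laplacian I - P serve as a low-dimensional basis for the discounted value function (the diagonalization route), and the Drazin inverse of the same Laplacian yields the exact gain and bias of the average-reward problem. All names (P, Phi, LD, etc.) and the chain construction are assumptions made for the example; for an ergodic chain the Drazin inverse coincides with the group inverse, which is what the closed-form formula below computes.

```python
import numpy as np

n, gamma = 20, 0.95

# Fixed-policy transition matrix: a random walk on a 20-state chain
# (self-loops at both ends make the chain aperiodic and irreducible).
P = np.zeros((n, n))
for s in range(n):
    left, right = max(s - 1, 0), min(s + 1, n - 1)
    P[s, left] += 0.5
    P[s, right] += 0.5

r = np.zeros(n)
r[-1] = 1.0                      # reward only in the rightmost state

# ----- Diagonalization: Laplacian eigenvectors as basis functions -----
L = np.eye(n) - P                # random-walk Laplacian of the state graph
# L is not symmetric in general; for this reversible chain its spectrum is real,
# so we simply keep the eigenvectors with the smallest-magnitude eigenvalues.
evals, evecs = np.linalg.eig(L)
order = np.argsort(np.abs(evals))
Phi = np.real(evecs[:, order[:5]])                    # 5 "smoothest" basis functions

V_exact = np.linalg.solve(np.eye(n) - gamma * P, r)   # exact discounted value
w, *_ = np.linalg.lstsq(Phi, V_exact, rcond=None)     # project onto the basis
print("max |V - Phi w|:", np.max(np.abs(V_exact - Phi @ w)))

# ----- Drazin inverse: exact average-reward solution for the fixed policy -----
# For an ergodic chain, I - P has index 1, so its Drazin inverse equals the
# group inverse and follows from the stationary distribution pi:
#   (I - P)^D = (I - P + e pi^T)^{-1} - e pi^T
evals_P, evecs_P = np.linalg.eig(P.T)
pi = np.real(evecs_P[:, np.argmin(np.abs(evals_P - 1.0))])
pi = pi / pi.sum()
W = np.outer(np.ones(n), pi)           # limiting matrix, every row equals pi
LD = np.linalg.inv(L + W) - W          # Drazin (group) inverse of the Laplacian

gain = pi @ r                          # average reward of the policy
bias = LD @ r                          # differential (bias) values, pi @ bias = 0
# Sanity check of the average-reward evaluation equation: gain*e + h = r + P h
print("evaluation-equation residual:",
      np.max(np.abs(gain + bias - r - P @ bias)))
```

The least-squares projection onto Phi stands in for the representation step of an RPI-style loop; a full iteration would re-estimate the policy from the approximate value function and repeat, which is omitted here for brevity.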