Policy evaluation is a critical step in the approximate solution of large Markov decision processes (MDPs): directly solving the Bellman system of |S| linear equations typically requires O(|S|^3) time, where |S| is the size of the state space in the discrete case and the sample size in the continuous case. In this paper we apply a recently introduced multiscale framework for analysis on graphs to design a faster algorithm for policy evaluation. For a fixed policy π, this framework efficiently constructs a multiscale decomposition of the random walk P^π associated with π. This enables efficient computation of medium- and long-term state distributions, approximation of value functions, and direct computation of the potential operator (I - γP^π)^(-1) needed to solve Bellman's equation. We show that even a preliminary, non-optimized version of the solver is competitive with highly optimized iterative techniques, in many cases requiring only O(|S|) complexity.
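The Bellman system the abstract refers to can be made concrete with a minimal NumPy sketch. The toy 4-state MDP below (transition matrix, rewards, and discount factor are invented for illustration, not taken from the paper) shows the dense O(|S|^3) direct solve via the potential operator, alongside the standard iterative policy evaluation it is compared against; the paper's multiscale construction is a faster replacement for the dense solve shown here.

```python
import numpy as np

# Hypothetical toy MDP: sizes and values are illustrative only.
n = 4                       # |S|, number of states
gamma = 0.9                 # discount factor

# Row-stochastic transition matrix P^pi under a fixed policy pi.
P = np.array([
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.5, 0.5],
    [0.5, 0.0, 0.0, 0.5],
])
R = np.array([0.0, 0.0, 1.0, 0.0])   # expected one-step rewards under pi

# Direct policy evaluation: solve (I - gamma * P) V = R, i.e. apply the
# potential operator (I - gamma * P)^(-1) to R. A dense solve costs O(|S|^3).
V_direct = np.linalg.solve(np.eye(n) - gamma * P, R)

# Iterative policy evaluation for comparison: the fixed-point iteration
# V <- R + gamma * P V, which expands the potential operator as a power series.
V_iter = np.zeros(n)
for _ in range(10_000):
    V_new = R + gamma * P @ V_iter
    done = np.max(np.abs(V_new - V_iter)) < 1e-12
    V_iter = V_new
    if done:
        break

# Both routes compute the same value function.
assert np.allclose(V_direct, V_iter, atol=1e-8)
```

Since gamma < 1 and P is row-stochastic, (I - gamma * P) is invertible and the iteration above is a gamma-contraction, so both computations converge to the unique value function of the policy.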