In this paper we propose a novel algorithm, factored value iteration (FVI), for the approximate solution of factored Markov decision processes (fMDPs). The traditional approximate value iteration algorithm is modified in two ways. First, the least-squares projection operator is modified so that it does not increase max-norm, and thus preserves convergence. Second, we draw polynomially many samples uniformly from the (exponentially large) state space. This way, the complexity of our algorithm becomes polynomial in the size of the fMDP description. We prove that the algorithm is convergent. We also derive an upper bound on the difference between our approximate solution and the optimal one, as well as on the error introduced by sampling. We analyse various projection operators with respect to their computational complexity and their convergence when combined with approximate value iteration.
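The two modifications described in the abstract can be illustrated with a small sketch. Below is a hedged, illustrative NumPy implementation, not the paper's exact algorithm: the function names, the choice of basis matrix `H`, and the specific normalization are assumptions. One plausible way to make the least-squares projection a max-norm non-expansion is to scale each row of the projection matrix by its ℓ1-norm, so that the combined Bellman-backup-plus-projection operator remains a contraction.

```python
import numpy as np

def max_norm_safe_projection(H):
    """Least-squares projection onto span(H), normalized row-wise.

    P = H (H^T H)^{-1} H^T is the ordinary least-squares projector.
    Dividing each row by its l1-norm guarantees ||P||_inf <= 1, so
    applying P never increases the max-norm of a vector. (Illustrative
    normalization; the paper may use a different scheme.)
    """
    P = H @ np.linalg.solve(H.T @ H, H.T)
    row_norms = np.abs(P).sum(axis=1, keepdims=True)
    row_norms[row_norms == 0] = 1.0  # avoid division by zero
    return P / row_norms

def approximate_value_iteration(P_trans, R, H, gamma=0.9, iters=200):
    """Projected value iteration: v <- Proj[ max_a (R_a + gamma * P_a v) ].

    P_trans: (A, S, S) transition matrices, R: (A, S) rewards,
    H: (S, k) feature/basis matrix. In the full fMDP setting, S would be
    replaced by a polynomial-size uniform sample of the state space; here
    we enumerate the (small) state set for clarity.
    """
    Proj = max_norm_safe_projection(H)
    num_actions, num_states = R.shape
    v = np.zeros(num_states)
    for _ in range(iters):
        # Bellman backup over all actions, then max-norm-safe projection.
        q = np.stack([R[a] + gamma * P_trans[a] @ v for a in range(num_actions)])
        v = Proj @ q.max(axis=0)
    return v
```

Because the projection is a max-norm non-expansion and the Bellman backup is a gamma-contraction in max-norm, their composition is a gamma-contraction, so the iteration converges to a unique fixed point regardless of initialization.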