Solving factored MDPs using non-homogeneous partitions

Authors:
Kee-Eung Kim;Thomas Dean
Affiliations:
IT Research Center, Samsung SDS, 159-9 Gumi-Dong Bundang-Gu, Seongnam-Si Gyeonggi-Do, 463-810, South Korea;Department of Computer Science, Brown University, Providence, RI
Venue:
Artificial Intelligence - special issue on planning with uncertainty and incomplete information
Year:
2003

Abstract

We present an algorithm for aggregating states when solving large MDPs (represented as factored MDPs), using search by successive refinement in the space of non-homogeneous partitions. Homogeneity is defined in terms of stochastic bisimulation and reward equivalence within the blocks of a partition. Because homogeneous partitions that define equivalent reduced-state-space MDPs can have a large number of blocks, we relax the requirement of homogeneity. The algorithm constructs approximate aggregate MDPs from non-homogeneous partitions, solves the aggregate MDPs exactly, and then uses the resulting value functions as part of a heuristic for refining the current best non-homogeneous partition. We outline the theory motivating the use of this heuristic and present empirical results. In addition to investigating more exhaustive local search methods, we explore the use of techniques derived from research on discretizing continuous state spaces. Finally, we compare the results from our algorithms, which search in the space of non-homogeneous partitions, with exact and approximate algorithms that represent homogeneous and approximately homogeneous partitions as decision trees or algebraic decision diagrams.
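
The aggregation idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it works on a small flat (unfactored) MDP, it weights the states of a block uniformly when building the aggregate MDP (an assumption, since the abstract does not say how the approximate aggregate MDP is constructed), and it omits the refinement heuristic entirely. All names (`is_homogeneous`, `aggregate`, `value_iteration`, `P`, `R`) are hypothetical. The homogeneity test checks reward equivalence and equal block-to-block transition probabilities within each block.

```python
from typing import List, Tuple

# Hypothetical flat MDP: P[a][s][t] = Pr(t | s, a); R[s] = reward in state s.
Transitions = List[List[List[float]]]
Partition = List[List[int]]


def is_homogeneous(P: Transitions, R: List[float], partition: Partition,
                   tol: float = 1e-9) -> bool:
    """True if all states in a block share a reward and, for every action,
    the same probability of moving into each block of the partition."""
    for block in partition:
        ref = block[0]
        for s in block[1:]:
            if abs(R[s] - R[ref]) > tol:
                return False
            for Pa in P:
                for target in partition:
                    p_s = sum(Pa[s][t] for t in target)
                    p_ref = sum(Pa[ref][t] for t in target)
                    if abs(p_s - p_ref) > tol:
                        return False
    return True


def aggregate(P: Transitions, R: List[float],
              partition: Partition) -> Tuple[Transitions, List[float]]:
    """Build an aggregate MDP with one state per block.
    Assumption: states within a block are weighted uniformly."""
    R_agg = [sum(R[s] for s in block) / len(block) for block in partition]
    P_agg = [[[sum(Pa[s][t] for s in block for t in target) / len(block)
               for target in partition]
              for block in partition]
             for Pa in P]
    return P_agg, R_agg


def value_iteration(P: Transitions, R: List[float], gamma: float = 0.9,
                    eps: float = 1e-10) -> List[float]:
    """Solve a (small) MDP by standard discounted value iteration."""
    n = len(R)
    V = [0.0] * n
    while True:
        V_new = [R[s] + gamma * max(sum(Pa[s][t] * V[t] for t in range(n))
                                    for Pa in P)
                 for s in range(n)]
        if max(abs(x - y) for x, y in zip(V, V_new)) < eps:
            return V_new
        V = V_new


# Toy 4-state, 2-action MDP (made-up numbers for illustration only).
P = [
    [[0.0, 0.0, 0.5, 0.5], [0.0, 0.0, 0.5, 0.5],
     [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]],
    [[0.5, 0.5, 0.0, 0.0], [1.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 1.0], [0.0, 0.0, 1.0, 0.0]],
]
R = [0.0, 0.0, 1.0, 2.0]

fine = [[0, 1], [2], [3]]    # homogeneous: states 0 and 1 behave alike
coarse = [[0, 1], [2, 3]]    # non-homogeneous: states 2 and 3 differ in reward

V_full = value_iteration(P, R)
V_fine = value_iteration(*aggregate(P, R, fine))
V_coarse = value_iteration(*aggregate(P, R, coarse))
```

On the homogeneous partition `fine`, the aggregate MDP reproduces the original values block by block; on the smaller non-homogeneous partition `coarse`, it yields only an approximation, which is the trade-off the abstract describes.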