We present a method for solving implicit (factored) Markov decision processes (MDPs) with very large state spaces. We introduce a property of state space partitions which we call ε-homogeneity. Intuitively, an ε-homogeneous partition groups together states that behave approximately the same under all or some subset of policies. Borrowing from recent work on model minimization in computer-aided software verification, we present an algorithm that takes a factored representation of an MDP and a parameter ε with 0 ≤ ε ≤ 1 and computes a factored ε-homogeneous partition of the state space. This partition defines a family of related MDPs: those MDPs whose state space is the set of blocks of the partition and whose transition probabilities are "approximately" like those of any (original MDP) state in the source block. To formally study such families of MDPs, we introduce the new notion of a "bounded parameter MDP" (BMDP), which is a family of (traditional) MDPs defined by specifying upper and lower bounds on the transition probabilities and rewards. We describe algorithms that operate on BMDPs to find policies that are approximately optimal with respect to the original MDP. In combination, our method for reducing a large implicit MDP to a possibly much smaller BMDP using an ε-homogeneous partition and our methods for selecting actions in BMDPs constitute a new approach for analyzing large implicit MDPs. Among its advantages, this new approach provides insight into existing algorithms for solving implicit MDPs, provides useful connections to work in automata theory and model minimization, and suggests methods, which involve varying ε, to trade time and space (specifically in terms of the size of the corresponding state space) for solution quality.
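To make the link between a partition and its BMDP concrete, the following is a minimal sketch on a small explicit (not factored) MDP. It rests on an assumption about the definition, which the abstract does not spell out: a partition is treated as ε-homogeneous when, for every action and every pair of blocks, the probability of moving from a state in the source block into the target block varies by at most ε across the source block, and rewards within a block also vary by at most ε. The function names and the data layout (`P[a][s][t]`, `R[s]`) are hypothetical and chosen only for illustration.

```python
def block_prob(P, a, s, block):
    """Probability of landing in `block` when taking action a in state s."""
    return sum(P[a][s][t] for t in block)


def bmdp_bounds(P, R, partition):
    """Interval parameters of the BMDP induced by `partition`.

    Returns (trans, rew) where trans[(a, i, j)] = (lower, upper) bounds on the
    probability of moving from block i to block j under action a, and
    rew[i] = (lower, upper) bounds on the reward in block i.
    """
    trans = {}
    for a in range(len(P)):
        for i, src in enumerate(partition):
            for j, dst in enumerate(partition):
                probs = [block_prob(P, a, s, dst) for s in src]
                trans[(a, i, j)] = (min(probs), max(probs))
    rew = [(min(R[s] for s in blk), max(R[s] for s in blk)) for blk in partition]
    return trans, rew


def is_eps_homogeneous(P, R, partition, eps, tol=1e-12):
    """Assumed test: every interval above has width at most eps."""
    trans, rew = bmdp_bounds(P, R, partition)
    widths = [hi - lo for lo, hi in trans.values()]
    widths += [hi - lo for lo, hi in rew]
    return max(widths) <= eps + tol


# Toy MDP: 3 states, 1 action. States 0 and 1 behave almost alike.
P = [[[0.1, 0.2, 0.7],
      [0.2, 0.2, 0.6],
      [0.0, 0.0, 1.0]]]
R = [0.0, 0.05, 1.0]
partition = [[0, 1], [2]]

trans, rew = bmdp_bounds(P, R, partition)
coarse_ok = is_eps_homogeneous(P, R, partition, eps=0.1)
fine_ok = is_eps_homogeneous(P, R, partition, eps=0.05)
```

In this toy example the two-block partition passes the assumed test at ε = 0.1 but fails at ε = 0.05, so a smaller ε would force states 0 and 1 into separate blocks. This is the time and space versus solution quality trade-off the abstract describes: smaller ε yields tighter intervals in the BMDP at the cost of more blocks.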