Model minimization in Markov decision processes

  • Authors:
  • Thomas Dean; Robert Givan

  • Affiliations:
  • Department of Computer Science, Brown University, Providence, RI (both authors)

  • Venue:
  • AAAI'97/IAAI'97: Proceedings of the Fourteenth National Conference on Artificial Intelligence and Ninth Conference on Innovative Applications of Artificial Intelligence
  • Year:
  • 1997

Abstract

We use the notion of stochastic bisimulation homogeneity to analyze planning problems represented as Markov decision processes (MDPs). Informally, a partition of the state space for an MDP is said to be homogeneous if for each action, states in the same block have the same probability of being carried to each other block. We provide an algorithm for finding the coarsest homogeneous refinement of any partition of the state space of an MDP. The resulting partition can be used to construct a reduced MDP which is minimal in a well-defined sense and can be used to solve the original MDP. Our algorithm is an adaptation of known automata minimization algorithms, and is designed to operate naturally on factored or implicit representations in which the full state space is never explicitly enumerated. We show that simple variations on this algorithm are equivalent to, or closely similar to, several different recently published algorithms for finding optimal solutions to (partially or fully observable) factored Markov decision processes, thereby providing alternative descriptions of the methods and results regarding those algorithms.
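To make the partition-refinement idea in the abstract concrete, the sketch below computes a homogeneous refinement of an explicitly enumerated MDP: it repeatedly splits any block whose states do not all send the same probability mass into some other block under some action. This is a minimal illustrative sketch, not the paper's algorithm; the function name coarsest_homogeneous_refinement, the transition table P, and the splitting strategy are assumptions chosen for illustration, and the paper's contribution is precisely that the analogous refinement can be carried out on factored representations without ever enumerating the state space.

```python
from collections import defaultdict

def coarsest_homogeneous_refinement(actions, P, initial_partition):
    """Refine initial_partition until, within every block, all states send
    the same probability mass into every block of the partition, for every
    action (stochastic bisimulation homogeneity).

    P[(s, a)] is a dict mapping successor states to their probabilities.
    Explicit-state sketch only; names and data layout are assumptions.
    """
    partition = [frozenset(b) for b in initial_partition]
    changed = True
    while changed:
        changed = False
        for splitter in partition:               # candidate block B
            refined = []
            for block in partition:              # block C to test against B
                # Group states of C by their probability of reaching B
                # under each action (rounded to tolerate float noise).
                groups = defaultdict(set)
                for s in block:
                    sig = tuple(
                        round(sum(p for t, p in P[(s, a)].items()
                                  if t in splitter), 9)
                        for a in actions
                    )
                    groups[sig].add(s)
                if len(groups) > 1:
                    changed = True
                refined.extend(frozenset(g) for g in groups.values())
            if changed:
                partition = refined
                break                            # restart with refined partition
    return partition
```

If the initial partition groups states by immediate reward, the resulting blocks can be treated as aggregate states of a reduced MDP, which (as the abstract states) can then be solved in place of the original.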