Many stochastic planning problems can be represented using Markov Decision Processes (MDPs). A difficulty with using these MDP representations is that the common algorithms for solving them run in time polynomial in the size of the state space, and this size is extremely large for most real-world planning problems of interest. Recent AI research has addressed this problem by representing the MDP in factored form. Factored MDPs, however, are not amenable to traditional solution methods that call for an explicit enumeration of the state space. One familiar way to solve MDP problems with very large state spaces is to form a reduced (or aggregated) MDP with the same properties as the original MDP by combining "equivalent" states. In this paper, we discuss applying this approach to solving factored MDP problems: we avoid enumerating the state space by describing large blocks of "equivalent" states in factored form, with the block descriptions inferred directly from the original factored representation. The resulting reduced MDP may have exponentially fewer states than the original factored MDP and can then be solved using traditional methods. The reduced MDP obtained depends on the notion of equivalence between states used in the aggregation, and this choice is fundamental in designing and analyzing algorithms for reducing MDPs. Ideally, such an algorithm finds the smallest possible reduced MDP for any given input MDP and notion of equivalence (i.e., the "minimal model" for the input MDP). Unfortunately, the classic notion of state equivalence from non-deterministic finite state machines, when generalized to MDPs, does not prove useful. We present here a notion of equivalence that is based on the notion of bisimulation from the literature on concurrent processes.
Our generalization of bisimulation to stochastic processes yields a non-trivial notion of state equivalence that guarantees that the optimal policy for the reduced model immediately induces a corresponding optimal policy for the original model. With this notion of state equivalence, we design and analyze an algorithm that minimizes arbitrary factored MDPs, and we compare this method analytically to previous algorithms for solving factored MDPs. We show that previous approaches implicitly derive the equivalence relations we define here.
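The core aggregation idea can be sketched on an explicit (enumerated) MDP rather than a factored one: start from the partition induced by rewards, then repeatedly split blocks until any two states in the same block assign, for every action, the same probability mass to every block. This is a minimal illustrative sketch, not the paper's factored algorithm; the function name and data layout (`P[(s, a)]` as a next-state distribution, `R[s]` as reward) are assumptions for the example.

```python
def bisimulation_partition(states, actions, P, R):
    """Coarsest stochastic bisimulation via naive signature refinement.

    P[(s, a)] maps next-state -> probability; R[s] is the reward of s.
    Returns a dict mapping each state to a block id; states sharing an
    id are bisimilar and may be aggregated into one reduced-MDP state.
    """
    # Initial partition: group states by immediate reward.
    block = {s: R[s] for s in states}
    while True:
        blocks = set(block.values())

        def signature(s):
            # Current block of s plus, for each action, the probability
            # mass s sends into every block of the current partition.
            masses = tuple(
                tuple(sorted((b, sum(p for t, p in P[(s, a)].items()
                                     if block[t] == b))
                             for b in blocks))
                for a in actions)
            return (block[s], masses)

        sigs = {s: signature(s) for s in states}
        ids = {}
        new_block = {s: ids.setdefault(sigs[s], len(ids)) for s in states}
        # Splitting only ever refines the partition, so an unchanged
        # block count means the partition is stable.
        if len(ids) == len(blocks):
            return new_block
        block = new_block
```

The resulting quotient MDP (one state per block, with block-to-block transition mass and the shared block reward) can then be solved with standard value or policy iteration, and the optimal policy lifts back to the original states, as the abstract guarantees. The factored setting replaces the explicit per-state loop with symbolic block descriptions, which is where the exponential savings come from.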