The size of MDP factored policies

Authors:
Paolo Liberatore
Affiliations:
Dipartimento di Informatica e Sistemistica, Università di Roma "La Sapienza", Via Salaria, 113, 00198, Roma, Italy
Venue:
Eighteenth national conference on Artificial intelligence
Year:
2002

Abstract

Policies of Markov Decision Processes (MDPs) specify the next action to execute, given the current state and (possibly) the history of actions executed so far. Factorization is used when the number of states is exponentially large: both the MDP and the policy can then be represented in a compact form, for example using circuits. We prove that there are MDPs whose optimal policies require exponential space even in factored form.
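
As an informal illustration of the gap between explicit and factored policies, the following minimal sketch assumes a state made of a few binary variables and a hypothetical policy given by a small Boolean formula. The names `N_VARS`, `factored_policy`, and `tabulate` are illustrative assumptions and do not come from the paper. The sketch only shows why a factored form can be much shorter than a table. It does not reproduce the paper's result, which concerns MDPs whose optimal policies admit no such short form.

```python
from itertools import product

N_VARS = 4  # hypothetical number of binary state variables


def factored_policy(state):
    """Hypothetical compact policy: a tiny Boolean formula over the state variables."""
    x0, x1, *_ = state
    return "a1" if (x0 and not x1) else "a0"


def tabulate(policy, n_vars):
    """Expand a policy into an explicit table with one entry per state."""
    return {s: policy(s) for s in product((False, True), repeat=n_vars)}


# The explicit table has 2 ** N_VARS entries, while the formula above
# keeps the same size no matter how many variables the state has.
table = tabulate(factored_policy, N_VARS)
print(len(table))
```

The explicit table doubles in size with every added state variable, which is why compact forms such as circuits are used when the state space is exponentially large.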