Complexity of finite-horizon Markov decision process problems

  • Authors:
  • Martin Mundhenk; Judy Goldsmith; Christopher Lusena; Eric Allender

  • Affiliations:
  • Univ. Trier, Trier, Germany; Univ. of Kentucky, Lexington, KY; Univ. of Kentucky, Lexington, KY; Rutgers Univ., Piscataway, NJ

  • Venue:
  • Journal of the ACM (JACM)
  • Year:
  • 2000

Abstract

Controlled stochastic systems occur in science, engineering, manufacturing, the social sciences, and many other contexts. If the system is modeled as a Markov decision process (MDP) and will run ad infinitum, the optimal control policy can be computed in polynomial time using linear programming. The problems considered here assume that the time the process will run is finite and based on the size of the input. There are many factors that compound the complexity of computing the optimal policy. For instance, if the controller does not have complete information about the state of the system, or if the system is represented in some very succinct manner, the optimal policy is provably not computable in time polynomial in the size of the input. We analyze the computational complexity of evaluating policies and of determining whether a sufficiently good policy exists for an MDP, based on a number of confounding factors, including the observability of the system state; the succinctness of the representation; the type of policy; and even the number of actions relative to the number of states. In almost every case, we show that the decision problem is complete for some known complexity class. Some of these results are familiar from work by Papadimitriou and Tsitsiklis and others, but some, such as our PL-completeness proofs, are surprising. We include proofs of completeness for natural problems in the as-yet little-studied class NP^PP.
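
The abstract's remark about the infinite-horizon case refers to the classical linear-programming formulation of discounted MDPs. The following is a minimal sketch of that formulation, not code from the paper: the transition tensor P, reward matrix R, and discount factor gamma are made-up example data, and SciPy's general-purpose `linprog` solver stands in for any LP solver.

```python
import numpy as np
from scipy.optimize import linprog

# Sketch of the classical LP for an infinite-horizon *discounted* MDP.
# Variables: one value v(s) per state.
# LP:  minimize  sum_s v(s)
#      s.t.      v(s) >= R[s,a] + gamma * sum_t P[a,s,t] * v(t)
#                for every state s and action a.

n_states, n_actions = 3, 2
gamma = 0.9  # illustrative discount factor, not from the paper

rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_actions, n_states))  # P[a,s,t]
R = rng.random((n_states, n_actions))                             # R[s,a]

# Rewrite each constraint as (gamma * P[a,s] - e_s) @ v <= -R[s,a],
# matching linprog's A_ub @ x <= b_ub convention.
A_ub, b_ub = [], []
for s in range(n_states):
    for a in range(n_actions):
        row = gamma * P[a, s].copy()
        row[s] -= 1.0
        A_ub.append(row)
        b_ub.append(-R[s, a])

res = linprog(c=np.ones(n_states), A_ub=np.array(A_ub), b_ub=np.array(b_ub),
              bounds=[(None, None)] * n_states)
v = res.x  # optimal value function

# Recover a greedy (optimal stationary) policy from the values.
q = R + gamma * np.einsum('ast,t->sa', P, v)  # Q[s,a]
policy = q.argmax(axis=1)
print("values:", np.round(v, 3), "policy:", policy)
```

The LP has one variable per state and one constraint per state-action pair, so its size is polynomial in an explicit (flat) MDP description. For contrast, fully observable, explicitly represented finite-horizon MDPs can be solved by backward induction over the horizon; the paper's hardness results concern the variants (partial observability, succinct representations) where no polynomial-time method of either kind is available.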