On the complexity of solving Markov decision problems

Authors:
Michael L. Littman; Thomas L. Dean; Leslie Pack Kaelbling
Affiliations:
Department of Computer Science, Brown University, Providence, RI; Department of Computer Science, Brown University, Providence, RI; Department of Computer Science, Brown University, Providence, RI
Venue:
UAI '95: Proceedings of the Eleventh Conference on Uncertainty in Artificial Intelligence
Year:
1995

Abstract

Markov decision problems (MDPs) provide the foundations for a number of problems of interest to AI researchers studying automated planning and reinforcement learning. In this paper, we summarize results regarding the complexity of solving MDPs and the running time of MDP solution algorithms. We argue that, although MDPs can be solved efficiently in theory, more study is needed to reveal practical algorithms for solving large problems quickly. To encourage future research, we sketch some alternative methods of analysis that rely on the structure of MDPs.