The complexity of Markov decision processes

Authors:
Christos Papadimitriou; John N. Tsitsiklis
Affiliations:
Stanford Univ., Stanford, CA; Massachusetts Institute of Technology
Venue:
Mathematics of Operations Research
Year:
1987

Cited by 145

Efficient reinforcement learning

COLT '94 Proceedings of the seventh annual conference on Computational learning theory
Distinguishing tests for nondeterministic and probabilistic machines

STOC '95 Proceedings of the twenty-seventh annual ACM symposium on Theory of computing
Reinforcement learning and mistake bounded algorithms

COLT '99 Proceedings of the twelfth annual conference on Computational learning theory
On the undecidability of probabilistic planning and infinite-horizon partially observable Markov decision problems

AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference
Initial experiments in stochastic satisfiability

AAAI '99/IAAI '99 Proceedings of the sixteenth national conference on Artificial intelligence and the eleventh Innovative applications of artificial intelligence conference
Complexity of finite-horizon Markov decision process problems

Journal of the ACM (JACM)
A new decomposition technique for solving Markov decision processes

Proceedings of the 2001 ACM symposium on Applied computing
Communication decisions in multi-agent cooperation: model and experiments

Proceedings of the fifth international conference on Autonomous agents
Multi-agent policies: from centralized ones to decentralized ones

Proceedings of the first international joint conference on Autonomous agents and multiagent systems: part 3
Markov decision processes and deterministic Büchi automata

Fundamenta Informaticae
Stochastic Boolean Satisfiability

Journal of Automated Reasoning
Reinforcement Learning Agents

Artificial Intelligence Review
Exploration Strategies for Model-based Learning in Multi-agent Systems

Autonomous Agents and Multi-Agent Systems
The Complexity of Decentralized Control of Markov Decision Processes

Mathematics of Operations Research
On probabilistic timed automata

Theoretical Computer Science
Characterizing Markov Decision Processes

ECML '02 Proceedings of the 13th European Conference on Machine Learning
Value Iteration over Belief Subspace

ECSQARU '01 Proceedings of the 6th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty
Space-Progressive Value Iteration: An Anytime Algorithm for a Class of POMDPs

ECSQARU '01 Proceedings of the 6th European Conference on Symbolic and Quantitative Approaches to Reasoning with Uncertainty
LC-Learning: Phased Method for Average Reward Reinforcement Learning - Preliminary Results

PRICAI '02 Proceedings of the 7th Pacific Rim International Conference on Artificial Intelligence: Trends in Artificial Intelligence
Hidden-Mode Markov Decision Processes for Nonstationary Sequential Decision Making

Sequence Learning - Paradigms, Algorithms, and Applications
Bounds on Sample Size for Policy Evaluation in Markov Environments

COLT '01/EuroCOLT '01 Proceedings of the 14th Annual Conference on Computational Learning Theory and 5th European Conference on Computational Learning Theory
The size of MDP factored policies

Eighteenth national conference on Artificial intelligence
Mobile Robotics Planning Using Abstract Markov Decision Processes

ICTAI '99 Proceedings of the 11th IEEE International Conference on Tools with Artificial Intelligence
Transition-independent decentralized Markov decision processes

AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
Constructing optimal policies for agents with constrained architectures

AAMAS '03 Proceedings of the second international joint conference on Autonomous agents and multiagent systems
On the undecidability of probabilistic planning and related stochastic optimization problems

Artificial Intelligence - special issue on planning with uncertainty and incomplete information
Solving factored MDPs using non-homogeneous partitions

Artificial Intelligence - special issue on planning with uncertainty and incomplete information
When plans distinguish Bayes nets

International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
A Human–Robot Cooperative Learning System for Easy Installation of Assistant Robots in New Working Environments

Journal of Intelligent and Robotic Systems
Decentralized Markov Decision Processes with Event-Driven Interactions

AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 1
Fitting and Compilation of Multiagent Models through Piecewise Linear Functions

AAMAS '04 Proceedings of the Third International Joint Conference on Autonomous Agents and Multiagent Systems - Volume 3
A Navigation System for Assistant Robots Using Visually Augmented POMDPs

Autonomous Robots
Reasoning about joint beliefs for execution-time communication decisions

Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
An online POMDP algorithm for complex multiagent environments

Proceedings of the fourth international joint conference on Autonomous agents and multiagent systems
Cooperative Multi-Agent Learning: The State of the Art

Autonomous Agents and Multi-Agent Systems
Strong planning under partial observability

Artificial Intelligence
Decentralized planning under uncertainty for teams of communicating agents

AAMAS '06 Proceedings of the fifth international joint conference on Autonomous agents and multiagent systems
Cost-sensitive feature acquisition and classification

Pattern Recognition
Point-Based Value Iteration for Continuous POMDPs

The Journal of Machine Learning Research
Partially observable Markov decision processes with imprecise parameters

Artificial Intelligence
Selecting treatment strategies with dynamic limited-memory influence diagrams

Artificial Intelligence in Medicine
Q-value functions for decentralized POMDPs

Proceedings of the 6th international joint conference on Autonomous agents and multiagent systems
Supporting Multiple User Types with a Multimodal Dialog Agent

WI-IATW '07 Proceedings of the 2007 IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology - Workshops
Application of automata learning algorithms to robot motion tracking

ISPRA'05 Proceedings of the 4th WSEAS International Conference on Signal Processing, Robotics and Automation
Planning and Learning in Environments with Delayed Feedback

ECML '07 Proceedings of the 18th European conference on Machine Learning
Learning and planning in environments with delayed feedback

Autonomous Agents and Multi-Agent Systems
Discounted deterministic Markov decision processes and discounted all-pairs shortest paths

SODA '09 Proceedings of the twentieth Annual ACM-SIAM Symposium on Discrete Algorithms
Probabilistic Acceptors for Languages over Infinite Words

SOFSEM '09 Proceedings of the 35th Conference on Current Trends in Theory and Practice of Computer Science
Probabilistic planning with clear preferences on missing information

Artificial Intelligence
Reward shaping for valuing communications during multi-agent coordination

Proceedings of The 8th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
Exploiting locality of interactions using a policy-gradient approach in multiagent learning

Proceedings of the 2008 conference on ECAI 2008: 18th European Conference on Artificial Intelligence
PPCP: efficient probabilistic planning with clear preferences in partially-known environments

AAAI'06 Proceedings of the 21st national conference on Artificial intelligence - Volume 1
Concavely-Priced Probabilistic Timed Automata

CONCUR 2009 Proceedings of the 20th International Conference on Concurrency Theory
A reinforcement learning algorithm with polynomial interaction complexity for only-costly-observable MDPs

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 1
Purely epistemic Markov decision processes

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
A Markovian model for dynamic and constrained resource allocation problems

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Value-function approximations for partially observable Markov decision processes

Journal of Artificial Intelligence Research
Speeding up the convergence of value iteration in partially observable Markov decision processes

Journal of Artificial Intelligence Research
Nonapproximability results for partially observable Markov decision processes

Journal of Artificial Intelligence Research
The communicative multiagent team decision problem: analyzing teamwork theories and models

Journal of Artificial Intelligence Research
Decision-theoretic bidding based on learned density models in simultaneous, interacting auctions

Journal of Artificial Intelligence Research
Complexity results and approximation strategies for MAP explanations

Journal of Artificial Intelligence Research
On polynomial sized MDP succinct policies

Journal of Artificial Intelligence Research
Decentralized control of cooperative systems: categorization and complexity analysis

Journal of Artificial Intelligence Research
Solving transition independent decentralized Markov decision processes

Journal of Artificial Intelligence Research
Restricted value iteration: theory and algorithms

Journal of Artificial Intelligence Research
Hybrid BDI-POMDP framework for multiagent teaming

Journal of Artificial Intelligence Research
A framework for sequential planning in multi-agent settings

Journal of Artificial Intelligence Research
Perseus: randomized point-based value iteration for POMDPs

Journal of Artificial Intelligence Research
Communication-based decomposition mechanisms for decentralized MDPs

Journal of Artificial Intelligence Research
Optimal and approximate Q-value functions for decentralized POMDPs

Journal of Artificial Intelligence Research
Online planning algorithms for POMDPs

Journal of Artificial Intelligence Research
The computational complexity of probabilistic planning

Journal of Artificial Intelligence Research
AEMS: an anytime online search algorithm for approximate policy refinement in large POMDPs

IJCAI'07 Proceedings of the 20th international joint conference on Artificial intelligence
Taming decentralized POMDPs: towards efficient policy computation for multiagent settings

IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Complexity of probabilistic planning under average rewards

IJCAI'01 Proceedings of the 17th international joint conference on Artificial intelligence - Volume 1
Decomposition techniques for planning in stochastic domains

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Strong planning under partial observability

Artificial Intelligence
Rate adaptation using acknowledgement feedback in finite-state Markov channels with collisions

IEEE Transactions on Wireless Communications
Rate adaptation via link-layer feedback for goodput maximization over a time-varying channel

IEEE Transactions on Wireless Communications
Discounted deterministic Markov decision processes and discounted all-pairs shortest paths

ACM Transactions on Algorithms (TALG)
Partially Observable Markov Decision Processes: A Geometric Technique and Analysis

Operations Research
Model checking probabilistic timed automata with one or two clocks

TACAS'07 Proceedings of the 13th international conference on Tools and algorithms for the construction and analysis of systems
Q-learning with linear function approximation

COLT'07 Proceedings of the 20th annual conference on Learning theory
On the Topology of Discrete Strategies

International Journal of Robotics Research
Planning under Uncertainty for Robotic Tasks with Mixed Observability

International Journal of Robotics Research
Quasi deterministic POMDPs and DecPOMDPs

Proceedings of the 9th International Conference on Autonomous Agents and Multiagent Systems - Volume 1
OFDMA downlink resource allocation via ARQ feedback

Asilomar'09 Proceedings of the 43rd Asilomar conference on Signals, systems and computers
Automated planning for remote penetration testing

MILCOM'09 Proceedings of the 28th IEEE conference on Military communications
An investigation into mathematical programming for finite horizon decentralized POMDPs

Journal of Artificial Intelligence Research
A decision-theoretic formalism for belief-optimal reasoning

PerMIS '09 Proceedings of the 9th Workshop on Performance Metrics for Intelligent Systems
Complexity analysis of real-time reinforcement learning

AAAI'93 Proceedings of the eleventh national conference on Artificial intelligence
Probabilistic propositional planning: representations and complexity

AAAI'97/IAAI'97 Proceedings of the fourteenth national conference on artificial intelligence and ninth conference on Innovative applications of artificial intelligence
Qualitative analysis of partially-observable Markov decision processes

MFCS'10 Proceedings of the 35th international conference on Mathematical foundations of computer science
Learning models of intelligent agents

AAAI'96 Proceedings of the thirteenth national conference on Artificial intelligence - Volume 1
Pagerank optimization in polynomial time by stochastic shortest path reformulation

ALT'10 Proceedings of the 21st international conference on Algorithmic learning theory
Quickest detection in multiple on-off processes

IEEE Transactions on Signal Processing
On model checking techniques for randomized distributed systems

IFM'10 Proceedings of the 8th international conference on Integrated formal methods
A behavior adaptation method for an elderly companion robot: Rui

ICSR'10 Proceedings of the Second international conference on Social robotics
A Modified Memory-Based Reinforcement Learning Method for Solving POMDP Problems

Neural Processing Letters
Decentralized MDPs with sparse interactions

Artificial Intelligence
The influence of random interactions and decision heuristics on norm evolution in social networks

Computational & Mathematical Organization Theory
LQG-MP: Optimized path planning for robots with motion uncertainty and imperfect state information

International Journal of Robotics Research
Evolving policies for multi-reward partially observable Markov decision processes (MR-POMDPs)

Proceedings of the 13th annual conference on Genetic and evolutionary computation
Upper confidence trees with short term partial information

EvoApplications'11 Proceedings of the 2011 international conference on Applications of evolutionary computation - Volume Part I
Using mathematical programming to solve Factored Markov Decision Processes with Imprecise Probabilities

International Journal of Approximate Reasoning
HTN-style planning in relational POMDPs using first-order FSCs

KI'11 Proceedings of the 34th Annual German conference on Advances in artificial intelligence
The Simplex and Policy-Iteration Methods Are Strongly Polynomial for the Markov Decision Problem with a Fixed Discount Rate

Mathematics of Operations Research
Dynamic traffic splitting to parallel wireless networks with partial information: A Bayesian approach

Performance Evaluation
My brain is full: when more memory helps

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
Approximate planning for factored POMDPs using belief state simplification

UAI'99 Proceedings of the Fifteenth conference on Uncertainty in artificial intelligence
MAP complexity results and approximation methods

UAI'02 Proceedings of the Eighteenth conference on Uncertainty in artificial intelligence
The complexity of decentralized control of Markov decision processes

UAI'00 Proceedings of the Sixteenth conference on Uncertainty in artificial intelligence
Vector-space analysis of belief-state approximation for POMDPs

UAI'01 Proceedings of the Seventeenth conference on Uncertainty in artificial intelligence
A tractable POMDP for a class of sequencing problems

UAI'01 Proceedings of the Seventeenth conference on Uncertainty in artificial intelligence
On the complexity of solving Markov decision problems

UAI'95 Proceedings of the Eleventh conference on Uncertainty in artificial intelligence
The complexity of plan existence and evaluation in probabilistic domains

UAI'97 Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence
Survey of Motion Planning Literature in the Presence of Uncertainty: Considerations for UAV Guidance

Journal of Intelligent and Robotic Systems
Evaluating and minimizing ambiguities in qualitative route instructions

Proceedings of the 19th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
Dispatch-and-search: dynamic multi-ferry control in partitioned mobile networks

MobiHoc '11 Proceedings of the Twelfth ACM International Symposium on Mobile Ad Hoc Networking and Computing
Probabilistic ω-automata

Journal of the ACM (JACM)
A survey of stochastic ω-regular games

Journal of Computer and System Sciences
An online POMDP algorithm used by the PoliceForce agents in the RoboCupRescue simulation

RoboCup 2005
Quantitative access control with partially-observable Markov decision processes

Proceedings of the second ACM conference on Data and Application Security and Privacy
Exploiting symmetries for single- and multi-agent Partially Observable Stochastic Domains

Artificial Intelligence
An overview of cooperative and competitive multiagent learning

LAMAS'05 Proceedings of the First international conference on Learning and Adaption in Multi-Agent Systems
DynaMOC: a dynamic overlapping coalition-based multiagent system for coordination of mobile ad hoc devices

ICIRA'11 Proceedings of the 4th international conference on Intelligent Robotics and Applications - Volume Part I
Adaptive submodularity: theory and applications in active learning and stochastic optimization

Journal of Artificial Intelligence Research
A survey of computational complexity results in systems and control

Automatica (Journal of IFAC)
Complexity of stability and controllability of elementary hybrid systems

Automatica (Journal of IFAC)
Localized policy-based target tracking using wireless sensor networks

ACM Transactions on Sensor Networks (TOSN)
Solving H-horizon, stationary Markov decision problems in time proportional to log(H)

Operations Research Letters
Cross-Layer Power Allocation for Packet Transmission Over Fading Channel

Wireless Personal Communications: An International Journal
Partial-Observation Stochastic Games: How to Win When Belief Fails

LICS '12 Proceedings of the 2012 27th Annual IEEE/ACM Symposium on Logic in Computer Science
Motion planning under uncertainty using iterative local optimization in belief space

International Journal of Robotics Research
Markov Decision Processes and Deterministic Büchi Automata

Fundamenta Informaticae
On the Computational Complexity of Stochastic Controller Optimization in POMDPs

ACM Transactions on Computation Theory (TOCT)
Modeling information exchange opportunities for effective human-computer teamwork

Artificial Intelligence
On the complexity of model checking interval-valued discrete time Markov chains

Information Processing Letters
Solving decentralized POMDP problems using genetic algorithms

Autonomous Agents and Multi-Agent Systems
LTL model checking of interval Markov chains

TACAS'13 Proceedings of the 19th international conference on Tools and Algorithms for the Construction and Analysis of Systems
Decentralized multi-robot cooperation with auctioned POMDPs

International Journal of Robotics Research
A simple index rule for efficient traffic splitting over parallel wireless networks with partial information

Performance Evaluation
QoE-aware optimization of multimedia flow scheduling

Computer Communications
Optimal eviction policies for stochastic address traces

Theoretical Computer Science
