Efficient dynamic-programming updates in partially observable Markov decision processes

Authors:
Michael L. Littman; Anthony R. Cassandra; Leslie P. Kaelbling
Venue:
Efficient dynamic-programming updates in partially observable Markov decision processes
Year:
1995

Abstract

We examine the problem of performing exact dynamic-programming updates in partially observable Markov decision processes (POMDPs) from a computational-complexity viewpoint. Dynamic-programming updates are a crucial operation in a wide range of POMDP solution methods, and we find that it is intractable to perform these updates on piecewise-linear convex value functions for general POMDPs. We offer a new algorithm, called the witness algorithm, which can compute updated value functions efficiently on a restricted class of POMDPs in which the number of linear facets is not too great. We compare the witness algorithm to existing algorithms analytically and empirically and find that it is the fastest algorithm over a wide range of POMDP sizes.
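
To make the notion of a dynamic-programming update on a piecewise-linear convex value function concrete, the following is a minimal Python sketch. It is not the witness algorithm. It only shows the standard building block of representing a value function as a set of linear facets (vectors over states) and computing the facet of the updated value function that is maximal at one given belief. The names `T`, `O`, `R`, `GAMMA`, `value`, and `backup`, and the two-state toy model, are hypothetical and chosen purely for illustration.

```python
# Sketch only: a backup at a single belief point, not the witness algorithm.
# All names and the toy model below are hypothetical.

def dot(u, v):
    return sum(x * y for x, y in zip(u, v))

def value(belief, alphas):
    """Evaluate a piecewise-linear convex value function at a belief."""
    return max(dot(belief, alpha) for alpha in alphas)

def backup(belief, alphas, T, O, R, gamma):
    """Return (vector, action): the updated facet that is best at `belief`.

    T[a][s][s2]: probability of moving from s to s2 under action a
    O[a][s2][o]: probability of observing o after reaching s2 under a
    R[a][s]:     immediate reward for taking action a in state s
    """
    n_states = len(belief)
    best = None
    for a in range(len(T)):
        total = [0.0] * n_states
        for o in range(len(O[a][0])):
            # Project every old facet back through the pair (a, o).
            projected = [
                [sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                     for s2 in range(n_states))
                 for s in range(n_states)]
                for alpha in alphas
            ]
            # Keep the projection that is best at this belief.
            g = max(projected, key=lambda vec: dot(belief, vec))
            total = [t + x for t, x in zip(total, g)]
        candidate = [R[a][s] + gamma * total[s] for s in range(n_states)]
        if best is None or dot(belief, candidate) > dot(belief, best[0]):
            best = (candidate, a)
    return best

# Toy model: two states, two actions, two uninformative observations.
T = [[[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]]
O = [[[0.5, 0.5], [0.5, 0.5]], [[0.5, 0.5], [0.5, 0.5]]]
R = [[1.0, 0.0], [0.0, 1.0]]
GAMMA = 0.5

# One update from the zero value function, at the two corner beliefs.
step1 = [backup(b, [[0.0, 0.0]], T, O, R, GAMMA)[0]
         for b in ([1.0, 0.0], [0.0, 1.0])]
# A second update at one belief, starting from the facets found above.
vec2, act2 = backup([1.0, 0.0], step1, T, O, R, GAMMA)
```

Calling `backup` at a single belief costs time polynomial in the number of states, actions, observations, and old facets. The difficulty described in the abstract lies in finding every facet of the updated value function, since the number of facets can grow very quickly from one update to the next.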