Restricted value iteration: theory and algorithms

Authors:
Weihong Zhang; Nevin L. Zhang
Affiliations:
Department of Computer Science, Washington University, Saint Louis, MO; Department of Computer Science, Hong Kong University of Science & Technology, Kowloon, Hong Kong, China
Venue:
Journal of Artificial Intelligence Research
Year:
2005

Abstract

Value iteration is a popular algorithm for finding near-optimal policies for POMDPs. It is inefficient because it must account for the entire belief space, which requires solving a large number of linear programs. In this paper, we study value iteration restricted to belief subsets. We show that, with properly chosen belief subsets, restricted value iteration yields near-optimal policies, and we give a condition for determining whether a given belief subset would bring about savings in space and time. We also apply restricted value iteration to two interesting classes of POMDPs, namely informative POMDPs and near-discernible POMDPs.
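
For orientation, the idea of restricting value iteration to a belief subset can be sketched with the standard POMDP dynamic-programming update. The notation below is an assumption introduced here for illustration and is not taken from the abstract: S, A and Z are the state, action and observation sets, b is a belief (a probability distribution over S), r is the reward function, \gamma is the discount factor, \tau is the belief update, and \mathcal{B}' is a belief subset assumed to contain \tau(b, a, z) whenever it contains b, so that the restricted update refers only to values inside the subset.

```latex
% Belief update after taking action a and observing z (standard Bayes rule)
\tau(b, a, z)(s') = \frac{P(z \mid s', a) \sum_{s \in S} P(s' \mid s, a)\, b(s)}{P(z \mid b, a)},
\qquad
P(z \mid b, a) = \sum_{s' \in S} P(z \mid s', a) \sum_{s \in S} P(s' \mid s, a)\, b(s)

% Standard value-iteration update, applied only to beliefs in the subset B'
V_{n+1}(b) = \max_{a \in A} \Big[ \sum_{s \in S} r(s, a)\, b(s)
  + \gamma \sum_{z \in Z} P(z \mid b, a)\, V_n\big(\tau(b, a, z)\big) \Big],
\qquad b \in \mathcal{B}'
```

Standard value iteration applies the same update to every belief in the full belief space; the restricted form differs only in the domain over which the value function is maintained.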