Value iteration working with belief subset

Authors:
Weixiong Zhang; Nevin L. Zhang
Affiliations:
Computational Intelligence Center and Department of Computer Science, Washington University, St. Louis, MO; Department of Computer Science, Hong Kong University of Science and Technology, Clear Water Bay, Kowloon, Hong Kong
Venue:
Eighteenth National Conference on Artificial Intelligence
Year:
2002

Abstract

Value iteration is a popular algorithm for solving POMDPs, but it is inefficient in practice, primarily because it must conduct value updates for all belief states in the (continuous) belief space. In this paper, we study value iteration that works with a subset of the belief space, i.e., it conducts value updates only for belief states in the subset. We present a way to select a belief subset and describe an algorithm that conducts value iteration over the selected subset. The algorithm is attractive in that it works with a belief subset yet retains the quality of the generated values. Given a POMDP, we show how to determine a priori whether the selected subset is a proper subset of the belief space. If it is, the algorithm gains advantages in both representation space and running time.
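To make the idea of conducting value updates only for belief states in a subset more concrete, the following is a minimal sketch and not the algorithm of the paper. The abstract does not say how the subset is selected or represented, so the sketch swaps in a standard point-based Bellman backup over a hand-picked finite set of beliefs. The model is the textbook two-state tiger problem, and every name here (T, O, R, GAMMA, belief_update, backup, value_iteration_on_subset, value) is an illustrative assumption.

```python
# Illustrative only: point-based value updates restricted to a finite belief
# subset. This is NOT the paper's algorithm; the model is the classic tiger problem.
GAMMA = 0.95
N_STATES, N_OBS = 2, 2  # states: tiger-left, tiger-right; obs: hear-left, hear-right

# T[a][s][s2]: transition probabilities. Actions: 0 listen, 1 open-left, 2 open-right.
T = [[[1.0, 0.0], [0.0, 1.0]],
     [[0.5, 0.5], [0.5, 0.5]],
     [[0.5, 0.5], [0.5, 0.5]]]
# O[a][s2][o]: probability of observation o after action a lands in state s2.
O = [[[0.85, 0.15], [0.15, 0.85]],
     [[0.5, 0.5], [0.5, 0.5]],
     [[0.5, 0.5], [0.5, 0.5]]]
# R[a][s]: immediate reward for taking action a in state s.
R = [[-1.0, -1.0], [-100.0, 10.0], [10.0, -100.0]]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def belief_update(b, a, o):
    """Bayes update: return (next belief, probability of observing o)."""
    unnorm = [O[a][s2][o] * sum(T[a][s][s2] * b[s] for s in range(N_STATES))
              for s2 in range(N_STATES)]
    p_o = sum(unnorm)
    if p_o == 0.0:
        return None, 0.0
    return [x / p_o for x in unnorm], p_o


def backup(b, alphas):
    """One Bellman backup at belief b; returns the maximizing alpha vector."""
    best_vec, best_val = None, float("-inf")
    for a in range(len(T)):
        vec = list(R[a])
        for o in range(N_OBS):
            # Project every current alpha vector back through (a, o).
            projected = [[sum(T[a][s][s2] * O[a][s2][o] * alpha[s2]
                              for s2 in range(N_STATES))
                          for s in range(N_STATES)]
                         for alpha in alphas]
            g = max(projected, key=lambda p: dot(b, p))
            vec = [v + GAMMA * gi for v, gi in zip(vec, g)]
        val = dot(b, vec)
        if val > best_val:
            best_vec, best_val = vec, val
    return best_vec


def value_iteration_on_subset(beliefs, n_iters=50):
    """Repeat backups, but only at the beliefs in the chosen subset."""
    lower = min(min(r) for r in R) / (1.0 - GAMMA)  # pessimistic initial value
    alphas = [[lower] * N_STATES]
    for _ in range(n_iters):
        alphas = [backup(b, alphas) for b in beliefs]
    return alphas


def value(b, alphas):
    """Value of belief b under a set of alpha vectors."""
    return max(dot(b, alpha) for alpha in alphas)


if __name__ == "__main__":
    subset = [[i / 10.0, 1.0 - i / 10.0] for i in range(11)]
    alphas = value_iteration_on_subset(subset)
    for b in subset:
        print(b, round(value(b, alphas), 3))
```

Each sweep costs one backup per belief in the subset, so the work per iteration grows with the size of the subset and not with the size of the full belief space. Whether the resulting values keep the quality of full value iteration depends on how the subset is chosen, which is the question the paper addresses and this sketch does not.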