Improving POMDP tractability via belief compression and clustering

Authors:
Xin Li;William K. Cheung;Jiming Liu
Affiliations:
Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong;Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong;Department of Computer Science, Hong Kong Baptist University, Kowloon Tong, Hong Kong
Venue:
IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics
Year:
2010

Citing 10
Cited 2

Efficient dynamic-programming updates in partially observable Markov decision processes

Efficient dynamic-programming updates in partially observable Markov decision processes
Exact and approximate algorithms for partially observable markov decision processes

Exact and approximate algorithms for partially observable markov decision processes
Orthogonal nonnegative matrix t-factorizations for clustering

Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Point-Based Value Iteration for Continuous POMDPs

The Journal of Machine Learning Research
Scaling up: solving POMDPs through value based clustering

AAAI'07 Proceedings of the 22nd national conference on Artificial intelligence - Volume 2
Value-function approximations for partially observable Markov decision processes

Journal of Artificial Intelligence Research
Finding approximate POMDP solutions through belief compression

Journal of Artificial Intelligence Research
Perseus: randomized point-based value iteration for POMDPs

Journal of Artificial Intelligence Research
Probabilistic robot navigation in partially observable environments

IJCAI'95 Proceedings of the 14th international joint conference on Artificial intelligence - Volume 2
Prioritizing Point-Based POMDP Solvers

IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics

On Compressibility and Acceleration of Orthogonal NMF for POMDP Compression

ACML '09 Proceedings of the 1st Asian Conference on Machine Learning: Advances in Machine Learning
Observable subspace solution for irreducible POMDPs with infinite horizon

Proceedings of the Seventh Annual Workshop on Cyber Security and Information Intelligence Research

Quantified Score

Hi-index	0.00

Visualization

Abstract

Partially observable Markov decision process (POMDP) is a commonly adopted mathematical framework for solving planning problems in stochastic environments. However, computing the optimal policy of POMDP for large-scale problems is known to be intractable, where the high dimensionality of the underlying belief space is one of the major causes. In this paper, we propose a hybrid approach that integrates two different approaches for reducing the dimensionality of the belief space: 1) belief compression and 2) value-directed compression. In particular, a novel orthogonal nonnegative matrix factorization is derived for the belief compression, which is then integrated in a value-directed framework for computing the policy. In addition, with the conjecture that a properly partitioned belief space can have its per-cluster intrinsic dimension further reduced, we propose to apply a k-means-like clustering technique to partition the belief space to form a set of sub-POMDPs before applying the dimension reduction techniques to each of them. We have evaluated the proposed belief compression and clustering approaches based on a set of benchmark problems and demonstrated their effectiveness in reducing the cost for computing policies, with the quality of the policies being retained.