The partially observable Markov decision process (POMDP) is a widely adopted mathematical framework for planning in stochastic environments. However, computing the optimal POMDP policy for large-scale problems is known to be intractable, and the high dimensionality of the underlying belief space is one of the major causes. In this paper, we propose a hybrid approach that integrates two techniques for reducing the dimensionality of the belief space: 1) belief compression and 2) value-directed compression. In particular, we derive a novel orthogonal nonnegative matrix factorization for belief compression and integrate it into a value-directed framework for computing the policy. In addition, based on the conjecture that properly partitioning the belief space can further reduce the intrinsic dimension within each cluster, we apply a k-means-like clustering technique to partition the belief space into a set of sub-POMDPs before applying the dimension reduction techniques to each of them. We evaluated the proposed belief compression and clustering approaches on a set of benchmark problems and demonstrated that they reduce the cost of computing policies while retaining policy quality.
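The cluster-then-compress idea described above can be illustrated with a minimal sketch. It is not the method of the paper: the orthogonal NMF derived there and the value-directed step are not reproduced, and the sketch substitutes plain Euclidean k-means and the standard multiplicative-update NMF as stand-ins. The function names (`kmeans_beliefs`, `nmf`, `compress_per_cluster`) and all parameter defaults are hypothetical, chosen only for illustration.

```python
import numpy as np

def kmeans_beliefs(B, k, iters=20, seed=0):
    """Plain Euclidean k-means over sampled belief vectors (rows of B).
    Assumption: a stand-in for the 'k-means-like' partitioning step."""
    rng = np.random.default_rng(seed)
    # Initialise centers from k distinct sampled beliefs (fancy indexing copies).
    centers = B[rng.choice(len(B), size=k, replace=False)]
    for _ in range(iters):
        dists = ((B[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = B[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    # Final assignment against the final centers.
    dists = ((B[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1), centers

def nmf(B, r, iters=200, seed=0, eps=1e-9):
    """Standard multiplicative-update NMF: B (n x d) ~ W (n x r) @ H (r x d).
    Assumption: plain NMF, without the orthogonality constraint of the paper."""
    rng = np.random.default_rng(seed)
    n, d = B.shape
    W = rng.random((n, r)) + eps
    H = rng.random((r, d)) + eps
    for _ in range(iters):
        # Ratios of nonnegative terms keep W and H nonnegative.
        H *= (W.T @ B) / (W.T @ W @ H + eps)
        W *= (B @ H.T) / (W @ H @ H.T + eps)
    return W, H

def compress_per_cluster(B, k, r):
    """Partition the beliefs, then fit one low-rank factorization per cluster."""
    labels, _ = kmeans_beliefs(B, k)
    models = {}
    for j in range(k):
        members = B[labels == j]
        if len(members) > 0:
            models[j] = nmf(members, r)  # (compressed beliefs W, basis H)
    return models

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    beliefs = rng.dirichlet(np.ones(8), size=100)  # synthetic beliefs over 8 states
    for j, (W, H) in compress_per_cluster(beliefs, k=3, r=2).items():
        err = np.linalg.norm(beliefs_j := W @ H)  # norm of the reconstruction
        print(f"cluster {j}: {W.shape[0]} beliefs compressed to {W.shape[1]} dims")
```

In this sketch each row of `W` is the compressed representation of one belief and `H` is the per-cluster basis. Comparing the reconstruction error of the per-cluster factorizations with a single global factorization at the same rank is one way to probe the conjecture that partitioning lowers the intrinsic dimension within each cluster.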