Exact and approximate algorithms for partially observable Markov decision processes
Decomposing Large-Scale POMDP Via Belief State Analysis
IAT '05 Proceedings of the IEEE/WIC/ACM International Conference on Intelligent Agent Technology
Orthogonal nonnegative matrix t-factorizations for clustering
Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
Finding approximate POMDP solutions through belief compression
Journal of Artificial Intelligence Research
Perseus: randomized point-based value iteration for POMDPs
Journal of Artificial Intelligence Research
Point-based value iteration: an anytime algorithm for POMDPs
IJCAI'03 Proceedings of the 18th international joint conference on Artificial intelligence
Efficient planning in large POMDPs through policy graph based factorized approximations
ECML PKDD'10 Proceedings of the 2010 European conference on Machine learning and knowledge discovery in databases: Part III
The high dimensionality of a POMDP's belief state space is one major cause of the intractability of optimal policy computation. Belief compression refers to the methodology of projecting the belief state space onto a low-dimensional one to alleviate the problem. In this paper, we propose a novel orthogonal non-negative matrix factorization (O-NMF) for the projection. The proposed O-NMF not only factors the belief state space by minimizing the reconstruction error, but also, owing to its orthogonality, allows the compressed POMDP formulation to be computed efficiently in a value-directed manner, so that the value function takes the same values for corresponding belief states in the original and compressed state spaces. We have tested the proposed approach on a number of benchmark problems, and the empirical results confirm its effectiveness in achieving substantial computational cost savings in policy computation.