A Probability Model for Projective Clustering on High Dimensional Data

  • Authors:
  • Lifei Chen;Qingshan Jiang;Shengrui Wang

  • Affiliations:
  • -;-;-

  • Venue:
  • ICDM '08 Proceedings of the 2008 Eighth IEEE International Conference on Data Mining
  • Year:
  • 2008

Quantified Score

Hi-index 0.01

Visualization

Abstract

Clustering high dimensional data is a big challenge in data mining due to the curse of dimensionality. To solve this problem, projective clustering has been defined as an extension of traditional clustering that seeks to find projected clusters in subsets of dimensions of a data space. In this paper, the problem of modeling projected clusters is first discussed, and an extended Gaussian model is proposed. Second, a general objective criterion used with $k$-means type projective clustering is presented based on the model. Finally, the expressions to learn model parameters are derived and then used in a new algorithm named FPC to perform fuzzy clustering on high dimensional data. The experimental results on document clustering show the effectiveness of the proposed clustering model.