Adaptive dimension reduction for clustering high dimensional data

  • Authors:
  • Chris Ding; Xiaofeng He; Hongyuan Zha; Horst D. Simon

  • Venue:
  • ICDM '02 Proceedings of the 2002 IEEE International Conference on Data Mining
  • Year:
  • 2002

Abstract

It is well known that, for high-dimensional data clustering, standard algorithms such as EM and K-means are often trapped in a local minimum. Many initialization methods have been proposed to tackle this problem, but with only limited success. In this paper we propose a new approach that resolves this problem by repeated dimension reductions, such that K-means or EM is performed only in very low dimensions. Cluster membership is utilized as a bridge between the reduced-dimensional subspace and the original space, providing flexibility and ease of implementation. Clustering analysis performed on highly overlapped Gaussians, DNA gene expression profiles, and internet newsgroups demonstrates the effectiveness of the proposed algorithm.
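
The loop described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the subspaces are chosen, so everything specific here is an assumption: the first reduction uses PCA, each later reduction uses the subspace spanned by the current cluster centroids, and the clustering step is plain Lloyd's K-means. The function names (`centroids`, `kmeans`, `adaptive_reduce_cluster`) are hypothetical and are not taken from the paper.

```python
import numpy as np

def centroids(X, labels, k, fallback):
    """Mean of each cluster; an empty cluster keeps its fallback row."""
    out = fallback.copy()
    for j in range(k):
        members = X[labels == j]
        if len(members):
            out[j] = members.mean(axis=0)
    return out

def kmeans(Z, centers, n_iter=100):
    """Plain Lloyd's K-means started from the given centers."""
    for _ in range(n_iter):
        dist = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        new_centers = centroids(Z, labels, len(centers), centers)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def adaptive_reduce_cluster(X, k, n_rounds=5, seed=0):
    """Alternate between low-dimensional clustering and subspace updates."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)

    # Assumption: the initial subspace is the top k-1 principal directions.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    basis = Vt[:k - 1].T
    Z = Xc @ basis
    init = Z[rng.choice(len(Z), size=k, replace=False)]
    labels = kmeans(Z, init)

    for _ in range(n_rounds):
        # Bridge: memberships found in the reduced space define
        # centroids in the original space.
        C = centroids(Xc, labels, k, np.zeros((k, Xc.shape[1])))
        # Assumption: the new subspace is spanned by those centroids.
        _, _, Vt = np.linalg.svd(C, full_matrices=False)
        basis = Vt[:k - 1].T
        Z = Xc @ basis
        new_labels = kmeans(Z, C @ basis)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

Calling `adaptive_reduce_cluster(X, k)` on an `(n_samples, n_features)` array returns one integer label per row. Only the label vector is carried between rounds, which is what lets the reduced space change freely from one round to the next.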