P3C: A Robust Projected Clustering Algorithm

  • Authors:
  • Gabriela Moise;Jorg Sander;Martin Ester

  • Affiliations:
  • University of Alberta, Canada;University of Alberta, Canada;Simon Fraser University, Canada

  • Venue:
  • ICDM '06 Proceedings of the Sixth International Conference on Data Mining
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Projected clustering has emerged as a possible solution to the challenges associated with clustering in high dimensional data. A projected cluster is a subset of points together with a subset of attributes, such that the cluster points project onto a small range of values in each of these attributes, and are uniformly distributed in the remaining attributes. Existing algorithms for projected clustering rely on parameters whose appropriate values are difficult to set by the user, or are unable to identify projected clusters with few relevant attributes. In this paper, we present a robust algorithm for projected clustering that can effectively discover projected clusters in the data while minimizing the number of parameters required as input. In contrast to all previous approaches, our algorithm can discover, under very general conditions, the true number of projected clusters. We show through an extensive experimental evaluation that our algorithm: (1) significantly outperforms existing algorithms for projected clustering in terms of accuracy; (2) is effective in detecting very low-dimensional projected clusters embedded in high dimensional spaces; (3) is effective in detecting clusters with varying orientation in their relevant subspaces; (4) is scalable with respect to large data sets and high number of dimensions.