Discriminative cluster analysis

  • Authors:
  • Fernando De la Torre;Takeo Kanade

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh, PA;Carnegie Mellon University, Pittsburgh, PA

  • Venue:
  • ICML '06 Proceedings of the 23rd international conference on Machine learning
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Clustering is one of the most widely used statistical tools for data analysis. Among all existing clustering techniques, k-means is a very popular method because of its ease of programming and because it accomplishes a good trade-off between achieved performance and computational complexity. However, k-means is prone to local minima problems, and it does not scale too well with high dimensional data sets. A common approach to dealing with high dimensional data is to cluster in the space spanned by the principal components (PC). In this paper, we show the benefits of clustering in a low dimensional discriminative space rather than in the PC space (generative). In particular, we propose a new clustering algorithm called Discriminative Cluster Analysis (DCA). DCA jointly performs dimensionality reduction and clustering. Several toy and real examples show the benefits of DCA versus traditional PCA+k-means clustering. Additionally, a new matrix formulation is proposed and connections with related techniques such as spectral graph methods and linear discriminant analysis are provided.