Sparse kernel spectral clustering models for large-scale data analysis

  • Authors:
  • Carlos Alzate; Johan A. K. Suykens

  • Affiliations:
  • Katholieke Universiteit Leuven, Department of Electrical Engineering ESAT-SCD-SISTA, Kasteelpark Arenberg 10, B-3001 Leuven, Belgium (both authors)

  • Venue:
  • Neurocomputing
  • Year:
  • 2011

Quantified Score

Hi-index 0.01

Abstract

Kernel spectral clustering has been formulated within a primal-dual optimization setting, which allows natural extensions to out-of-sample data together with model selection in a learning framework. This is important for predictive purposes and for good generalization capabilities. In the primal, the clustering model is formulated in terms of mappings to high-dimensional feature spaces, as is typical of support vector machines and kernel-based methodologies. The dual problem corresponds to an eigenvalue decomposition of a centered Laplacian matrix derived from pairwise similarities within the data. The out-of-sample extension can also be used to introduce sparsity and to reduce the computational complexity of the resulting eigenvalue problem. In this paper, we propose several methods to obtain sparse and highly sparse kernel spectral clustering models. The proposed approaches are based on structural properties of the solutions when the clusters are well formed. Experimental results with difficult toy examples and images show the applicability of the proposed sparse models with predictive capabilities.
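
The abstract does not state the dual problem explicitly, so the following is only a minimal sketch under stated assumptions. It assumes the weighted kernel PCA form often given for kernel spectral clustering, D^-1 M_D Omega alpha = lambda alpha, with Omega an RBF kernel matrix, D its degree matrix, and M_D a weighted centering matrix. The names `rbf_kernel`, `fit_ksc`, and `score`, the bandwidth `sigma`, and the choice of training on every tenth point are hypothetical and are used here only to illustrate how an out-of-sample extension can shrink the eigenvalue problem. This is not one of the sparse methods proposed in the paper.

```python
import numpy as np


def rbf_kernel(A, B, sigma):
    """Gaussian similarities between the rows of A and the rows of B."""
    sq_dist = (A**2).sum(axis=1)[:, None] + (B**2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-np.maximum(sq_dist, 0.0) / (2.0 * sigma**2))


def fit_ksc(X_train, n_clusters, sigma):
    """Solve the assumed dual problem D^-1 M_D Omega alpha = lambda alpha."""
    n = X_train.shape[0]
    omega = rbf_kernel(X_train, X_train, sigma)
    d_inv = 1.0 / omega.sum(axis=1)  # diagonal of D^-1
    # Weighted centering matrix M_D = I - 1 (1^T D^-1) / (1^T D^-1 1).
    m_d = np.eye(n) - np.outer(np.ones(n), d_inv) / d_inv.sum()
    eigvals, eigvecs = np.linalg.eig(d_inv[:, None] * (m_d @ omega))
    # Keep the n_clusters - 1 eigenvectors with the largest eigenvalues.
    top = np.argsort(-eigvals.real)[: n_clusters - 1]
    alpha = eigvecs[:, top].real
    # Bias terms that make the training scores zero-mean under the D^-1 weighting.
    bias = -(d_inv @ omega @ alpha) / d_inv.sum()
    return alpha, bias


def score(X_new, X_train, alpha, bias, sigma):
    """Out-of-sample scores e(x) = sum_i alpha_i K(x, x_i) + b."""
    return rbf_kernel(X_new, X_train, sigma) @ alpha + bias


# Two well-separated blobs; the eigenproblem is solved on 40 of the 400 points only.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.3, (200, 2)), rng.normal(5.0, 0.3, (200, 2))])
X_sub = X[::10]
alpha, bias = fit_ksc(X_sub, n_clusters=2, sigma=1.0)
# With two clusters, the sign of the single score column gives the assignment.
labels = (score(X, X_sub, alpha, bias, sigma=1.0)[:, 0] > 0).astype(int)
```

In this sketch the eigenvalue problem has size 40 rather than 400, and the remaining points are assigned through the out-of-sample scores. Which of the two blobs receives label 1 depends on the arbitrary sign of the eigenvector.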