Accelerating spectral clustering with partial supervision

  • Authors:
  • Dimitrios Mavroeidis

  • Affiliations:
  • European Commission, Joint Research Centre, Ispra, Italy

  • Venue:
  • Data Mining and Knowledge Discovery
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Spectral Clustering is a popular learning paradigm that employs the eigenvectors and eigenvalues of an appropriate input matrix for approximating the clustering objective. Albeit its empirical success in diverse application areas, spectral clustering has been criticized for its inefficiency when dealing with large-size datasets. This is mainly due to the fact that the complexity of most eigenvector algorithms is cubic with respect to the number of instances and even memory efficient iterative eigensolvers (such as the Power Method) may converge very slowly to the desired eigenvector solutions. In this paper, inspired from the relevant work on Pagerank we propose a semi-supervised framework for spectral clustering that provably improves the efficiency of the Power Method for computing the Spectral Clustering solution. The proposed method is extremely suitable for large and sparse matrices, where it is demonstrated to converge to the eigenvector solution with just a few Power Method iterations. The proposed framework reveals a novel perspective of semi-supervised spectral methods and demonstrates that the efficiency of spectral clustering can be enhanced not only by data compression but also by introducing the appropriate supervised bias to the input Laplacian matrix. Apart from the efficiency gains, the proposed framework is also demonstrated to improve the quality of the derived cluster models.