Fast semi-supervised clustering with enhanced spectral embedding

  • Authors:
  • L. C. Jiao;Fanhua Shang;Fei Wang;Yuanyuan Liu

  • Affiliations:
  • Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xidian University, Mailbox 224, No. 2 South TaiBai Road, Xi'an 710071, China.;Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xidian University, Mailbox 224, No. 2 South TaiBai Road, Xi'an 710071, China.;Healthcare Transformation Group, IBM T. J. Watson Research Center at Hawthorne, NY, USA;Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Xidian University, Mailbox 224, No. 2 South TaiBai Road, Xi'an 710071, China.

  • Venue:
  • Pattern Recognition
  • Year:
  • 2012

Abstract

In recent years, semi-supervised clustering (SSC) has attracted considerable interest from the machine learning and data mining communities. In this paper, we propose a novel SSC approach with enhanced spectral embedding (ESE), which considers the geometric structure of the data and also makes use of given side information such as pairwise constraints. Specifically, we first construct a symmetry-favored k-NN graph, which is highly robust to noise and outliers and reflects the underlying manifold structure of the data. We then learn the enhanced spectral embedding, moving toward an ideal data representation that is as consistent as possible with the given pairwise constraints. Finally, by regularizing the spectral embedding, we formulate learning the new data representation as a semidefinite-quadratic-linear programming (SQLP) problem, which can be solved efficiently. Experimental results on a variety of synthetic and real-world data sets show that our ESE approach outperforms state-of-the-art SSC algorithms in both speed and quality on vector-based and graph-based clustering.
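
The first two stages described in the abstract (graph construction and spectral embedding) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes one common reading of a "symmetry-favored" k-NN graph, namely W = A + A^T with A a directed k-NN Gaussian affinity, so that mutual neighbors receive double weight. The function names, the Gaussian kernel, and the parameters `k` and `sigma` are illustrative assumptions. The constraint-driven enhancement and the SQLP step are not shown.

```python
import numpy as np


def symmetry_favored_knn_graph(X, k=2, sigma=1.0):
    """Assumed construction: W = A + A.T, with A a directed k-NN Gaussian affinity."""
    n = X.shape[0]
    # Pairwise squared Euclidean distances between all points.
    sq = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
    np.fill_diagonal(sq, np.inf)  # a point is never its own neighbor
    # Indices of the k nearest neighbors of each point.
    nn = np.argsort(sq, axis=1)[:, :k]
    rows = np.repeat(np.arange(n), k)
    cols = nn.ravel()
    A = np.zeros((n, n))
    A[rows, cols] = np.exp(-sq[rows, cols] / (2.0 * sigma ** 2))
    # Mutual k-NN pairs appear in both A and A.T, so their weight is doubled.
    return A + A.T


def spectral_embedding(W, dim):
    """Eigenvectors of the normalized Laplacian for the `dim` smallest eigenvalues."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(np.maximum(d, 1e-12))
    L = np.eye(W.shape[0]) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L)  # eigenvalues are returned in ascending order
    return vecs[:, :dim]


# Toy data: two well-separated groups of three points each.
X = np.array([[0, 0], [0, 1], [1, 0],
              [10, 10], [10, 11], [11, 10]], dtype=float)
W = symmetry_favored_knn_graph(X, k=2, sigma=1.0)
Y = spectral_embedding(W, dim=2)
```

In this toy case every point's two nearest neighbors lie in its own group, so `W` has no edges between the groups. A semi-supervised method in the spirit of ESE would then adjust the embedding `Y` using the pairwise constraints.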