Fast large-scale spectral clustering by sequential shrinkage optimization

  • Authors:
  • Tie-Yan Liu;Huai-Yuan Yang;Xin Zheng;Tao Qin;Wei-Ying Ma

  • Affiliations:
  • Microsoft Research Asia, Haidian District, Beijing, P.R. China;Peking University, Beijing, P.R. China;Tsinghua University, Beijing, P.R. China;Tsinghua University, Beijing, P.R. China;Microsoft Research Asia, Haidian District, Beijing, P.R. China

  • Venue:
  • ECIR'07 Proceedings of the 29th European conference on IR research
  • Year:
  • 2007

Quantified Score

Hi-index 0.00

Visualization

Abstract

In many applications, we need to cluster large-scale data objects. However, some recently proposed clustering algorithms such as spectral clustering can hardly handle large-scale applications due to the complexity issue, although their effectiveness has been demonstrated in previous work. In this paper, we propose a fast solver for spectral clustering. In contrast to traditional spectral clustering algorithms that first solve an eigenvalue decomposition problem, and then employ a clustering heuristic to obtain labels for the data points, our new approach sequentially decides the labels of relatively well-separated data points. Because the scale of the problem shrinks quickly during this process, it can be much faster than the traditional methods. Experiments on both synthetic data and a large collection of product records show that our algorithm can achieve significant improvement in speed as compared to traditional spectral clustering algorithms.