Local information-based fast approximate spectral clustering

  • Authors:
  • Jiangzhong Cao; Pei Chen; Qingyun Dai; Wing-Kuen Ling

  • Affiliations:
  • School of Information Science and Technology, Sun Yat-sen University, Guangzhou 510006, China and School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China; School of Information Science and Technology, Sun Yat-sen University, Guangzhou 510006, China; School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China; School of Information Engineering, Guangdong University of Technology, Guangzhou 510006, China

  • Venue:
  • Pattern Recognition Letters
  • Year:
  • 2014

Abstract

Spectral clustering has become one of the most popular clustering approaches in recent years. However, its high computational complexity prevents its application to large-scale datasets. To address this complexity, approximate spectral clustering methods have been proposed. These methods reduce computational cost either by using approximation techniques, such as the Nyström method, or by constructing a smaller representative dataset on which spectral clustering is performed. However, their computational efficiency comes at the cost of degraded clustering performance. In this paper, we propose an efficient approximate spectral clustering method that improves clustering performance by utilizing local information among the data while retaining scalability to large-scale datasets. Specifically, we improve approximate spectral clustering in two respects. First, a sparse affinity graph is adopted to improve the performance of spectral clustering on the small representative dataset. Second, local interpolation is utilized to improve the extension of the clustering result from the representative dataset to the full dataset. Experiments on several real-world datasets show that the proposed method is efficient and outperforms state-of-the-art approximate spectral clustering algorithms.
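
The two ingredients named in the abstract can be illustrated with a minimal sketch. The abstract does not specify the paper's actual formulation, so everything below is an assumption made for illustration: the sparse affinity graph is built as a symmetrised k-nearest-neighbour graph with Gaussian weights, and the local interpolation is an inverse-distance weighted average over the k nearest representative points. The function names (`knn_affinity`, `interpolate_embedding`) and the parameters `k` and `sigma` are hypothetical, and the spectral embedding of the representatives is taken as given rather than computed.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two points given as sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def knn_affinity(reps, k, sigma):
    """Sparse affinity over representative points (assumed k-NN construction).

    Returns a list of dicts: W[i][j] is the Gaussian weight between
    representatives i and j, stored only if j is among the k nearest
    neighbours of i or vice versa.
    """
    n = len(reps)
    W = [dict() for _ in range(n)]
    for i in range(n):
        others = [j for j in range(n) if j != i]
        neighbours = sorted(others, key=lambda j: sq_dist(reps[i], reps[j]))[:k]
        for j in neighbours:
            w = math.exp(-sq_dist(reps[i], reps[j]) / (2 * sigma ** 2))
            W[i][j] = w
            W[j][i] = w  # symmetrise so the graph is undirected
    return W


def interpolate_embedding(point, reps, rep_embedding, k):
    """Extend an embedding to a new point (assumed inverse-distance scheme).

    The embedding of `point` is the weighted average of the embeddings of
    its k nearest representatives, weighted by inverse squared distance.
    """
    nearest = sorted(range(len(reps)), key=lambda j: sq_dist(point, reps[j]))[:k]
    weights = [1.0 / (sq_dist(point, reps[j]) + 1e-12) for j in nearest]
    total = sum(weights)
    dim = len(rep_embedding[0])
    return [
        sum(w * rep_embedding[j][d] for w, j in zip(weights, nearest)) / total
        for d in range(dim)
    ]


# Toy usage: two well-separated pairs of representatives.
reps = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
W = knn_affinity(reps, k=1, sigma=1.0)
# Placeholder one-dimensional embedding, standing in for eigenvector entries.
rep_embedding = [[0.0], [0.0], [1.0], [1.0]]
embedded = interpolate_embedding((9.5, 10.5), reps, rep_embedding, k=2)
```

In a full pipeline under these assumptions, the eigenvectors of the graph Laplacian derived from `W` would supply `rep_embedding`, and the interpolated embeddings of all data points would then be grouped, for example with k-means.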