Spectral clustering with eigenvector selection based on entropy ranking

  • Authors:
  • Feng Zhao;Licheng Jiao;Hanqiang Liu;Xinbo Gao;Maoguo Gong

  • Affiliations:
  • Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Institute of Intelligent Information Processing, Xidian University, Xi'an, PR China;Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Institute of Intelligent Information Processing, Xidian University, Xi'an, PR China;Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Institute of Intelligent Information Processing, Xidian University, Xi'an, PR China;Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Institute of Intelligent Information Processing, Xidian University, Xi'an, PR China;Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education of China, Institute of Intelligent Information Processing, Xidian University, Xi'an, PR China

  • Venue:
  • Neurocomputing
  • Year:
  • 2010

Quantified Score

Hi-index 0.01

Visualization

Abstract

Ng-Jordan-Weiss (NJW) method is one of the most widely used spectral clustering algorithms. For a K clustering problem, this method partitions data using the largest K eigenvectors of the normalized affinity matrix derived from the dataset. It has been demonstrated that the spectral relaxation solution of K-way grouping is located on the subspace of the largest K eigenvectors. However, we find from a lot of experiments that the top K eigenvectors cannot always detect the structure of the data for real pattern recognition problems. So it is necessary to select eigenvectors for spectral clustering. We propose an eigenvector selection method based on entropy ranking for spectral clustering (ESBER). In this method, first all the eigenvectors are ranked according to their importance on clustering, and then a suitable eigenvector combination is obtained from the ranking list. In this paper, we propose two strategies to select eigenvectors in the ranking list of eigenvectors. One is directly adopting the first K eigenvectors in the ranking list. Different to the largest K eigenvectors of NJW method, these K eigenvectors are the most important eigenvectors among all the eigenvectors. The other eigenvector selection strategy is to search a suitable eigenvector combination among the first Km (KmK) eigenvectors in the ranking list. The eigenvector combination obtained by this strategy can reflect the structure of the original data and lead to a satisfying spectral clustering result. Furthermore, we also present computational complexity reduction strategies for ESBER method to deal with large-scale datasets. We have performed experiments on UCI benchmark datasets, MNIST handwritten digits datasets, and Brodatz texture datasets, adopting NJW method for a baseline comparison. The experimental results show that ESBER method is more robust than NJW method. Especially, ESBER method with the latter eigenvector selection strategy can obtain satisfying clustering results in most cases.