A graph-theoretical clustering method based on two rounds of minimum spanning trees

  • Authors:
  • Caiming Zhong;Duoqian Miao;Ruizhi Wang

  • Affiliations:
  • Department of Computer Science and Technology, Tongji University, Shanghai 201804, PR China and Key Laboratory of Embedded System & Service Computing, Ministry of Education of China, Shanghai 2018 ...;Department of Computer Science and Technology, Tongji University, Shanghai 201804, PR China and Key Laboratory of Embedded System & Service Computing, Ministry of Education of China, Shanghai 2018 ...;Department of Computer Science and Technology, Tongji University, Shanghai 201804, PR China and Key Laboratory of Embedded System & Service Computing, Ministry of Education of China, Shanghai 2018 ...

  • Venue:
  • Pattern Recognition
  • Year:
  • 2010

Abstract

Many clustering approaches have been proposed in the literature, but most of them are sensitive to differences in cluster size, shape and density. In this paper, we present a graph-theoretical clustering method that is robust to such differences. Based on a graph composed of two rounds of minimum spanning trees (MST), the proposed method (2-MSTClus) classifies clustering problems into two groups, namely separated cluster problems and touching cluster problems, and automatically identifies which of the two groups a problem belongs to. It contains two clustering algorithms, which deal with separated clusters and touching clusters in two phases, respectively. In the first phase, two rounds of minimum spanning trees are employed to construct a graph and detect separated clusters, which cover both distance-separated and density-separated clusters. In the second phase, touching clusters, which are subgroups produced in the first phase, are partitioned by comparing cuts on the two rounds of minimum spanning trees. The proposed method is robust to varied cluster sizes, shapes and densities, and can discover the number of clusters. Experimental results on synthetic and real datasets demonstrate the performance of the proposed method.
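
The graph construction mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say how the second round is built; the code below assumes that the second tree is the minimum spanning tree of the complete Euclidean graph after the edges of the first tree have been removed, and that the final graph is the union of both trees. The function names `prim_mst` and `two_round_mst` are hypothetical, and the cluster detection and cut comparison steps of 2-MSTClus are not reproduced here.

```python
import math


def prim_mst(points, forbidden=frozenset()):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    Edges listed in `forbidden` are skipped. Returns a set of edges (i, j)
    with i < j. If the allowed edges do not connect all points, a minimum
    spanning forest is returned instead of a single tree.
    """
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known edge weight into each vertex
    parent = [-1] * n       # other endpoint of that cheapest edge
    edges = set()
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree. A vertex whose cost
        # is still infinite is unreachable and starts a new component.
        u = min((v for v in range(n) if not in_tree[v]), key=lambda v: best[v])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.add((min(u, parent[u]), max(u, parent[u])))
        # Relax the edges from u to every vertex still outside the tree.
        for v in range(n):
            if in_tree[v] or (min(u, v), max(u, v)) in forbidden:
                continue
            d = math.dist(points[u], points[v])
            if d < best[v]:
                best[v] = d
                parent[v] = u
    return edges


def two_round_mst(points):
    """Return (t1, t2): the first-round MST and the second-round MST.

    Assumption: the second round is computed on the graph with the
    first-round edges removed, so the two edge sets are disjoint.
    """
    t1 = prim_mst(points)
    t2 = prim_mst(points, forbidden=frozenset(t1))
    return t1, t2
```

Calling `two_round_mst(points)` on a list of coordinate tuples returns two disjoint edge sets whose union is the graph on which, under the assumption above, separated clusters would then be searched for. The quadratic-time Prim loop is chosen for brevity rather than speed.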