Enhancing minimum spanning tree-based clustering by removing density-based outliers

  • Authors:
  • Xiaochun Wang;Xia Li Wang;Cong Chen;D. Mitchell Wilkes

  • Affiliations:
  • Xi'an Jiaotong University, People's Republic of China;Chang'an University, People's Republic of China;Xi'an Jiaotong University, People's Republic of China;Vanderbilt University, USA

  • Venue:
  • Digital Signal Processing
  • Year:
  • 2013

Abstract

Traditional minimum spanning tree (MST)-based clustering algorithms partition a data set using only the information carried by the edges of the tree. Because this gives them limited information about the structure underlying the data, these algorithms are vulnerable to outliers. To address this issue, this paper presents a simple yet efficient MST-inspired clustering algorithm. It computes a local density factor for each data point during the construction of an MST and discards outliers, i.e., points whose local density factor exceeds a threshold, to increase the separation between clusters. The algorithm is easy to implement, requiring only an implementation of iDistance as its k-nearest neighbor search structure. Experiments performed on both small low-dimensional data sets and large high-dimensional data sets demonstrate the efficacy of our method.
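
The general idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: the local density factor is taken here to be a point's mean k-nearest-neighbor distance divided by the average of that quantity over its neighbors; a brute-force neighbor search stands in for iDistance; the factors are computed before the MST is built rather than during its construction; and clusters are obtained by cutting the longest MST edges. All function and parameter names (`density_factors`, `mst_cluster`, `threshold`, `n_clusters`) are hypothetical.

```python
import math


def knn(points, k):
    """Brute-force k-nearest neighbors (an assumed stand-in for iDistance)."""
    result = []
    for i, p in enumerate(points):
        dists = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append(dists[:k])
    return result


def density_factors(points, k):
    """Assumed density factor: own mean k-NN distance / neighbors' mean k-NN distance."""
    nbrs = knn(points, k)  # assumes len(points) > k
    mean_d = [sum(d for d, _ in nb) / len(nb) for nb in nbrs]
    factors = []
    for i, nb in enumerate(nbrs):
        nb_mean = sum(mean_d[j] for _, j in nb) / len(nb)
        factors.append(mean_d[i] / nb_mean if nb_mean > 0 else 1.0)
    return factors


def mst_edges(points):
    """Prim's algorithm on the complete Euclidean graph; returns (weight, u, v) edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    edges = []
    if n:
        best[0] = 0.0
    for _ in range(n):
        # Pick the cheapest vertex not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return edges


def mst_cluster(points, k, threshold, n_clusters):
    """Drop points whose factor exceeds `threshold`, then cut the longest MST edges."""
    factors = density_factors(points, k)
    keep = [i for i, f in enumerate(factors) if f <= threshold]
    outliers = [i for i, f in enumerate(factors) if f > threshold]
    kept_pts = [points[i] for i in keep]

    edges = sorted(mst_edges(kept_pts))
    cut = max(0, len(edges) - (n_clusters - 1))  # keep all but the longest edges
    root = list(range(len(kept_pts)))  # union-find over the kept points

    def find(x):
        while root[x] != x:
            root[x] = root[root[x]]
            x = root[x]
        return x

    for _, u, v in edges[:cut]:
        root[find(u)] = find(v)

    # Map original point index -> component id of the remaining forest.
    labels = {keep[i]: find(i) for i in range(len(kept_pts))}
    return labels, outliers
```

Calling `mst_cluster(points, k, threshold, n_clusters)` on a list of coordinate tuples returns a dictionary mapping each retained point's index to a cluster id, together with the indices of the discarded points. The brute-force search and the Prim loop are both quadratic in the number of points, so this sketch is suitable only for small data sets; the abstract's efficiency claims depend on the iDistance index, which is not reproduced here.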