Data clustering by minimizing disconnectivity

  • Authors: Jong-Seok Lee; Sigurdur Olafsson
  • Affiliations: SAS Institute, 100 SAS Campus Drive, Cary, NC 27513, USA; Department of Industrial and Manufacturing Systems Engineering, Iowa State University, Ames, IA 50011, USA
  • Venue: Information Sciences: an International Journal
  • Year: 2011

Abstract

Identifying clusters of arbitrary shape remains a challenge in data clustering. We propose a new measure of cluster quality, disconnectivity, which penalizes the separation of objects that would ideally be clustered together. The measure is based on nearest-neighbor analysis and the principle that an object should belong to the same cluster as its nearest neighbors. We also propose an algorithm, MinDisconnect, that heuristically minimizes disconnectivity. Numerical results indicate that the algorithm identifies clusters of complex, arbitrary shapes effectively and robustly.
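
To make the nearest-neighbor principle concrete, the following is a minimal sketch of one possible disconnectivity-style penalty. It is an assumption for illustration only: the abstract does not give the paper's exact definition, and this is not the MinDisconnect algorithm. The names `knn_indices` and `disconnectivity`, the parameter `k`, and the toy data are all hypothetical. The sketch counts how many (object, nearest neighbor) pairs a given labeling places in different clusters.

```python
from math import dist


def knn_indices(points, k):
    """For each point, return the indices of its k nearest other points (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        others.sort(key=lambda j: dist(p, points[j]))
        neighbors.append(others[:k])
    return neighbors


def disconnectivity(points, labels, k=2):
    """Illustrative penalty (an assumption, not the paper's formula):
    the number of (object, neighbor) pairs assigned to different clusters."""
    neighbors = knn_indices(points, k)
    return sum(
        1
        for i, nbrs in enumerate(neighbors)
        for j in nbrs
        if labels[i] != labels[j]
    )


# Toy data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.1, 0.0), (0.2, 0.0), (5.0, 5.0), (5.1, 5.0), (5.2, 5.0)]
good = [0, 0, 0, 1, 1, 1]  # each point shares a cluster with its neighbors
bad = [0, 1, 0, 1, 0, 1]   # neighbors are frequently split apart

print(disconnectivity(points, good, k=2))
print(disconnectivity(points, bad, k=2))
```

Under this simplified penalty, a labeling that keeps every object with its nearest neighbors scores zero, and each split neighbor pair adds one. A heuristic in the spirit of the abstract would search over labelings to reduce such a score, although the search procedure actually used by MinDisconnect is not described here.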