Approximate Clustering of Noisy Biomedical Data

  • Authors:
  • Krzysztof Boryczko;Marcin Kurdziel

  • Affiliations:
  • Institute of Computer Science, AGH University of Science and Technology, Kraków, Poland 30-059;Institute of Computer Science, AGH University of Science and Technology, Kraków, Poland 30-059

  • Venue:
  • ICCS '08 Proceedings of the 8th international conference on Computational Science, Part I
  • Year:
  • 2008

Abstract

Classical clustering algorithms often perform poorly on data harboring background noise, i.e. a large number of observations distributed uniformly in the feature space. Here, we present a new density-based algorithm for approximate clustering of such noisy data. The algorithm employs Shared Nearest Neighbor Graphs for estimating local data density and identifying core points, which are assumed to indicate locations of clusters. Partitioning of core points into clusters is performed by means of the Mutual Nearest Neighbor distance measure. This similarity measure is sensitive to changes in local data density, and is thus useful for discovering clusters that differ in this respect. Performance of the presented algorithm was demonstrated on three data sets, two synthetic and one real world. In all cases, meaningful clustering structures were discovered.
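
The core-point step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it uses a generic shared-nearest-neighbor density estimate, and the function names (knn_indices, snn_density, core_points) and the parameters k, min_shared and min_density are assumptions introduced here for illustration. The partitioning of core points by Mutual Nearest Neighbor distance is not shown, since the abstract does not define the measure.

```python
import numpy as np

def knn_indices(points, k):
    """Indices of the k nearest neighbors of each point (the point itself excluded)."""
    diff = points[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(dist, np.inf)  # a point is never its own neighbor
    return np.argsort(dist, axis=1)[:, :k]

def snn_density(points, k, min_shared):
    """Assumed density estimate: number of strong links in the shared nearest neighbor graph.

    Points i and j are strongly linked when each is in the other's k-NN list
    and their k-NN lists have at least min_shared points in common.
    """
    n = len(points)
    nn = knn_indices(points, k)
    member = np.zeros((n, n), dtype=bool)
    member[np.arange(n)[:, None], nn] = True   # member[i, j]: j is in kNN(i)
    counts = member.astype(int)
    shared = counts @ counts.T                 # shared[i, j] = |kNN(i) ∩ kNN(j)|
    mutual = member & member.T
    strong = mutual & (shared >= min_shared)
    return strong.sum(axis=1)

def core_points(points, k=10, min_shared=5, min_density=5):
    """Boolean mask of points whose SNN density reaches min_density (hypothetical threshold)."""
    return snn_density(points, k, min_shared) >= min_density

# Two compact groups of four points and one isolated observation.
pts = np.array([[0, 0], [0, 1], [1, 0], [1, 1],
                [10, 10], [10, 11], [11, 10], [11, 11],
                [5, 30]], dtype=float)
mask = core_points(pts, k=3, min_shared=2, min_density=3)
print(mask)  # the eight grouped points are core, the isolated one is not
```

In this toy input the isolated observation appears in no other point's neighbor list, so it has no strong links and is rejected as a core point. The thresholds are chosen by hand for the example and would need tuning on real data.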