Approximate Clustering of Noisy Biomedical Data

Authors:
Krzysztof Boryczko;Marcin Kurdziel
Affiliations:
Institute of Computer Science, AGH University of Science and Technology, Kraków, Poland 30-059;Institute of Computer Science, AGH University of Science and Technology, Kraków, Poland 30-059
Venue:
ICCS '08 Proceedings of the 8th international conference on Computational Science, Part I
Year:
2008

Citing 6
Cited 0

CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data

ROCK: a robust clustering algorithm for categorical attributes
Information Systems

Chameleon: Hierarchical Clustering Using Dynamic Modeling
Computer

CLARANS: A Method for Clustering Objects for Spatial Data Mining
IEEE Transactions on Knowledge and Data Engineering

On the Surprising Behavior of Distance Metrics in High Dimensional Spaces
ICDT '01 Proceedings of the 8th International Conference on Database Theory

Clustering Using a Similarity Measure Based on Shared Near Neighbors
IEEE Transactions on Computers

Abstract

Classical clustering algorithms often perform poorly on data harboring background noise, i.e., a large number of observations distributed uniformly in the feature space. Here, we present a new density-based algorithm for approximate clustering of such noisy data. The algorithm employs Shared Nearest Neighbor Graphs to estimate local data density and to identify core points, which are assumed to indicate the locations of clusters. Core points are partitioned into clusters by means of the Mutual Nearest Neighbor distance measure. This measure is sensitive to changes in local data density and is thus useful for discovering clusters that differ in this respect. The performance of the presented algorithm was demonstrated on three data sets, two synthetic and one real-world. In all cases, meaningful clustering structures were discovered.
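
The pipeline outlined in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the exact definitions, so everything below is an assumption. Shared-neighbor density is taken as the number of mutual k-nearest neighbors that share at least `min_shared` neighbors with a point. The Mutual Nearest Neighbor distance is taken as the sum of the two points' ranks in each other's neighbor lists. Core points are grouped as connected components of links whose distance is at most `max_mnn`. All function names, parameters, and default values are hypothetical.

```python
import numpy as np

def knn_ranks(X, k):
    """rank[i, j] is the 1-based position of j among the k nearest
    neighbors of i, or 0 if j is not among them."""
    n = len(X)
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)  # O(n^2) memory
    np.fill_diagonal(d, np.inf)            # a point is not its own neighbor
    neighbors = np.argsort(d, axis=1)[:, :k]
    rank = np.zeros((n, n), dtype=int)
    rows = np.repeat(np.arange(n), k)
    rank[rows, neighbors.ravel()] = np.tile(np.arange(1, k + 1), n)
    return rank

def cluster_core_points(X, k=10, min_shared=5, min_density=4, max_mnn=12):
    """Label core points by cluster; non-core points (assumed noise) get -1."""
    X = np.asarray(X, dtype=float)
    rank = knn_ranks(X, k)
    member = (rank > 0).astype(int)
    shared = member @ member.T             # shared[i, j] = |kNN(i) & kNN(j)|
    mutual = (rank > 0) & (rank.T > 0)     # i and j list each other

    # Assumed density estimate: strong links in the shared-neighbor graph.
    density = (mutual & (shared >= min_shared)).sum(axis=1)
    core = density >= min_density

    # Assumed MNN distance: sum of mutual ranks, defined for mutual pairs only.
    linked = mutual & (rank + rank.T <= max_mnn)
    linked &= core[:, None] & core[None, :]

    # Connected components over the linked core points (depth-first search).
    labels = np.full(len(X), -1)
    next_label = 0
    for seed in np.flatnonzero(core):
        if labels[seed] != -1:
            continue
        labels[seed] = next_label
        stack = [seed]
        while stack:
            i = stack.pop()
            for j in np.flatnonzero(linked[i]):
                if labels[j] == -1:
                    labels[j] = next_label
                    stack.append(j)
        next_label += 1
    return labels
```

Because rank-based distances depend on neighbor order rather than absolute distance, a link in a sparse cluster and a link in a dense cluster are judged on the same scale, which is one plausible reading of the sensitivity to local density mentioned above. The dense distance matrix limits this sketch to small data sets.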