Speedup Clustering with Hierarchical Ranking

Authors:
Jianjun Zhou;Joerg Sander
Affiliations:
University of Alberta, Canada;University of Alberta, Canada
Venue:
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Year:
2006

Citing 0
Cited 1

Finding the Nearest Neighbors in Biological Databases Using Less Distance Computations

IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)

Quantified Score

Hi-index	0.00

Visualization

Abstract

Many clustering algorithms in particular hierarchical clustering algorithms do not scale-up well for large data-sets especially when using an expensive distance function. In this paper, we propose a novel approach to perform approximate clustering with high accuracy. We introduce the concept of a pairwise hierarchical ranking to efficiently determine close neighbors for every data object. Empirical results on synthetic and real-life data show a speedup of up to two orders of magnitude over OPTICS while maintaining a high accuracy and up to one order of magnitude over the previously proposed DATA BUBBLES method, which also tries to speedup OPTICS by trading accuracy for speed.