Acceleration of DBSCAN-based clustering with reduced neighborhood evaluations

Authors:
Andreas Thom; Oliver Kramer
Affiliations:
Department of Computer Science, Technische Universität Dortmund, Dortmund, Germany; International Computer Science Institute, Berkeley, CA
Venue:
KI'10 Proceedings of the 33rd annual German conference on Advances in artificial intelligence
Year:
2010

Abstract

DBSCAN is a density-based clustering technique that is well suited to discovering clusters of arbitrary shape and to handling noise. The number of clusters does not have to be known in advance. Its runtime is dominated by the cost of computing the ε-neighborhood of each point of the data set. Besides methods that reduce the query complexity of nearest neighbor search, other approaches concentrate on reducing the number of necessary ε-neighborhood evaluations. In this paper we propose a heuristic that selects a reduced number of points for the neighborhood search and uses efficient data structures and algorithms to reduce the runtime significantly. Unlike previous approaches, the number of necessary evaluations is independent of the dimensionality of the data space. We evaluate the performance of the new approach experimentally on artificial test cases and on problems from the UCI machine learning repository.
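To make the cost described in the abstract concrete, the following is a minimal sketch of textbook DBSCAN with a brute-force ε-neighborhood query and a counter for the number of neighborhood evaluations. It is not the heuristic proposed in the paper, whose details the abstract does not give. The names `region_query`, `dbscan`, `eps`, and `min_pts` are illustrative assumptions, and Python 3.8+ is assumed for `math.dist`.

```python
import math


def region_query(points, i, eps):
    """Brute-force eps-neighborhood: indices of all points within eps of points[i]."""
    return [j for j, q in enumerate(points) if math.dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts):
    """Baseline DBSCAN (illustrative, not the paper's method).

    Returns (labels, queries): labels[i] is a cluster id >= 0 or -1 for noise,
    and queries is the number of eps-neighborhood evaluations performed.
    """
    UNVISITED, NOISE = None, -1
    labels = [UNVISITED] * len(points)
    queries = 0
    cluster = -1

    for i in range(len(points)):
        if labels[i] is not UNVISITED:
            continue
        neighbors = region_query(points, i, eps)
        queries += 1
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabeled as a border point
            continue

        # points[i] is a core point: start a new cluster and expand it.
        cluster += 1
        labels[i] = cluster
        seeds = list(neighbors)
        k = 0
        while k < len(seeds):
            j = seeds[k]
            k += 1
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
                continue
            if labels[j] is not UNVISITED:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points, j, eps)
            queries += 1
            if len(j_neighbors) >= min_pts:
                seeds.extend(j_neighbors)  # j is also a core point

    return labels, queries


if __name__ == "__main__":
    # Two small square clusters plus one isolated point (made-up data).
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11),
            (50, 50)]
    labels, queries = dbscan(data, eps=1.5, min_pts=3)
    print(labels, queries)
```

In this baseline every point triggers exactly one neighborhood query, so the query count equals the size of the data set. That per-point evaluation is the cost that approaches of the kind described in the abstract aim to reduce.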