Geometric Algorithms for Density-Based Data Clustering

Authors:
Danny Z. Chen; Michiel H. M. Smid; Bin Xu
Venue:
ESA '02 Proceedings of the 10th Annual European Symposium on Algorithms
Year:
2002

Abstract

We present new geometric approximation and exact algorithms for the density-based data clustering problem in d-dimensional space R^d (for any constant integer d ≥ 2). Previously known algorithms for this problem are efficient only for uniformly distributed points; in the worst case, they all run in Θ(n^2) time, where n is the number of input points. Our approximation algorithm, based on the ε-fuzzy distance function, takes O(n log n) time for any given fixed value ε > 0, and our exact algorithms take sub-quadratic time. The running times and output quality of our algorithms do not depend on any particular data distribution. We believe that our fast approximation algorithm is of considerable practical importance, while our sub-quadratic exact algorithms are more of theoretical interest. We implemented our approximation algorithm, and the experimental results show that it is efficient on arbitrary input point sets.
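
For context, the quadratic worst case mentioned in the abstract can be illustrated with a brute-force density-based clustering sketch in the style of DBSCAN. This is not the authors' algorithm and does not use their fuzzy distance function; the function name `dbscan_bruteforce` and the parameters `radius` and `min_pts` are illustrative assumptions. The neighbor computation compares every pair of points, which is where the quadratic cost arises regardless of how the data are distributed.

```python
from collections import deque
from math import dist

NOISE = -1  # label for points that belong to no cluster


def dbscan_bruteforce(points, radius, min_pts):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    Hypothetical baseline: a point is a "core" point if at least
    min_pts points (itself included) lie within distance radius.
    """
    n = len(points)
    # Quadratic step: every point is compared against every point.
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= radius]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = NOISE  # may be relabelled later as a border point
            continue
        # Grow a new cluster outward from the core point i.
        labels[i] = cluster
        queue = deque(neighbors[i])
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is also a core point
        cluster += 1
    return labels


points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
print(dbscan_bruteforce(points, radius=1.5, min_pts=3))
```

On this small input the two tight groups form separate clusters and the isolated point is marked as noise. Replacing the all-pairs neighbor computation with a faster (possibly approximate) range query is the kind of step at which a sub-quadratic method would have to differ from this baseline.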