DBSCAN can identify clusters of arbitrary shape, and one-pass clustering is fast and efficient. Building on these two strengths, this paper proposes a two-stage hybrid clustering algorithm. DBSCAN is first extended to handle data with categorical attributes, and the one-pass clustering algorithm is then combined with this improved DBSCAN. In the first stage, the one-pass clustering algorithm groups the data into an initial partition (which we call the original partition). In the second stage, the improved DBSCAN algorithm merges the clusters of that partition to produce the final clusters. The proposed algorithm has nearly linear time complexity, so it can be applied to large-scale datasets. Experimental results on real and synthetic datasets show that the two-stage hybrid algorithm identifies clusters of arbitrary shape, as DBSCAN does, while running more efficiently than DBSCAN, and that it is effective and practical.
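The two-stage idea can be illustrated with a minimal sketch. This is not the paper's implementation: it assumes numeric points with Euclidean distance instead of the categorical-attribute handling the abstract mentions, represents each first-stage cluster by its centroid and point count, and treats a first-stage cluster as a core object when the total point count within `eps` of its centroid reaches `min_pts`. The function names and the parameters `radius`, `eps`, and `min_pts` are hypothetical.

```python
from math import dist  # Euclidean distance, Python 3.8+


def one_pass(points, radius):
    """Stage 1: scan the data once; a point joins the nearest centroid
    within `radius`, otherwise it starts a new cluster."""
    centroids, counts, assign = [], [], []
    for p in points:
        best, best_d = -1, radius
        for i, c in enumerate(centroids):
            d = dist(p, c)
            if d <= best_d:
                best, best_d = i, d
        if best < 0:
            centroids.append(tuple(p))
            counts.append(1)
            assign.append(len(centroids) - 1)
        else:
            n = counts[best]
            # Incrementally update the centroid (running mean).
            centroids[best] = tuple(
                (ci * n + pi) / (n + 1) for ci, pi in zip(centroids[best], p)
            )
            counts[best] = n + 1
            assign.append(best)
    return centroids, counts, assign


def merge_with_dbscan(centroids, counts, eps, min_pts):
    """Stage 2: DBSCAN-style merge over the stage-1 clusters.
    Assumption: density is the summed point count of centroids within `eps`."""
    n = len(centroids)
    neighbors = [
        [j for j in range(n) if dist(centroids[i], centroids[j]) <= eps]
        for i in range(n)
    ]
    core = [sum(counts[j] for j in nb) >= min_pts for nb in neighbors]
    labels = [-1] * n  # -1 marks noise
    cluster_id = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        labels[i] = cluster_id
        stack = [i]
        while stack:
            cur = stack.pop()
            if not core[cur]:
                continue  # border objects are labeled but not expanded
            for j in neighbors[cur]:
                if labels[j] == -1:
                    labels[j] = cluster_id
                    stack.append(j)
        cluster_id += 1
    return labels


def two_stage(points, radius, eps, min_pts):
    """Map every input point to the final label of its stage-1 cluster."""
    centroids, counts, assign = one_pass(points, radius)
    merged = merge_with_dbscan(centroids, counts, eps, min_pts)
    return [merged[a] for a in assign]


# Toy data: an elongated chain, a small compact group, and one outlier.
points = [(0, 0), (0.5, 0), (1, 0), (1.5, 0), (2, 0),
          (10, 10), (10.5, 10), (50, 50)]
labels = two_stage(points, radius=0.6, eps=1.5, min_pts=2)
print(labels)  # [0, 0, 0, 0, 0, 1, 1, -1]
```

In this sketch the second stage operates only on the first-stage cluster summaries, so its quadratic neighbor search runs over far fewer objects than the raw data. That is one plausible reading of how the hybrid keeps the overall cost close to a single scan.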