This paper presents a clustering algorithm named HYBRID. HYBRID has two phases: the first generates a set of spherical atom-clusters of the same size, and the second merges these atom-clusters into a set of molecule-clusters. In the first phase, an incremental clustering method generates the atom-clusters within the limits of the available memory. In the second phase, an edge expanding process lets HYBRID discover molecule-clusters of arbitrary size and shape. During edge expansion, HYBRID considers not only the distance between two atom-clusters but also how close their densities are. As a result, HYBRID can eliminate the impact of outliers while discovering more homogeneous molecule-clusters. HYBRID has the following advantages: low time and space complexity, no need for user involvement to guide the clustering procedure, the ability to handle clusters of arbitrary size and shape, and a strong ability to eliminate outliers.
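The two-phase idea can be illustrated with a minimal sketch. It is not the HYBRID implementation: the abstract does not specify the incremental method, the edge expanding rule, or any parameters, so everything below is an assumption made for illustration. Phase 1 is approximated by a leader-style single pass that builds fixed-radius atoms, and phase 2 by a breadth-first expansion that links two atoms only if their centers are close and their point counts (a stand-in for density, since all atoms share one radius) are similar. The names `build_atoms`, `merge_atoms`, `radius`, `link_dist`, and `density_ratio` are hypothetical.

```python
from collections import deque
from math import dist


def build_atoms(points, radius):
    """Phase 1 (assumed): one incremental pass that builds fixed-radius atoms.

    A point joins the first atom whose center lies within `radius`;
    otherwise it starts a new atom. Centers are not updated afterwards.
    """
    atoms = []  # each atom: {"center": tuple, "count": int}
    for p in points:
        for atom in atoms:
            if dist(p, atom["center"]) <= radius:
                atom["count"] += 1
                break
        else:
            atoms.append({"center": tuple(p), "count": 1})
    return atoms


def merge_atoms(atoms, link_dist, density_ratio):
    """Phase 2 (assumed): expand across edges between compatible atoms.

    Two atoms are linked when their centers are within `link_dist` and
    the ratio of the smaller count to the larger is at least
    `density_ratio`. Returns one molecule label per atom.
    """
    def linked(a, b):
        close = dist(a["center"], b["center"]) <= link_dist
        lo, hi = sorted((a["count"], b["count"]))
        return close and lo / hi >= density_ratio

    labels = [None] * len(atoms)
    next_label = 0
    for start in range(len(atoms)):
        if labels[start] is not None:
            continue
        labels[start] = next_label
        queue = deque([start])
        while queue:  # breadth-first expansion from the seed atom
            i = queue.popleft()
            for j in range(len(atoms)):
                if labels[j] is None and linked(atoms[i], atoms[j]):
                    labels[j] = next_label
                    queue.append(j)
        next_label += 1
    return labels
```

In this sketch a sparse atom sitting next to a dense one fails the density test and stays in its own molecule, which is one simple way to keep an outlier from being absorbed into a neighboring cluster. The pairwise scan in `merge_atoms` is quadratic in the number of atoms and is kept only for brevity.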