Algorithms for clustering data
Algorithms for clustering data
BIRCH: an efficient data clustering method for very large databases
SIGMOD '96 Proceedings of the 1996 ACM SIGMOD international conference on Management of data
CURE: an efficient clustering algorithm for large databases
SIGMOD '98 Proceedings of the 1998 ACM SIGMOD international conference on Management of data
Statistical Pattern Recognition: A Review
IEEE Transactions on Pattern Analysis and Machine Intelligence
ACM Computing Surveys (CSUR)
Pattern Classification (2nd Edition)
Pattern Classification (2nd Edition)
IEEE Transactions on Knowledge and Data Engineering
Survey of clustering algorithms
IEEE Transactions on Neural Networks
Hi-index | 0.00 |
Existing clustering algorithms, such as single-link clustering, k-means, CURE, and CSM are designed to find clusters based on pre-defined parameters specified by users. These algorithms may be unsuccessful if the choice of parameters is inappropriate with respect to the data set being clustered. Most of these algorithms work very well for compact and hyperspherical clusters. In this paper, a new hybrid clustering algorithm called Self-Partition and Self-Merging (SPSM) is proposed. The SPSM algorithm partitions the input data set into several subclusters in the first phase and, then, removes the noisy data in the second phase. In the third phase, the normal subclusters are continuously merged to form the larger clusters based on the inter-cluster distance and intra-cluster distance criteria. From the experimental results, the SPSM algorithm is very efficient to handle the noisy data set, and to cluster the data sets of arbitrary shapes of different density.