X-means: Extending K-means with Efficient Estimation of the Number of Clusters
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Cluster analysis of gene expression data based on self-splitting and merging competitive learning
IEEE Transactions on Information Technology in Biomedicine
Self-splitting competitive learning: a new on-line clustering paradigm
IEEE Transactions on Neural Networks
Self-Splitting Competitive Learning (SSCL) is a powerful algorithm that addresses two difficult problems in clustering: determining the number of clusters and the sensitivity to prototype initialization. SSCL iteratively partitions the data space into natural clusters without a priori information on the number of clusters. It starts with a single prototype and adaptively splits it into multiple prototypes during learning, guided by a split-validity measure, until every natural group is discovered and associated with a prototype. One major drawback of SSCL, however, is its slow learning process, because only one prototype can split at a time. In this paper, we introduce a multiple-splitting scheme to accelerate learning and incorporate prototype merging. In addition, the Bayesian Information Criterion (BIC) score is used to evaluate the clusters. Experiments show that these techniques make the algorithm five times faster than SSCL on large, high-dimensional data sets while achieving better clustering quality.
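As a rough illustration of how a BIC score can evaluate a candidate clustering, the sketch below follows the common X-means-style formulation for a mixture of identical spherical Gaussians; the paper's exact BIC formulation is not given here, so the variance model and parameter count are assumptions:

```python
import numpy as np

def bic_score(X, labels, centers):
    """BIC of a clustering under an identical-spherical-Gaussian model.

    Assumptions (illustrative, not the paper's exact criterion):
    shared spherical variance across clusters, ML variance estimate,
    free parameters = k*d center coordinates + (k-1) mixing weights
    + 1 shared variance. Higher BIC = better model.
    """
    n, d = X.shape
    k = centers.shape[0]
    # Residual sum of squares and pooled ML variance estimate.
    rss = sum(np.sum((X[labels == j] - centers[j]) ** 2) for j in range(k))
    variance = rss / (d * (n - k))
    counts = np.array([np.sum(labels == j) for j in range(k)])
    # Log-likelihood of the data under the assumed mixture model.
    ll = (np.sum(counts * np.log(counts / n))
          - n * d / 2.0 * np.log(2 * np.pi * variance)
          - d * (n - k) / 2.0)
    # Penalize model complexity: p/2 * log(n), with p free parameters.
    p = k * d + k  # k*d centers + (k-1) weights + 1 variance
    return ll - p / 2.0 * np.log(n)
```

In a split-and-merge loop, such a score lets the algorithm compare the model before and after a candidate split (or merge) and keep whichever structure scores higher, rather than relying on the split-validity measure alone.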