X-means: Extending K-means with Efficient Estimation of the Number of Clusters
ICML '00 Proceedings of the Seventeenth International Conference on Machine Learning
Cluster analysis of gene expression data based on self-splitting and merging competitive learning
IEEE Transactions on Information Technology in Biomedicine
Self-splitting competitive learning: a new on-line clustering paradigm
IEEE Transactions on Neural Networks
Self-Splitting Competitive Learning (SSCL) is a powerful algorithm that addresses two difficult problems in clustering: determining the number of clusters and the sensitivity to prototype initialization. SSCL iteratively partitions the data space into natural clusters without a priori information on the number of clusters. It starts with a single prototype and adaptively splits it into multiple prototypes during learning, guided by a split-validity measure, until every natural group is discovered and associated with a prototype. One major drawback of SSCL, however, is its slow learning process, because only one prototype can split at a time. In this paper, we introduce a multiple-splitting scheme to accelerate learning and incorporate prototype merging. In addition, the Bayesian Information Criterion (BIC) score is used to evaluate the clusters. Experiments show that these techniques make the algorithm five times faster than SSCL on large, high-dimensional data sets while achieving better clustering quality.
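As a rough illustration of how a BIC score can evaluate a candidate clustering, the sketch below follows the common X-means-style formulation for a mixture of identical spherical Gaussians; the paper's exact BIC formulation is not given here, so the variance model and parameter count are assumptions:

```python
import numpy as np

def bic_score(X, labels, centers):
    """BIC of a clustering under an identical-spherical-Gaussian model.

    Assumptions (illustrative, not the paper's exact criterion):
    shared spherical variance across clusters, ML variance estimate,
    free parameters = k*d center coordinates + (k-1) mixing weights
    + 1 shared variance. Higher BIC = better model.
    """
    n, d = X.shape
    k = centers.shape[0]
    # Residual sum of squares and pooled ML variance estimate.
    rss = sum(np.sum((X[labels == j] - centers[j]) ** 2) for j in range(k))
    variance = rss / (d * (n - k))
    counts = np.array([np.sum(labels == j) for j in range(k)])
    # Log-likelihood of the data under the assumed mixture model.
    ll = (np.sum(counts * np.log(counts / n))
          - n * d / 2.0 * np.log(2 * np.pi * variance)
          - d * (n - k) / 2.0)
    # Penalize model complexity: p/2 * log(n), with p free parameters.
    p = k * d + k  # k*d centers + (k-1) weights + 1 variance
    return ll - p / 2.0 * np.log(n)
```

In a split-and-merge loop, such a score lets the algorithm compare the model before and after a candidate split (or merge) and keep whichever structure scores higher, rather than relying on the split-validity measure alone.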