An important and still unsolved problem in unsupervised data clustering is how to determine the number of clusters. The proposed slope statistic is a non-parametric, data-driven approach for estimating the number of clusters in a dataset. This technique uses the output of any clustering algorithm and identifies the maximum number of groups that breaks down the structure of the dataset. Intensive Monte Carlo simulation studies show that, for the examples considered, the slope statistic outperforms some popular methods proposed in the literature. Applications to graph clustering and to the iris and breast cancer datasets are shown.