Clustering is one of the most important unsupervised learning problems; it consists of finding common structure in a collection of unlabeled data. However, because the problem is ill-posed, different runs of the same clustering algorithm on the same dataset usually produce different solutions. In this scenario, choosing a single solution is quite arbitrary. On the other hand, in many applications handling the full set of solutions becomes intractable, so it is often more desirable to provide a limited group of "good" clusterings rather than a single solution. In the present paper we propose least squares consensus clustering. This technique extracts a small number of distinct clustering solutions from an initial (large) ensemble obtained by applying any clustering algorithm to a given dataset. We also define a quality measure and present a graphical visualization of each consensus clustering that makes the strength of the consensus immediately interpretable. We have carried out several numerical experiments on both synthetic and real datasets to illustrate the proposed methodology.
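The abstract does not spell out the least squares criterion, so the following is only a minimal sketch of one common formulation, assumed here for illustration: each partition in the ensemble is encoded as a pairwise co-association matrix, the matrices are averaged, and the ensemble member closest to that average in squared Frobenius distance is selected. The function names `coassociation` and `least_squares_consensus` and the toy ensemble are hypothetical, and the sketch returns a single consensus rather than the small group of solutions the paper proposes.

```python
import numpy as np

def coassociation(labels):
    """Return the n x n 0/1 matrix whose (i, j) entry is 1 when points i and j share a cluster."""
    labels = np.asarray(labels)
    return (labels[:, None] == labels[None, :]).astype(float)

def least_squares_consensus(ensemble):
    """Assumed criterion: pick the member closest to the mean co-association matrix."""
    mats = np.stack([coassociation(p) for p in ensemble])
    mean = mats.mean(axis=0)                        # average pairwise agreement over the ensemble
    dists = ((mats - mean) ** 2).sum(axis=(1, 2))   # squared Frobenius distance of each member to the mean
    best = int(np.argmin(dists))                    # first index attaining the minimum
    return best, mean, dists

# Toy ensemble of four partitions of six points (made-up data).
ensemble = [
    [0, 0, 1, 1, 2, 2],
    [1, 1, 0, 0, 2, 2],   # same partition as the first, with labels permuted
    [0, 0, 0, 1, 2, 2],   # point 2 moved to the first cluster
    [0, 0, 1, 1, 2, 2],
]
best, mean, dists = least_squares_consensus(ensemble)
print("selected partition:", ensemble[best])
print("distances to mean:", dists)
```

Because the co-association encoding is invariant to label permutations, the first, second, and fourth partitions are treated as identical, and the entries of `mean` can be read as a rough indication of how strongly the ensemble agrees on each pair of points.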