Many clustering algorithms, including cluster ensembles, rely on a random component. Stability of the results across different runs is considered an asset of the algorithm. The cluster ensembles considered here are based on k-means clusterers. Each clusterer is assigned a random target number of clusters, k, and is started from a random initialization. Here, we use 10 artificial and 10 real data sets to study ensemble stability with respect to random k and random initialization. The data sets were chosen to have a small number of clusters (two to seven) and a moderate number of data points (up to a few hundred). Pairwise stability is defined as the adjusted Rand index between pairs of clusterers in the ensemble, averaged across all pairs. Nonpairwise stability is defined as the entropy of the consensus matrix of the ensemble. An experimental comparison with the stability of the standard k-means algorithm was carried out for k from 2 to 20. The results revealed that ensembles are generally more stable, markedly so for larger k. To establish whether stability can serve as a cluster validity index, we first looked at the relationship between stability and accuracy with respect to the number of clusters, k. We found that this relationship depends strongly on the data set, varying from almost perfect positive correlation (0.97, for the glass data) to almost perfect negative correlation (-0.93, for the crabs data). We propose a new combined stability index, defined as the sum of the pairwise individual and ensemble stabilities. This index was found to correlate better with the ensemble accuracy. Following the hypothesis that a point of stability of a clustering algorithm corresponds to a structure found in the data, we used the stability measures to pick the number of clusters. The combined stability index gave the best results.
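The two stability measures described above can be illustrated with a minimal sketch in plain Python. It takes partitions that are already computed (one label list per clusterer) rather than running k-means. The function names are hypothetical. The entropy formula is an assumption: the abstract does not give the exact definition, so the sketch uses the mean binary entropy of the off-diagonal consensus entries as one plausible reading.

```python
from collections import Counter
from itertools import combinations
from math import comb, log2


def adjusted_rand_index(a, b):
    """Adjusted Rand index between two label sequences of equal length."""
    n = len(a)
    if n < 2:
        return 1.0  # no pairs of points to compare
    sum_cells = sum(comb(c, 2) for c in Counter(zip(a, b)).values())
    sum_rows = sum(comb(c, 2) for c in Counter(a).values())
    sum_cols = sum(comb(c, 2) for c in Counter(b).values())
    expected = sum_rows * sum_cols / comb(n, 2)
    max_index = (sum_rows + sum_cols) / 2
    if max_index == expected:
        return 1.0  # degenerate case, e.g. both partitions are trivial
    return (sum_cells - expected) / (max_index - expected)


def pairwise_stability(partitions):
    """Adjusted Rand index averaged over all pairs of partitions."""
    pairs = list(combinations(partitions, 2))
    return sum(adjusted_rand_index(p, q) for p, q in pairs) / len(pairs)


def consensus_matrix(partitions):
    """Entry (i, j) is the fraction of partitions placing i and j together."""
    n = len(partitions[0])
    return [
        [sum(p[i] == p[j] for p in partitions) / len(partitions)
         for j in range(n)]
        for i in range(n)
    ]


def consensus_entropy(matrix):
    """Mean binary entropy of off-diagonal entries (assumed definition)."""
    def h(m):
        if m in (0.0, 1.0):
            return 0.0
        return -(m * log2(m) + (1 - m) * log2(1 - m))

    n = len(matrix)
    values = [h(matrix[i][j]) for i in range(n) for j in range(i + 1, n)]
    return sum(values) / len(values)


# Three toy partitions of four points; labels are arbitrary cluster ids.
p1 = [0, 0, 1, 1]
p2 = [1, 1, 0, 0]  # same grouping as p1 with relabelled clusters
p3 = [0, 1, 0, 1]  # a different grouping

stability_pairwise = pairwise_stability([p1, p2, p3])
stability_entropy = consensus_entropy(consensus_matrix([p1, p2, p3]))
```

Under this reading, a higher average adjusted Rand index and a lower consensus entropy both indicate a more stable set of partitions, since an entropy of zero means every pair of points is either always or never clustered together.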