In this paper we address the voting mechanism used to combine clustering ensembles under the Evidence Accumulation Clustering framework, which yields the so-called co-association matrix. Different clustering techniques can be applied to this matrix to obtain the combined data partition, and different clustering strategies may yield very different combination results. We propose applying embedding methods to this matrix in an attempt to reduce the sensitivity of the final partition to the clustering method while still obtaining competitive and consistent results. We present a study of several embedding methods applied to this matrix, interpreting it in two ways: (i) as a feature space and (ii) as a similarity space. In the first case we reduce the dimensionality of the feature space; in the second case we obtain a representation constrained by the similarity matrix. By applying several clustering techniques to these new representations, we evaluate the impact of the transformations on the performance and coherence of the resulting data partitions. Experimental results on synthetic and real benchmark datasets show that extracting the relevant features through dimensionality reduction yields more consistent results than applying the clustering algorithms directly to the co-association matrix.
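The two readings of the co-association matrix described in the abstract can be illustrated with a minimal sketch. It assumes the usual evidence-accumulation definition, in which entry (i, j) is the fraction of partitions that place samples i and j in the same cluster. The abstract does not name the embedding methods studied, so PCA (feature-space reading) and classical multidimensional scaling on 1 - C (similarity-space reading) are used here purely as stand-ins. The function names and the toy ensemble are hypothetical.

```python
import numpy as np

def co_association(partitions):
    """Fraction of partitions in which each pair of samples shares a cluster."""
    partitions = np.asarray(partitions)      # shape (n_partitions, n_samples)
    n_partitions, n_samples = partitions.shape
    C = np.zeros((n_samples, n_samples))
    for labels in partitions:
        # One vote for every pair of samples with the same label in this partition.
        C += labels[:, None] == labels[None, :]
    return C / n_partitions

def pca_embed(C, n_components):
    """Feature-space reading: each row of C is a feature vector, reduced by PCA.
    PCA is an assumed stand-in, not necessarily a method used in the paper."""
    X = C - C.mean(axis=0)                   # center each column
    U, s, _ = np.linalg.svd(X, full_matrices=False)
    return U[:, :n_components] * s[:n_components]

def mds_embed(C, n_components):
    """Similarity-space reading: classical MDS on dissimilarities 1 - C.
    Also an assumed stand-in for the embeddings studied in the paper."""
    D2 = (1.0 - C) ** 2                      # squared dissimilarities
    n = C.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n      # centering matrix
    B = -0.5 * J @ D2 @ J                    # double-centered Gram matrix
    w, V = np.linalg.eigh(B)
    idx = np.argsort(w)[::-1][:n_components] # largest eigenvalues first
    return V[:, idx] * np.sqrt(np.clip(w[idx], 0.0, None))

# Hypothetical toy ensemble: three partitions of six samples.
# Label values are arbitrary and need not match across partitions.
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 0, 0, 2, 2],
    [0, 0, 0, 0, 1, 1],
]
C = co_association(ensemble)
Y_feat = pca_embed(C, 2)   # low-dimensional features derived from the rows of C
Y_sim = mds_embed(C, 2)    # coordinates whose distances approximate 1 - C
```

Either `Y_feat` or `Y_sim` could then be passed to any clustering algorithm in place of `C` itself. That is the comparison the abstract describes, although the specific embeddings and clustering algorithms evaluated in the paper may differ from these stand-ins.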