Cumulative Voting Consensus Method for Partitions with Variable Number of Clusters

Authors:
Hanan G. Ayad;Mohamed S. Kamel
Affiliations:
-;-
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2008

Citing 27
Cited 29

Algorithms for clustering data

Algorithms for clustering data
Elements of information theory

Elements of information theory
Bagging predictors

Machine Learning
A decision-theoretic generalization of on-line learning and an application to boosting

Journal of Computer and System Sciences - Special issue: 26th annual ACM symposium on the theory of computing & STOC'94, May 23–25, 1994, and second annual European conference on computational learning theory (EuroCOLT'95), March 13–15, 1995
The Random Subspace Method for Constructing Decision Forests

IEEE Transactions on Pattern Analysis and Machine Intelligence
Data clustering: a review

ACM Computing Surveys (CSUR)
Reinterpreting the Category Utility Function

Machine Learning
Random Forests

Machine Learning
Evidence Accumulation Clustering Based on the K-Means Algorithm

Proceedings of the Joint IAPR International Workshop on Structural, Syntactic, and Statistical Pattern Recognition
Multiclassifier Systems: Back to the Future

MCS '02 Proceedings of the Third International Workshop on Multiple Classifier Systems
Information-theoretical methods in clustering

Information-theoretical methods in clustering
Cluster ensembles --- a knowledge reuse framework for combining multiple partitions

The Journal of Machine Learning Research
A divisive information theoretic feature clustering algorithm for text classification

The Journal of Machine Learning Research
Bagging for Path-Based Clustering

IEEE Transactions on Pattern Analysis and Machine Intelligence
Combining Multiple Weak Clusterings

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Privacy-preserving Distributed Clustering using Generative Models

ICDM '03 Proceedings of the Third IEEE International Conference on Data Mining
Pattern Classification (2nd Edition)

Pattern Classification (2nd Edition)
Axiomatic Consensus Theory in Group Choice and Biomathematics (Frontiers in Applied Mathematics, 29)

Axiomatic Consensus Theory in Group Choice and Biomathematics (Frontiers in Applied Mathematics, 29)
Combining Pattern Classifiers: Methods and Algorithms

Combining Pattern Classifiers: Methods and Algorithms
Ensembles of Partitions via Data Resampling

ITCC '04 Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'04) Volume 2
Analysis of Consensus Partition in Cluster Ensemble

ICDM '04 Proceedings of the Fourth IEEE International Conference on Data Mining
Combining Multiple Clusterings Using Evidence Accumulation

IEEE Transactions on Pattern Analysis and Machine Intelligence
Non-redundant clustering with conditional ensembles

Proceedings of the eleventh ACM SIGKDD international conference on Knowledge discovery in data mining
Clustering Ensembles: Models of Consensus and Weak Partitions

IEEE Transactions on Pattern Analysis and Machine Intelligence
Finding natural clusters using multi-clusterer combiner based on shared nearest neighbors

MCS'03 Proceedings of the 4th international conference on Multiple classifier systems
Cluster-Based cumulative ensembles

MCS'05 Proceedings of the 6th international conference on Multiple Classifier Systems
Divergence measures based on the Shannon entropy

IEEE Transactions on Information Theory

Edge Detection from Global and Local Views Using an Ensemble of Multiple Edge Detectors

ISVC '08 Proceedings of the 4th International Symposium on Advances in Visual Computing, Part II
Enhanced bisecting k-means clustering using intermediate cooperation

Pattern Recognition
An Evidence Accumulation Approach to Constrained Clustering Combination

MLDM '09 Proceedings of the 6th International Conference on Machine Learning and Data Mining in Pattern Recognition
CBIR of spine X-ray images on inter-vertebral disc space and shape profiles using feature ranking and voting consensus

Data & Knowledge Engineering
Fragment-based clustering ensembles

Proceedings of the 18th ACM conference on Information and knowledge management
A graph-theoretical clustering method based on two rounds of minimum spanning trees

Pattern Recognition
Collaborative clustering with background knowledge

Data & Knowledge Engineering
On voting-based consensus of cluster ensembles

Pattern Recognition
Cooperative clustering

Pattern Recognition
A novel hierarchical-clustering-combination scheme based on fuzzy-similarity relations

IEEE Transactions on Fuzzy Systems
Estimation of the number of clusters using heterogeneous multiple classifier system

ICANN'11 Proceedings of the 21st international conference on Artificial neural networks - Volume Part II
Comparing clustering and metaclustering algorithms

MLDM'11 Proceedings of the 7th international conference on Machine learning and data mining in pattern recognition
A generative dyadic aspect model for evidence accumulation clustering

SIMBAD'11 Proceedings of the First international conference on Similarity-based pattern recognition
A metric to evaluate a cluster by eliminating effect of complement cluster

KI'11 Proceedings of the 34th Annual German conference on Advances in artificial intelligence
Hybrid cluster ensemble framework based on the random combination of data transformation operators

Pattern Recognition
Positional and confidence voting-based consensus functions for fuzzy cluster ensembles

Fuzzy Sets and Systems
Generalized Adjusted Rand Indices for cluster ensembles

Pattern Recognition
A new asymmetric criterion for cluster validation

CIARP'11 Proceedings of the 16th Iberoamerican Congress conference on Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications
From cluster ensemble to structure ensemble

Information Sciences: an International Journal
Consensus clustering based on constrained self-organizing map and improved Cop-Kmeans ensemble in intelligent decision support systems

Knowledge-Based Systems
A max metric to evaluate a cluster

HAIS'12 Proceedings of the 7th international conference on Hybrid Artificial Intelligent Systems - Volume Part I
Cluster ensembles

Wiley Interdisciplinary Reviews: Data Mining and Knowledge Discovery
A clustering ensemble based on a modified normalized mutual information metric

AMT'12 Proceedings of the 8th international conference on Active Media Technology
Combining multiple clusterings of chemical structures using cumulative voting-based aggregation algorithm

ACIIDS'13 Proceedings of the 5th Asian conference on Intelligent Information and Database Systems - Volume Part II
Adaptive cumulative voting-based aggregation algorithm for combining multiple clusterings of chemical structures

ACIIDS'13 Proceedings of the 5th Asian conference on Intelligent Information and Database Systems - Volume Part II
Cluster ensemble selection based on relative validity indexes

Data Mining and Knowledge Discovery
A theoretic framework of K-means-based consensus clustering

IJCAI'13 Proceedings of the Twenty-Third international joint conference on Artificial Intelligence
Agreement-based fuzzy C-means for clustering data with blocks of features

Neurocomputing
Effects of resampling method and adaptation on clustering ensemble efficacy

Artificial Intelligence Review

Abstract

Over the past few years, there has been renewed interest in the consensus clustering problem. Several new methods have been proposed for finding a consensus partition for a set of n data objects that optimally summarizes an ensemble. In this paper, we propose new consensus clustering algorithms with computational complexity linear in n. We consider clusterings generated with a random number of clusters, which we describe by categorical random variables. We introduce the idea of cumulative voting as a solution to the problem of cluster label alignment, where, unlike the common one-to-one voting scheme, a probabilistic mapping is computed. We seek a first summary of the ensemble that minimizes the average squared distance between the mapped partitions and the optimal representation of the ensemble, where the reference clustering is selected by maximizing the information content as measured by the entropy. We describe cumulative vote weighting schemes and corresponding algorithms to compute an empirical probability distribution summarizing the ensemble. Given the arbitrary number of clusters of the input partitions, we formulate the problem of extracting the optimal consensus as that of finding a compressed summary of the estimated distribution that preserves maximum relevant information. An efficient solution is obtained using an agglomerative algorithm that minimizes the average generalized Jensen-Shannon divergence within the cluster. The empirical study demonstrates significant gains in accuracy and superior performance compared to several recent consensus clustering algorithms.
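
Two quantities named in the abstract, the entropy used to select a reference clustering and the generalized Jensen-Shannon divergence, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (partition_entropy, select_reference, generalized_js) are hypothetical, entropy is assumed to be computed from cluster-size proportions, and the divergence uses the standard definition H(sum_i w_i p_i) - sum_i w_i H(p_i).

```python
import math
from collections import Counter


def entropy(p):
    """Shannon entropy in bits of a discrete distribution (zero terms skipped)."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def partition_entropy(labels):
    """Entropy of one partition, taken over its cluster-size proportions.

    Assumption: a partition is a flat list of cluster labels, one per object.
    """
    n = len(labels)
    return entropy([count / n for count in Counter(labels).values()])


def select_reference(ensemble):
    """Index of the maximum-entropy partition (hypothetical helper)."""
    return max(range(len(ensemble)), key=lambda i: partition_entropy(ensemble[i]))


def generalized_js(dists, weights):
    """Generalized Jensen-Shannon divergence of weighted distributions.

    dists: equal-length probability vectors; weights: non-negative, summing to 1.
    """
    size = len(dists[0])
    mixture = [sum(w * p[j] for w, p in zip(weights, dists)) for j in range(size)]
    return entropy(mixture) - sum(w * entropy(p) for w, p in zip(weights, dists))


if __name__ == "__main__":
    # Three toy partitions of four objects with 2, 2, and 3 clusters.
    ensemble = [[0, 0, 0, 1], [0, 0, 1, 1], [0, 1, 2, 2]]
    print(select_reference(ensemble))
    print(generalized_js([[1.0, 0.0], [0.0, 1.0]], [0.5, 0.5]))
```

In this toy ensemble the third partition has the most balanced and most numerous clusters, so it has the highest entropy and is picked as the reference. The divergence is zero for identical distributions and grows as they separate, which is why an agglomerative procedure would merge the pair of clusters with the smallest value first.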