Clustering Ensembles: Models of Consensus and Weak Partitions

Authors:
Alexander Topchy;Anil K. Jain;William Punch
Affiliations:
IEEE;IEEE;-
Venue:
IEEE Transactions on Pattern Analysis and Machine Intelligence
Year:
2005

Abstract

Clustering ensembles have emerged as a powerful method for improving both the robustness and the stability of unsupervised classification solutions. However, finding a consensus clustering from multiple partitions is a difficult problem that can be approached from graph-based, combinatorial, or statistical perspectives. This study extends previous research on clustering ensembles in several respects. First, we introduce a unified representation for multiple clusterings and formulate the corresponding categorical clustering problem. Second, we propose a probabilistic model of consensus using a finite mixture of multinomial distributions in a space of clusterings. A combined partition is found as a solution to the corresponding maximum-likelihood problem using the EM algorithm. Third, we define a new consensus function that is related to the classical intraclass variance criterion using the generalized mutual information definition. Finally, we demonstrate the efficacy of combining partitions generated by weak clustering algorithms that use data projections and random data splits. A simple explanatory model is offered for the behavior of combinations of such weak clustering components. Combination accuracy is analyzed as a function of several parameters that control the power and resolution of component partitions as well as the number of partitions. We also analyze clustering ensembles with incomplete information and the effect of missing cluster labels on the quality of overall consensus. Experimental results demonstrate the effectiveness of the proposed methods on several real-world data sets.
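
The mixture-model idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: it assumes each object is represented by its vector of cluster labels across the component partitions, models those vectors with a mixture of products of multinomials, and fits the mixture with EM. The function name `em_consensus`, the symbols `alpha` and `theta`, the smoothing constant, and the deterministic initialization from the first partition are all assumptions made here for illustration and reproducibility; random restarts are the more common choice for EM.

```python
import math


def em_consensus(labels, n_clusters, n_iter=50):
    """Consensus partition from a label matrix via EM (illustrative sketch).

    labels: list of N rows; row i holds the H cluster labels (ints 0..k_j-1)
            that the H component partitions assigned to object i.
    Returns a list of N consensus labels in range(n_clusters).
    """
    n, h = len(labels), len(labels[0])
    # Number of distinct labels used by each component partition.
    k_j = [max(row[j] for row in labels) + 1 for j in range(h)]

    # Assumption: seed the soft assignments from the first partition
    # instead of a random start, so the run is deterministic.
    resp = []
    for row in labels:
        hot = row[0] % n_clusters
        off = 0.1 / max(n_clusters - 1, 1)
        resp.append([0.9 if m == hot else off for m in range(n_clusters)])

    for _ in range(n_iter):
        # M-step: mixing weights and per-partition multinomial parameters.
        alpha = [sum(r[m] for r in resp) / n for m in range(n_clusters)]
        theta = [[[1e-6] * k_j[j] for j in range(h)] for _ in range(n_clusters)]
        for r, row in zip(resp, labels):
            for m in range(n_clusters):
                for j, lab in enumerate(row):
                    theta[m][j][lab] += r[m]
        for m in range(n_clusters):
            for j in range(h):
                total = sum(theta[m][j])
                theta[m][j] = [v / total for v in theta[m][j]]

        # E-step: posterior responsibility of each component for each object,
        # computed in log space to avoid underflow.
        for i, row in enumerate(labels):
            logp = [
                math.log(alpha[m] + 1e-12)
                + sum(math.log(theta[m][j][lab]) for j, lab in enumerate(row))
                for m in range(n_clusters)
            ]
            top = max(logp)
            w = [math.exp(v - top) for v in logp]
            s = sum(w)
            resp[i] = [x / s for x in w]

    # Hard consensus label: the most probable component for each object.
    return [r.index(max(r)) for r in resp]


# Three component partitions of eight objects. The second partition uses
# swapped label names, and the third disagrees on objects 3 and 7.
example_labels = [
    [0, 1, 0], [0, 1, 0], [0, 1, 0], [0, 1, 1],
    [1, 0, 1], [1, 0, 1], [1, 0, 1], [1, 0, 0],
]
print(em_consensus(example_labels, n_clusters=2))
```

On this toy input the sketch groups the first four objects together and the last four together, even though the component partitions name their clusters differently and one of them mislabels two objects. Because the model works on label vectors only, no correspondence between cluster names across partitions has to be solved beforehand.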