We consider the following problem: given a set of clusterings, find a single clustering that agrees as much as possible with all of them. This problem, clustering aggregation, arises naturally in various contexts. For example, clustering categorical data is an instance of the problem: each categorical variable can be viewed as a clustering of the input rows. Moreover, clustering aggregation can be used as a meta-clustering method to improve the robustness of clusterings. The problem formulation does not require a priori information about the number of clusters, and it gives a natural way of handling missing values. We give a formal statement of the clustering-aggregation problem, discuss related work, and suggest a number of algorithms. For several of the methods we provide theoretical guarantees on the quality of the solutions. We also show how sampling can be used to scale the algorithms to large data sets. Finally, we present an extensive empirical evaluation demonstrating the usefulness of the problem and of the solutions.
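To make the problem statement concrete, here is a minimal Python sketch. It assumes, as an illustration only, that agreement between two clusterings is measured by counting the pairs of objects that one clustering places together and the other separates. The abstract does not spell out the measure, and the function names (`disagreement`, `total_disagreement`, `best_input_clustering`, `columns_as_clusterings`) are hypothetical. The aggregation step is a simple baseline that returns the input clustering with the smallest total disagreement; it is not presented as one of the algorithms proposed in the paper.

```python
from itertools import combinations


def disagreement(c1, c2):
    """Count object pairs that one clustering joins and the other splits.

    A clustering is a list of labels, one per object (assumed encoding).
    """
    return sum(
        (c1[u] == c1[v]) != (c2[u] == c2[v])
        for u, v in combinations(range(len(c1)), 2)
    )


def total_disagreement(candidate, clusterings):
    """Sum of pairwise disagreements between a candidate and every input."""
    return sum(disagreement(candidate, c) for c in clusterings)


def best_input_clustering(clusterings):
    """Baseline aggregate: the input clustering closest to all the others."""
    return min(clusterings, key=lambda c: total_disagreement(c, clusterings))


def columns_as_clusterings(rows):
    """View each categorical column as one clustering of the rows."""
    return [list(column) for column in zip(*rows)]


if __name__ == "__main__":
    # Three clusterings of four objects, given as label lists.
    a = [0, 0, 1, 1]
    b = [0, 0, 1, 2]
    c = [0, 1, 2, 2]
    print(best_input_clustering([a, b, c]))

    # Categorical rows: each attribute induces its own clustering.
    rows = [("red", "small"), ("red", "small"),
            ("blue", "small"), ("blue", "large")]
    color, size = columns_as_clusterings(rows)
    print(disagreement(color, size))
```

Because labels are compared only for equality, the same functions work for integer cluster ids and for categorical attribute values. The pairwise count is quadratic in the number of objects, which suggests why sampling matters for large data sets.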