Meta Clustering

Authors:
Rich Caruana;Mohamed Elhawary;Nam Nguyen;Casey Smith
Affiliations:
Cornell University, USA;Cornell University, USA;Cornell University, USA;Cornell University, USA
Venue:
ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Year:
2006

Citing 0
Cited 25

Pattern-Miner: integrated management and mining over data mining models

Proceedings of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining
Using Global Optimization to Explore Multiple Solutions of Clustering Problems

KES '08 Proceedings of the 12th international conference on Knowledge-Based Intelligent Information and Engineering Systems, Part III
Robust Clustering by Aggregation and Intersection Methods

KES '08 Proceedings of the 12th international conference on Knowledge-Based Intelligent Information and Engineering Systems, Part III
Monitoring Patterns through an Integrated Management and Mining Tool

ECML PKDD '08 Proceedings of the European conference on Machine Learning and Knowledge Discovery in Databases - Part II
Change analysis in spatial datasets by interestingness comparison

SIGSPATIAL Special
Interactive Visualization Tools for Meta-Clustering

Proceedings of the 2009 conference on New Directions in Neural Networks: 18th Italian Workshop on Neural Networks: WIRN 2008
Metaclustering and Consensus Algorithms for Interactive Data Analysis and Validation

WILF '09 Proceedings of the 8th International Workshop on Fuzzy Logic and Applications
Multiple data structure discovery through global optimisation, meta clustering and consensus methods

International Journal of Knowledge Engineering and Soft Data Paradigms
Clustering web queries

Proceedings of the 18th ACM conference on Information and knowledge management
Global optimization, meta clustering and consensus clustering for class prediction

IJCNN'09 Proceedings of the 2009 international joint conference on Neural Networks
Application notes: data mining in cancer research

IEEE Computational Intelligence Magazine
Towards subjectifying text clustering

Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
A hierarchical information theoretic technique for the discovery of non linear alternative clusterings

Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Learning multiple nonredundant clusterings

ACM Transactions on Knowledge Discovery from Data (TKDD)
A clustering comparison measure using density profiles and its application to the discovery of alternate clusterings

Data Mining and Knowledge Discovery
A polygon-based methodology for mining related spatial datasets

Proceedings of the 1st ACM SIGSPATIAL International Workshop on Data Mining for Geoinformatics
Which clustering do you want? inducing your ideal clustering with minimal feedback

Journal of Artificial Intelligence Research
Kernel based K-medoids for clustering data with uncertainty

ADMA'10 Proceedings of the 6th international conference on Advanced data mining and applications: Part I
Designing an ensemble classifier over subspace classifiers using iterative convergence routine

Proceedings of the 20th ACM international conference on Information and knowledge management
Model-based multidimensional clustering of categorical data

Artificial Intelligence
Model-based clustering of high-dimensional data: Variable selection versus facet determination

International Journal of Approximate Reasoning
Projective clustering ensembles

Data Mining and Knowledge Discovery
How to "alternatize" a clustering algorithm

Data Mining and Knowledge Discovery
An efficient and scalable family of algorithms for combining clusterings

Engineering Applications of Artificial Intelligence
Ensembles for unsupervised outlier detection: challenges and research questions a position paper

ACM SIGKDD Explorations Newsletter

Abstract

Clustering is ill-defined. Unlike supervised learning where labels lead to crisp performance criteria such as accuracy and squared error, clustering quality depends on how the clusters will be used. Devising clustering criteria that capture what users need is difficult. Most clustering algorithms search for optimal clusterings based on a pre-specified clustering criterion. Our approach differs. We search for many alternate clusterings of the data, and then allow users to select the clustering(s) that best fit their needs. Meta clustering first finds a variety of clusterings and then clusters this diverse set of clusterings so that users must only examine a small number of qualitatively different clusterings. We present methods for automatically generating a diverse set of alternate clusterings, as well as methods for grouping clusterings into meta clusters. We evaluate meta clustering on four test problems and two case studies. Surprisingly, clusterings that would be of most interest to users often are not very compact clusterings.
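
The two-stage idea in the abstract (generate many alternate clusterings, then cluster the clusterings) can be illustrated with a minimal, standard-library-only sketch. Everything below is an assumption made for illustration and is not taken from the paper: diversity comes from running k-means under randomly drawn feature weights, dissimilarity between two clusterings is measured as 1 minus the Rand index, and the meta level uses average-link agglomerative merging. The function names (`kmeans`, `generate_clusterings`, `rand_distance`, `meta_cluster`) are hypothetical.

```python
import random
from itertools import combinations


def kmeans(points, k, weights, rng, iters=20):
    """Lloyd's k-means under a per-feature weighted squared Euclidean distance."""
    def dist(p, q):
        return sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q))

    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:  # keep the old center if a cluster went empty
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels


def generate_clusterings(points, k, n_runs, seed=0):
    """Assumed diversity scheme: each run draws fresh random feature weights."""
    rng = random.Random(seed)
    dim = len(points[0])
    return [kmeans(points, k, [rng.random() ** 3 for _ in range(dim)], rng)
            for _ in range(n_runs)]


def rand_distance(a, b):
    """1 - Rand index: fraction of point pairs on which two clusterings disagree."""
    pairs = list(combinations(range(len(a)), 2))
    disagree = sum((a[i] == a[j]) != (b[i] == b[j]) for i, j in pairs)
    return disagree / len(pairs)


def meta_cluster(clusterings, n_meta):
    """Average-link agglomerative grouping of clusterings into n_meta meta clusters."""
    d = {(i, j): rand_distance(clusterings[i], clusterings[j])
         for i, j in combinations(range(len(clusterings)), 2)}

    def link(g, h):
        return sum(d[min(i, j), max(i, j)] for i in g for j in h) / (len(g) * len(h))

    groups = [[i] for i in range(len(clusterings))]
    while len(groups) > n_meta:
        g, h = min(combinations(range(len(groups)), 2),
                   key=lambda gh: link(groups[gh[0]], groups[gh[1]]))
        groups[g] += groups.pop(h)  # g < h, so index g is unaffected by the pop
    return groups


if __name__ == "__main__":
    # Four blobs on the corners of a square: splitting by x or by y are both
    # reasonable 2-clusterings, so different feature weights can favor either.
    rng = random.Random(42)
    pts = [(cx + rng.gauss(0, 0.5), cy + rng.gauss(0, 0.5))
           for cx in (0, 10) for cy in (0, 10) for _ in range(10)]
    found = generate_clusterings(pts, k=2, n_runs=12, seed=1)
    for group in meta_cluster(found, n_meta=2):
        print("meta cluster of runs:", group)
```

A user would then inspect one representative clustering per meta cluster rather than all twelve runs. The pairwise comparison is quadratic in both the number of points and the number of clusterings, so this sketch is only suitable for small inputs.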