Cascade evaluation of clustering algorithms

Authors:
Laurent Candillier;Isabelle Tellier;Fabien Torre;Olivier Bousquet
Affiliations:
GRAppA, Charles de Gaulle University, Lille 3;GRAppA, Charles de Gaulle University, Lille 3;GRAppA, Charles de Gaulle University, Lille 3;Pertinence, Paris
Venue:
ECML'06 Proceedings of the 17th European conference on Machine Learning
Year:
2006

Citing 12

Original Contribution: Stacked generalization. Neural Networks.
C4.5: programs for machine learning.
Bagging predictors. Machine Learning.
Error reduction through learning multiple descriptions. Machine Learning.
Approximate statistical tests for comparing supervised classification learning algorithms. Neural Computation.
Combined 5 × 2 cv F test for comparing supervised classification learning algorithms. Neural Computation.
Cascade Generalization. Machine Learning.
Ensemble Methods in Machine Learning. MCS '00 Proceedings of the First International Workshop on Multiple Classifier Systems.
An introduction to boosting and leveraging. Advanced lectures on machine learning.
Large Margin Methods for Structured and Interdependent Output Variables. The Journal of Machine Learning Research.
A probabilistic estimation framework for predictive modeling analytics. IBM Systems Journal.
SSC: statistical subspace clustering. MLDM'05 Proceedings of the 4th international conference on Machine Learning and Data Mining in Pattern Recognition.

Cited 3

Collaborative clustering with background knowledge. Data & Knowledge Engineering.
Feature interaction in subspace clustering using the Choquet integral. Pattern Recognition.
Stacked trees: a new hybrid visualization method. Proceedings of the International Working Conference on Advanced Visual Interfaces.

Abstract

This paper addresses the evaluation of clustering results and the comparison of clustering algorithms. We propose a new method that enriches a set of independent labeled datasets with the results of clustering, then uses a supervised method to evaluate the benefit of adding this new information to the datasets. We thus adapt the cascade generalization paradigm [1] to the case where an unsupervised learner is combined with a supervised one. We also consider the case where independent supervised learners are trained on each of the groups of data objects created by the clustering [2]. We then conduct experiments using different supervised algorithms to compare various clustering algorithms. The results show that the proposed method behaves coherently; for example, it indicates that clustering algorithms based on complex probabilistic models outperform algorithms based on simpler models.
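The two evaluation schemes described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the toy XOR-style dataset, the single train/test split, k-means as the clustering algorithm, a one-split decision stump as the supervised learner, and all function names. The sketch compares the stump's accuracy on the raw features, on the features enriched with one-hot cluster membership (the cascade scheme), and with one stump trained per cluster (the per-group scheme).

```python
def dist2(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def nearest(p, centers):
    # Index of the closest center.
    return min(range(len(centers)), key=lambda i: dist2(p, centers[i]))

def kmeans(points, k, iters=10):
    # Deterministic farthest-point initialisation, then Lloyd iterations.
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points, key=lambda p: min(dist2(p, c) for c in centers)))
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(p, centers)].append(p)
        centers = [tuple(sum(col) / len(g) for col in zip(*g)) if g else c
                   for g, c in zip(groups, centers)]
    return centers

def majority(labels):
    # Most frequent label; ties go to the smallest label.
    return max(sorted(set(labels)), key=labels.count)

def fit_stump(X, y):
    # Toy supervised learner: the single threshold split with most training hits.
    default = majority(y)
    best = (-1, 0, float("inf"), default, default)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [lab for row, lab in zip(X, y) if row[j] <= t]
            right = [lab for row, lab in zip(X, y) if row[j] > t]
            pl, pr = majority(left), majority(right)
            hits = left.count(pl) + right.count(pr)
            if hits > best[0]:
                best = (hits, j, t, pl, pr)
    return best[1:]

def predict(stump, row):
    j, t, pl, pr = stump
    return pl if row[j] <= t else pr

def accuracy(preds, truth):
    return sum(p == t for p, t in zip(preds, truth)) / len(truth)

# Hypothetical toy data: four tight blobs with XOR-style class labels.
offsets = [(0, 0), (1, 0), (0, 1), (1, 1), (0.5, 0.5)]
blobs = [((0, 0), 0), ((0, 10), 1), ((10, 0), 1), ((10, 10), 0)]
train, test = [], []
for (cx, cy), label in blobs:
    pts = [((cx + dx, cy + dy), label) for dx, dy in offsets]
    train += pts[:3]   # first three points of each blob for training
    test += pts[3:]    # last two for testing
X_tr, y_tr = [p for p, _ in train], [lab for _, lab in train]
X_te, y_te = [p for p, _ in test], [lab for _, lab in test]

# Unsupervised step: cluster the training features only (labels are unused).
k = 4
centers = kmeans(X_tr, k)
a_tr = [nearest(p, centers) for p in X_tr]
a_te = [nearest(p, centers) for p in X_te]

def enrich(p, a):
    # Append one-hot cluster membership to the original features.
    return tuple(p) + tuple(1.0 if i == a else 0.0 for i in range(k))

# Baseline: supervised learner on the original features.
stump = fit_stump(X_tr, y_tr)
base_acc = accuracy([predict(stump, x) for x in X_te], y_te)

# Cascade scheme: same learner on the enriched features.
E_tr = [enrich(p, a) for p, a in zip(X_tr, a_tr)]
E_te = [enrich(p, a) for p, a in zip(X_te, a_te)]
stump_e = fit_stump(E_tr, y_tr)
cascade_acc = accuracy([predict(stump_e, x) for x in E_te], y_te)

# Per-group scheme: one learner per cluster (assumes no cluster is empty).
models = {}
for c in range(k):
    idx = [i for i, a in enumerate(a_tr) if a == c]
    models[c] = fit_stump([X_tr[i] for i in idx], [y_tr[i] for i in idx])
split_acc = accuracy([predict(models[a], x) for x, a in zip(X_te, a_te)], y_te)

print(base_acc, cascade_acc, split_acc)
```

On this toy data the three accuracies are 0.5, 0.75 and 1.0. The gain over the baseline is the quantity that would be compared across clustering algorithms; a real study would repeat this over many datasets and several supervised learners, with a statistical test on the differences.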