minCEntropy: A Novel Information Theoretic Approach for the Generation of Alternative Clusterings

Authors:
Nguyen Xuan Vinh;Julien Epps
Affiliations:
-;-
Venue:
ICDM '10 Proceedings of the 2010 IEEE International Conference on Data Mining
Year:
2010

Citing 0
Cited 2

How to "alternatize" a clustering algorithm

Data Mining and Knowledge Discovery
Generating multiple alternative clusterings via globally optimal subspaces

Data Mining and Knowledge Discovery

Quantified Score

Hi-index	0.00

Visualization

Abstract

Traditional clustering has focused on creating a single good clustering solution, while modern, high dimensional data can often be interpreted, and hence clustered, in different ways. Alternative clustering aims at creating multiple clustering solutions that are both of high quality and distinctive from each other. Methods for alternative clustering can be divided into objective-function-oriented and data-transformation-oriented approaches. This paper presents a novel information theoretic-based, objective-function-oriented approach to generate alternative clusterings, in either an unsupervised or semi-supervised manner. We employ the conditional entropy measure for quantifying both clustering quality and distinctiveness, resulting in an analytically consistent combined criterion. Our approach employs a computationally efficient nonparametric entropy estimator, which does not impose any assumption on the probability distributions. We propose a partitional clustering algorithm, named minCEntropy, to concurrently optimize both clustering quality and distinctiveness. minCEntropy requires setting only some rather intuitive parameters, and performs competitively with existing methods for alternative clustering.