How to "alternatize" a clustering algorithm
Data Mining and Knowledge Discovery
Generating multiple alternative clusterings via globally optimal subspaces
Data Mining and Knowledge Discovery
Hi-index | 0.00 |
Traditional clustering has focused on creating a single good clustering solution, while modern, high dimensional data can often be interpreted, and hence clustered, in different ways. Alternative clustering aims at creating multiple clustering solutions that are both of high quality and distinctive from each other. Methods for alternative clustering can be divided into objective-function-oriented and data-transformation-oriented approaches. This paper presents a novel information theoretic-based, objective-function-oriented approach to generate alternative clusterings, in either an unsupervised or semi-supervised manner. We employ the conditional entropy measure for quantifying both clustering quality and distinctiveness, resulting in an analytically consistent combined criterion. Our approach employs a computationally efficient nonparametric entropy estimator, which does not impose any assumption on the probability distributions. We propose a partitional clustering algorithm, named minCEntropy, to concurrently optimize both clustering quality and distinctiveness. minCEntropy requires setting only some rather intuitive parameters, and performs competitively with existing methods for alternative clustering.