A hierarchical information theoretic technique for the discovery of non linear alternative clusterings

Authors:
Xuan-Hong Dang; James Bailey
Affiliations:
The University of Melbourne, Melbourne, Australia; The University of Melbourne, Melbourne, Australia
Venue:
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2010

Abstract

Discovering alternative clusterings is an important way to explore complex datasets: it lets the user view clustering behaviour from different perspectives and thus explore new hypotheses. However, current algorithms for alternative clustering have focused mainly on linear scenarios and may not perform as desired on datasets containing clusters with non-linear shapes. Our goal in this paper is to address this challenge of non-linearity. In particular, we propose a novel algorithm to uncover an alternative clustering that is distinctly different from an existing reference clustering. Our technique is information-theoretic: it aims to ensure the quality of the alternative clustering by maximizing the mutual information between clustering labels and data observations, whilst at the same time ensuring its distinctiveness by minimizing the information shared between the two clusterings. We perform experiments to assess our method against a wide range of alternative clustering algorithms from the literature. We show that our technique's performance is generally better in non-linear scenarios and, furthermore, is highly competitive even in simpler, linear scenarios.
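To make the distinctiveness criterion concrete, the sketch below computes the empirical mutual information between two label assignments over the same points. It is an illustration only, not the estimator used in the paper: the function name `mutual_information` and the toy label vectors are assumptions, and the quality term (mutual information between labels and continuous data observations) is not shown, since it would require a density estimate that the abstract does not specify.

```python
from collections import Counter
from math import log


def mutual_information(labels_a, labels_b):
    """Empirical mutual information (in nats) between two label sequences.

    Hypothetical helper for illustration; not the paper's estimator.
    """
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    n = len(labels_a)
    count_a = Counter(labels_a)                   # marginal counts of clustering A
    count_b = Counter(labels_b)                   # marginal counts of clustering B
    count_ab = Counter(zip(labels_a, labels_b))   # joint counts
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a, b) * log( p(a, b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


# Toy data (assumed): four points, two clusters each.
reference = [0, 0, 1, 1]
same_grouping = [1, 1, 0, 0]   # the reference clustering with labels swapped
alternative = [0, 1, 0, 1]     # splits every reference cluster evenly

print(mutual_information(reference, same_grouping))  # log(2): fully redundant
print(mutual_information(reference, alternative))    # 0.0: shares no information
```

A relabelled copy of the reference clustering shares the maximum possible information with it, whereas a clustering that cuts across every reference cluster evenly shares none. Under the criterion described in the abstract, the second candidate is the more distinctive alternative, provided it also scores well on the quality term.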