Unifying dependent clustering and disparate clustering for non-homogeneous data

Authors:
M. Shahriar Hossain;Satish Tadepalli;Layne T. Watson;Ian Davidson;Richard F. Helm;Naren Ramakrishnan
Affiliations:
Virginia Tech, Blacksburg, VA, USA;Virginia Tech, Blacksburg, VA, USA;Virginia Tech, Blacksburg, VA, USA;UC Davis, Davis, CA, USA;Virginia Tech, Blacksburg, VA, USA;Virginia Tech, Blacksburg, VA, USA
Venue:
Proceedings of the 16th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2010

References (12):
Lancelot: A FORTRAN Package for Large-Scale Nonlinear Optimization (Release A)
Constrained K-means Clustering with Background Knowledge. ICML '01 Proceedings of the Eighteenth International Conference on Machine Learning
Multivariate Information Bottleneck. UAI '01 Proceedings of the 17th Conference in Uncertainty in Artificial Intelligence
Information-theoretic co-clustering. Proceedings of the ninth ACM SIGKDD international conference on Knowledge discovery and data mining
Fully automatic cross-associations. Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining
Integrating constraints and metric learning in semi-supervised clustering. ICML '04 Proceedings of the twenty-first international conference on Machine learning
Associative Clustering for Exploring Dependencies between Functional Genomics Data Sets. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Unsupervised learning on k-partite graphs. Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining
COALA: A Novel Approach for the Extraction of an Alternate Clustering of High Quality and High Dissimilarity. ICDM '06 Proceedings of the Sixth International Conference on Data Mining
Clustering with Bregman Divergences. The Journal of Machine Learning Research
Simultaneous Unsupervised Learning of Disparate Clusterings. Statistical Analysis and Data Mining
A principled and flexible framework for finding alternative clusterings. Proceedings of the 15th ACM SIGKDD international conference on Knowledge discovery and data mining

Cited by (4):
Coordinated clustering algorithms to support charging infrastructure design for electric vehicles. Proceedings of the ACM SIGKDD International Workshop on Urban Computing
How to "alternatize" a clustering algorithm. Data Mining and Knowledge Discovery
Adaptive evolutionary clustering. Data Mining and Knowledge Discovery
Ensembles for unsupervised outlier detection: challenges and research questions a position paper. ACM SIGKDD Explorations Newsletter

Abstract

Modern data mining settings involve a combination of attribute-valued descriptors over entities and specified relationships between these entities. We present an approach to clustering such non-homogeneous datasets that uses the relationships to impose either dependent clustering or disparate clustering constraints. Unlike prior work that treats constraints as Boolean criteria, our formulation allows constraints to be satisfied or violated in a smooth manner. This lets us achieve dependent clustering and disparate clustering within the same optimization framework, merely by maximizing versus minimizing the objective function. We present results on synthetic data and on several real-world datasets.
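
The abstract does not spell out the objective, so the following is only an illustrative sketch of how a single smooth score could serve both goals. It assumes soft cluster memberships (a softmax over negative squared distances to prototypes), a soft contingency table accumulated over the related entity pairs, and a score that measures how far the table's rows and columns are from uniform. The function names, the `sharpness` parameter, and the toy data are all hypothetical and are not taken from the paper. Under these assumptions, a high score means the two clusterings are dependent and a score near zero means they are disparate.

```python
import math

def soft_memberships(points, prototypes, sharpness=5.0):
    """Smooth cluster assignment (assumed form): softmax over negative
    squared distances to each prototype. Each returned row sums to 1."""
    rows = []
    for p in points:
        d = [sum((a - b) ** 2 for a, b in zip(p, m)) for m in prototypes]
        dmin = min(d)  # subtract the minimum for numerical stability
        w = [math.exp(-sharpness * (di - dmin)) for di in d]
        total = sum(w)
        rows.append([wi / total for wi in w])
    return rows

def contingency(mem_a, mem_b, relations):
    """Soft contingency table: entry (i, j) accumulates, over every related
    pair (s, t), the probability that s is in cluster i of domain A and
    t is in cluster j of domain B."""
    ka, kb = len(mem_a[0]), len(mem_b[0])
    table = [[0.0] * kb for _ in range(ka)]
    for s, t in relations:
        for i in range(ka):
            for j in range(kb):
                table[i][j] += mem_a[s][i] * mem_b[t][j]
    return table

def kl_from_uniform(vec):
    """KL divergence of the normalized vector from the uniform distribution."""
    total = sum(vec)
    if total <= 0:
        return 0.0
    k = len(vec)
    return sum((v / total) * math.log((v / total) * k) for v in vec if v > 0)

def nonuniformity(table):
    """Average deviation of rows and columns from uniform. Maximizing it
    favors dependent clusterings; minimizing it favors disparate ones."""
    rows = [kl_from_uniform(r) for r in table]
    cols = [kl_from_uniform(list(c)) for c in zip(*table)]
    return 0.5 * (sum(rows) / len(rows) + sum(cols) / len(cols))

# Toy data (hypothetical): the same four entities described in two domains,
# related one-to-one.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
relations = [(i, i) for i in range(len(points))]

mem_a = soft_memberships(points, [(0.0, 0.5), (1.0, 0.5)])       # split on x
mem_b_same = soft_memberships(points, [(0.0, 0.5), (1.0, 0.5)])  # split on x
mem_b_other = soft_memberships(points, [(0.5, 0.0), (0.5, 1.0)]) # split on y

aligned = nonuniformity(contingency(mem_a, mem_b_same, relations))
disparate = nonuniformity(contingency(mem_a, mem_b_other, relations))
print("aligned prototypes:", aligned)
print("orthogonal prototypes:", disparate)
```

Because the memberships are probabilities rather than hard labels, the score changes continuously as the prototypes move, which is what would let a generic optimizer push it up or down. The toy comparison only evaluates two fixed prototype placements; it does not perform any optimization.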