Clustering with relative constraints

Authors:
Eric Yi Liu; Zhaojun Zhang; Wei Wang
Affiliations:
University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
Venue:
Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2011

Abstract

Recent studies have suggested using relative distance comparisons as constraints to represent domain knowledge. A natural extension to relative comparisons is the combination of two comparisons defined on the same set of three instances. Constraints in this form, termed Relative Constraints, provide a unified knowledge representation for both partitional and hierarchical clusterings. But many key properties of relative constraints remain unknown. In this paper, we answer the following important questions that enable the broader application of relative constraints in general clustering problems:

• Feasibility: Does there exist a clustering that satisfies a given set of relative constraints? (consistency of constraints)
• Completeness: Given a set of consistent relative constraints, how can one derive a complete clustering without running into dead-ends?
• Informativeness: How can one extract the most informative relative constraints from given knowledge sources?

We show that any hierarchical domain knowledge can be easily represented by relative constraints. We further present a hierarchical algorithm that finds a clustering satisfying all given constraints in polynomial time. Experiments showed that our algorithm achieves significantly higher accuracy than the existing metric learning approach based on relative comparisons.
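To make the feasibility question concrete, here is a minimal sketch in Python. It rests on two assumptions that the abstract does not state. First, a relative constraint is read as a triple (a, b, c) meaning that a and b are closer to each other than either is to c, so a and b must merge in the hierarchy before c joins them. Second, the procedure is the classical recursive partitioning approach for rooted triplets, swapped in for illustration; it is not necessarily the algorithm the paper presents. The name build_hierarchy is hypothetical.

```python
from typing import Hashable, Iterable, Optional, Sequence, Tuple

# Assumed encoding: (a, b, c) means "a and b are closer to each other
# than either is to c". The three instances are assumed to be distinct.
Triplet = Tuple[Hashable, Hashable, Hashable]


def build_hierarchy(items: Sequence[Hashable],
                    constraints: Iterable[Triplet]) -> Optional[object]:
    """Return a nested-tuple hierarchy satisfying every constraint,
    or None if the constraints are inconsistent (infeasible)."""
    items = list(items)
    if len(items) == 1:
        return items[0]  # a single instance is a leaf

    # Keep only constraints whose three instances all lie in this subset.
    members = set(items)
    relevant = [t for t in constraints if all(x in members for x in t)]

    # Union-find: a and b of each constraint must share a child subtree.
    parent = {x: x for x in items}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, _ in relevant:
        parent[find(a)] = find(b)

    # Group instances by connected component.
    groups = {}
    for x in items:
        groups.setdefault(find(x), []).append(x)

    # One component over several instances means no valid top-level
    # split exists, so the constraint set is inconsistent.
    if len(groups) == 1:
        return None

    children = []
    for group in groups.values():
        subtree = build_hierarchy(group, relevant)
        if subtree is None:
            return None
        children.append(subtree)
    return tuple(children)


# A consistent set yields a hierarchy; a contradictory set yields None.
consistent = build_hierarchy("abcd", [("a", "b", "c"), ("c", "d", "a")])
inconsistent = build_hierarchy("abc", [("a", "b", "c"), ("b", "c", "a")])
```

In this sketch, a None result corresponds to an infeasible constraint set. Because every recursive call either fails or splits the instances into at least two groups, the procedure never has to backtrack, which is one way to picture deriving a clustering "without running into dead-ends".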