Clustering with Qualitative Information

Authors:
Moses Charikar;Venkatesan Guruswami;Anthony Wirth
Venue:
FOCS '03 Proceedings of the 44th Annual IEEE Symposium on Foundations of Computer Science
Year:
2003

Citing 0
Cited 38

Efficient location area planning for personal communication systems

Proceedings of the 9th annual international conference on Mobile computing and networking
Correlation Clustering: maximizing agreements via semidefinite programming

SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms
Correlation Clustering

Machine Learning
A New Conceptual Clustering Framework

Machine Learning
Clustering Aggregation

ICDE '05 Proceedings of the 21st International Conference on Data Engineering
Quadratic forms on graphs

Proceedings of the thirty-seventh annual ACM symposium on Theory of computing
Aggregating inconsistent information: ranking and clustering

Proceedings of the thirty-seventh annual ACM symposium on Theory of computing
A divide-and-merge methodology for clustering

Proceedings of the twenty-fourth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems
Fitting tree metrics: Hierarchical clustering and Phylogeny

FOCS '05 Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science
On the approximability of maximum and minimum edge clique partition problems

CATS '06 Proceedings of the 12th Computing: The Australasian Theory Symposium - Volume 51
Correlation clustering in general weighted graphs

Theoretical Computer Science - Approximation and online algorithms
A divide-and-merge methodology for clustering

ACM Transactions on Database Systems (TODS)
Clustering aggregation

ACM Transactions on Knowledge Discovery from Data (TKDD)
Efficient location area planning for personal communication systems

IEEE/ACM Transactions on Networking (TON)
The complexity of non-hierarchical clustering with instance and cluster level constraints

Data Mining and Knowledge Discovery
The multi-multiway cut problem

Theoretical Computer Science
How to rank with few errors

Proceedings of the thirty-ninth annual ACM symposium on Theory of computing
A rigorous analysis of population stratification with limited data

SODA '07 Proceedings of the eighteenth annual ACM-SIAM symposium on Discrete algorithms
A discriminative framework for clustering via similarity functions

STOC '08 Proceedings of the fortieth annual ACM symposium on Theory of computing
On the Approximation of Correlation Clustering and Consensus Clustering

Journal of Computer and System Sciences
Aggregating inconsistent information: Ranking and clustering

Journal of the ACM (JACM)
Non-negative matrix factorization for semi-supervised data clustering

Knowledge and Information Systems
Semi-supervised graph clustering: a kernel approach

Machine Learning
Steps toward managing lineage metadata in grid clusters

TAPP'09 First workshop on Theory and practice of provenance
Correlation Clustering Revisited: The "True" Cost of Error Minimization Problems

ICALP '09 Proceedings of the 36th International Colloquium on Automata, Languages and Programming: Part I
Deterministic Pivoting Algorithms for Constrained Ranking and Clustering Problems

Mathematics of Operations Research
Average correlation clustering algorithm (ACCA) for grouping of co-regulated genes with similar pattern of variation in their expression values

Journal of Biomedical Informatics
Improved consensus clustering via linear programming

ACSC '10 Proceedings of the Thirty-Third Australasian Conference on Computer Science - Volume 102
Towards a more discriminative and semantic visual vocabulary

Computer Vision and Image Understanding
Information Theoretic Measures for Clusterings Comparison: Variants, Properties, Normalization and Correction for Chance

The Journal of Machine Learning Research
Clustering causal relationships in genes expression data

WIRN'05 Proceedings of the 16th Italian conference on Neural Nets
Correlation clustering and consensus clustering

ISAAC'05 Proceedings of the 16th international conference on Algorithms and Computation
Approximating the best-fit tree under Lp norms

APPROX'05/RANDOM'05 Proceedings of the 8th international workshop on Approximation, Randomization and Combinatorial Optimization Problems, and Proceedings of the 9th international conference on Randomization and Computation: algorithms and techniques
Extending the tractability border for closest leaf powers

WG'05 Proceedings of the 31st international conference on Graph-Theoretic Concepts in Computer Science
Partitioning signed bipartite graphs for classification of individuals and organizations

SBP'12 Proceedings of the 5th international conference on Social Computing, Behavioral-Cultural Modeling and Prediction
Hedonic clustering games

Proceedings of the twenty-fourth annual ACM symposium on Parallelism in algorithms and architectures
On the complexity of Newman's community finding approach for biological and social networks

Journal of Computer and System Sciences
On the approximability of maximum and minimum edge clique partition problems

CATS '06 Proceedings of the Twelfth Computing: The Australasian Theory Symposium - Volume 51

Abstract

We consider the problem of clustering a collection of elements based on pairwise judgments of similarity and dissimilarity. Bansal, Blum and Chawla [1] cast the problem thus: given a graph G whose edges are labeled "+" (similar) or "-" (dissimilar), partition the vertices into clusters so that the number of pairs correctly (resp. incorrectly) classified with respect to the input labeling is maximized (resp. minimized). Complete graphs, where the classifier labels every edge, and general graphs, where some edges are not labeled, are both worth studying. We answer several questions left open in [1] and provide a sound overview of clustering with qualitative information. We give a factor 4 approximation for minimization on complete graphs, and a factor O(log n) approximation for general graphs. For the maximization version, a PTAS for complete graphs is shown in [1]; we give a factor 0.7664 approximation for general graphs, noting that a PTAS is unlikely by proving APX-hardness. We also prove the APX-hardness of minimization on complete graphs.
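To make the objective in the abstract concrete, the following minimal Python sketch counts the agreements and disagreements of a given clustering against a "+"/"-" edge labeling. It only evaluates the objective; it is not one of the approximation algorithms from the paper. The function name `count_agreements` and the data layout (a dict of labeled pairs and a dict of cluster ids) are assumptions chosen for illustration.

```python
def count_agreements(labels, clustering):
    """Score a clustering against qualitative edge labels.

    labels:     dict mapping a pair (u, v) to '+' (similar) or '-' (dissimilar).
                Pairs missing from the dict are unlabeled (the general-graph case).
    clustering: dict mapping each vertex to a cluster id.

    Returns (agreements, disagreements).
    """
    agreements = 0
    disagreements = 0
    for (u, v), sign in labels.items():
        same_cluster = clustering[u] == clustering[v]
        # A '+' edge agrees when its endpoints share a cluster;
        # a '-' edge agrees when its endpoints are separated.
        if (sign == '+') == same_cluster:
            agreements += 1
        else:
            disagreements += 1
    return agreements, disagreements


# Hypothetical instance: a triangle with inconsistent labels.
labels = {('a', 'b'): '+', ('b', 'c'): '+', ('a', 'c'): '-'}

one_cluster = {'a': 0, 'b': 0, 'c': 0}
split_c = {'a': 0, 'b': 0, 'c': 1}
singletons = {'a': 0, 'b': 1, 'c': 2}

print(count_agreements(labels, one_cluster))  # (2, 1)
print(count_agreements(labels, split_c))      # (2, 1)
print(count_agreements(labels, singletons))   # (1, 2)
```

In this triangle no clustering satisfies all three labels, so every partition incurs at least one disagreement. The maximization version seeks the largest first component and the minimization version the smallest second component.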