Chromatic correlation clustering

Authors:
Francesco Bonchi;Aristides Gionis;Francesco Gullo;Antti Ukkonen
Affiliations:
Yahoo! Research, Barcelona, Spain;Yahoo! Research, Barcelona, Spain;Yahoo! Research, Barcelona, Spain;Yahoo! Research, Barcelona, Spain
Venue:
Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining
Year:
2012

Abstract

We study a novel clustering problem in which the pairwise relations between objects are categorical. This problem can be viewed as clustering the vertices of a graph whose edges are of different types (colors). We introduce an objective function that aims at partitioning the graph such that the edges within each cluster have, as much as possible, the same color. We show that the problem is NP-hard and propose a randomized algorithm with approximation guarantee proportional to the maximum degree of the input graph. The algorithm iteratively picks a random edge as pivot, builds a cluster around it, and removes the cluster from the graph. Although fast, easy to implement, and parameter-free, this algorithm tends to produce a relatively large number of clusters. To overcome this issue we introduce a variant algorithm, which modifies how the pivot is chosen and how the cluster is built around the pivot. Finally, to address the case where a fixed number of output clusters is required, we devise a third algorithm that directly optimizes the objective function via a strategy based on the alternating minimization paradigm. We test our algorithms on synthetic and real data from the domains of protein-interaction networks, social media, and bibliometrics. Experimental evidence shows that our algorithms outperform a baseline algorithm both in the task of reconstructing a ground-truth clustering and in terms of objective function value.
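
The random-edge pivot loop described in the abstract can be sketched as follows. The abstract does not say how a cluster is grown around the pivot or how the objective is counted, so both are assumptions here: a cluster is taken to be the pivot's two endpoints plus every remaining vertex joined to both endpoints by an edge of the pivot's color, and the cost is taken to be the number of intra-cluster pairs that lack an edge of the cluster's color plus the number of edges that cross clusters. The names `pair`, `make_edges`, `random_edge_pivot_clustering`, and `chromatic_cost` are hypothetical and chosen only for this illustration.

```python
import random
from itertools import combinations


def pair(u, v):
    """Canonical key for an undirected vertex pair."""
    return (u, v) if u <= v else (v, u)


def make_edges(triples):
    """Build {pair: color} from (u, v, color) triples."""
    return {pair(u, v): c for u, v, c in triples}


def random_edge_pivot_clustering(vertices, edge_color, seed=None):
    """Pick a random edge, build a cluster around it, remove it, repeat.

    Returns a list of (members, color) tuples; leftover vertices become
    singleton clusters with color None.
    """
    rng = random.Random(seed)
    remaining = set(vertices)
    clusters = []
    while True:
        # Edges whose endpoints are both still unclustered.
        live = sorted(e for e in edge_color
                      if e[0] in remaining and e[1] in remaining)
        if not live:
            break
        u, v = rng.choice(live)
        c = edge_color[(u, v)]
        # ASSUMED growth rule: add w if both (u, w) and (v, w) have color c.
        cluster = {u, v} | {
            w for w in remaining
            if w not in (u, v)
            and edge_color.get(pair(u, w)) == c
            and edge_color.get(pair(v, w)) == c
        }
        clusters.append((cluster, c))
        remaining -= cluster
    clusters.extend(({w}, None) for w in sorted(remaining))
    return clusters


def chromatic_cost(clusters, edge_color):
    """ASSUMED objective: intra-cluster color mismatches + cut edges."""
    cluster_of = {x: i for i, (members, _) in enumerate(clusters)
                  for x in members}
    cost = 0
    for members, color in clusters:
        for u, v in combinations(sorted(members), 2):
            # A missing edge also counts as a mismatch.
            if edge_color.get(pair(u, v)) != color:
                cost += 1
    for u, v in edge_color:
        if cluster_of[u] != cluster_of[v]:
            cost += 1
    return cost


if __name__ == "__main__":
    # Toy graph: a red triangle and a blue triangle joined by one green edge.
    edges = make_edges([
        ("a", "b", "red"), ("a", "c", "red"), ("b", "c", "red"),
        ("d", "e", "blue"), ("d", "f", "blue"), ("e", "f", "blue"),
        ("c", "d", "green"),
    ])
    result = random_edge_pivot_clustering("abcdef", edges, seed=0)
    print(result, chromatic_cost(result, edges))
```

On this toy graph the outcome depends on the first pivot: a red or blue pivot recovers the two triangles, while the green pivot splits the graph into three two-vertex clusters. That sensitivity is consistent with the abstract's remark that the basic algorithm tends to produce a relatively large number of clusters.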