Improved approximation algorithms for bipartite correlation clustering

Authors:
Nir Ailon;Noa Avigdor-Elgrabli;Edo Liberty;Anke Van Zuylen
Affiliations:
Technion, Haifa, Israel;Technion, Haifa, Israel;Yahoo! Research, Haifa, Israel;Max-Planck Institut für Informatik, Saarbrücken, Germany
Venue:
ESA'11 Proceedings of the 19th European conference on Algorithms
Year:
2011

References:

Bipartite graph partitioning and data clustering. Proceedings of the tenth international conference on Information and knowledge management.
Biclustering of Expression Data. Proceedings of the Eighth International Conference on Intelligent Systems for Molecular Biology.
Correlation Clustering. Machine Learning.
Solving cluster ensemble problems by bipartite graph partitioning. ICML '04 Proceedings of the twenty-first international conference on Machine learning.
Biclustering Algorithms for Biological Data Analysis: A Survey. IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB).
Correlation clustering with a fixed number of clusters. SODA '06 Proceedings of the seventeenth annual ACM-SIAM symposium on Discrete algorithm.
Clustering with qualitative information. Journal of Computer and System Sciences - Special issue: Learning theory 2003.
Correlation clustering in general weighted graphs. Theoretical Computer Science - Approximation and online algorithms.
Aggregating inconsistent information: Ranking and clustering. Journal of the ACM (JACM).
Correlation Clustering Revisited: The "True" Cost of Error Minimization Problems. ICALP '09 Proceedings of the 36th International Colloquium on Automata, Languages and Programming: Part I.
Deterministic Pivoting Algorithms for Constrained Ranking and Clustering Problems. Mathematics of Operations Research.
Improved algorithms for bicluster editing. TAMC'08 Proceedings of the 5th international conference on Theory and applications of models of computation.

Abstract

In this work we study Bipartite Correlation Clustering (BCC), a natural bipartite counterpart of the well-studied Correlation Clustering (CC) problem. Given a bipartite graph, the objective of BCC is to find a set of vertex-disjoint bi-cliques (clusters) whose edge set minimizes the symmetric difference with the edge set of the input graph. The best previously known approximation algorithm for BCC, due to Amit (2004), guarantees an 11-approximation ratio. In this paper we present two algorithms. The first is an improved 4-approximation algorithm. However, like the previous approximation algorithm, it requires solving a large convex problem, which becomes prohibitive even for modestly sized tasks. The second algorithm, and our main contribution, is a simple randomized combinatorial algorithm. It also achieves an expected 4-approximation factor, is trivial to implement, and is highly scalable. The analysis extends a method developed by Ailon, Charikar and Newman in 2008, in which a randomized pivoting algorithm was analyzed to obtain a 3-approximation algorithm for CC. Analyzing our algorithm for BCC requires considerably more sophisticated arguments in order to take advantage of the bipartite structure. Whether the 4-approximation factor can be achieved (or beaten) by a scalable deterministic algorithm remains an open problem.
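
To make the BCC objective concrete, the following is a minimal sketch that counts the symmetric difference between a bipartite graph and a candidate clustering. It illustrates only the cost being minimized, not either of the paper's algorithms. The function name `bcc_cost`, the `cluster_of` mapping, and the toy graph are assumptions introduced here for illustration.

```python
from itertools import product


def bcc_cost(left, right, edges, cluster_of):
    """Count disagreements between a bipartite graph and a clustering.

    A left/right pair disagrees if it is an edge whose endpoints lie in
    different clusters, or a non-edge whose endpoints share a cluster.
    The total is the size of the symmetric difference between the input
    edge set and the edge set of the bi-cliques induced by `cluster_of`.
    (Hypothetical helper, not taken from the paper.)
    """
    edge_set = set(edges)
    cost = 0
    for u, v in product(left, right):
        same_cluster = cluster_of[u] == cluster_of[v]
        has_edge = (u, v) in edge_set
        if same_cluster != has_edge:
            cost += 1
    return cost


# Toy instance (assumed data): three left vertices, two right vertices.
left = ["a1", "a2", "a3"]
right = ["b1", "b2"]
edges = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a3", "b2")]

# Candidate clustering: {a1, a2, b1} and {a3, b2}.
clustering = {"a1": 0, "a2": 0, "b1": 0, "a3": 1, "b2": 1}

# Only the edge (a1, b2) crosses clusters, so the cost is 1.
print(bcc_cost(left, right, edges, clustering))
```

The sketch enumerates every left/right pair, so it takes time proportional to the product of the two side sizes; that is adequate for checking small instances but is not meant to be scalable.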