A note on the inapproximability of correlation clustering

Authors:
Jinsong Tan
Affiliations:
Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA
Venue:
Information Processing Letters
Year:
2008

Citing 6

Toward Efficient Agnostic Learning
Machine Learning - Special issue on computational learning theory, COLT'92

Cluster Graph Modification Problems
WG '02 Revised Papers from the 28th International Workshop on Graph-Theoretic Concepts in Computer Science

Correlation Clustering: maximizing agreements via semidefinite programming
SODA '04 Proceedings of the fifteenth annual ACM-SIAM symposium on Discrete algorithms

Correlation Clustering
Machine Learning

Aggregating inconsistent information: ranking and clustering
Proceedings of the thirty-seventh annual ACM symposium on Theory of computing

Clustering with qualitative information
Journal of Computer and System Sciences - Special issue: Learning theory 2003


Abstract

We consider the inapproximability of the correlation clustering problem, defined as follows: given a graph $G=(V,E)$ where each edge is labeled either "+" (similar) or "-" (dissimilar), correlation clustering seeks to partition the vertices into clusters so that the number of pairs correctly (resp., incorrectly) classified with respect to the labels is maximized (resp., minimized). The two complementary problems are called MaxAgree and MinDisagree, respectively, and have been studied on complete graphs, where every edge is labeled, and on general graphs, where some edges may be unlabeled. Natural edge-weighted versions of both problems have been studied as well. Let $S$-MaxAgree denote the weighted problem where all weights are taken from a set $S$. We show that $S$-MaxAgree with weights bounded by $O(|V|^{1/2-\delta})$ essentially belongs to the same hardness class in the following sense: if there is a polynomial-time algorithm that approximates $S$-MaxAgree within a factor of $\lambda = O(\log|V|)$ with high probability, then for any choice of $S'$, $S'$-MaxAgree can be approximated in polynomial time within a factor of $(\lambda+\epsilon)$, where $\epsilon > 0$ can be arbitrarily small, with high probability. A similar statement also holds for $S$-MinDisagree. This result implies that it is hard (assuming $NP \neq RP$) to approximate unweighted MaxAgree within a factor of $80/79-\epsilon$, improving upon the previously known factor of $116/115-\epsilon$ by Charikar et al. [M. Charikar, V. Guruswami, A. Wirth, Clustering with qualitative information, Journal of Computer and System Sciences 71 (2005) 360-383].
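To make the two objectives concrete, the following is a minimal illustrative sketch (not taken from the paper) of how agreements and disagreements could be counted for a given partition in the unweighted setting. The function name `count_agreements`, the edge representation as a dictionary keyed by vertex pairs, and the `cluster_of` mapping are all assumptions introduced here for illustration.

```python
def count_agreements(labels, cluster_of):
    """Count agreements and disagreements of a clustering.

    labels:     dict mapping an edge (u, v) to "+" or "-" (hypothetical format);
                unlabeled pairs are simply absent, as in the general-graph case.
    cluster_of: dict mapping each vertex to its cluster identifier.
    Returns (agreements, disagreements).
    """
    agree = 0
    disagree = 0
    for (u, v), sign in labels.items():
        same_cluster = cluster_of[u] == cluster_of[v]
        # A "+" edge agrees when its endpoints share a cluster;
        # a "-" edge agrees when its endpoints are separated.
        if (sign == "+") == same_cluster:
            agree += 1
        else:
            disagree += 1
    return agree, disagree


# A small inconsistent triangle: a~b and b~c are similar, a and c are dissimilar.
triangle = {("a", "b"): "+", ("b", "c"): "+", ("a", "c"): "-"}

one_cluster = {"a": 0, "b": 0, "c": 0}
split_c = {"a": 0, "b": 0, "c": 1}
singletons = {"a": 0, "b": 1, "c": 2}

print(count_agreements(triangle, one_cluster))
print(count_agreements(triangle, split_c))
print(count_agreements(triangle, singletons))
```

On this triangle no partition satisfies all three labels, so MaxAgree asks for a partition with the largest first component and MinDisagree for one with the smallest second component. The two counts always sum to the number of labeled edges, which is why the problems are complementary at optimality even though their approximability differs.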