A note on the inapproximability of correlation clustering

  • Authors:
  • Jinsong Tan

  • Affiliations:
  • Department of Computer and Information Sciences, University of Pennsylvania, Philadelphia, PA 19104, USA

  • Venue:
  • Information Processing Letters
  • Year:
  • 2008

Abstract

We consider the inapproximability of the correlation clustering problem, defined as follows: given a graph $G=(V,E)$ where each edge is labeled either "+" (similar) or "-" (dissimilar), correlation clustering seeks to partition the vertices into clusters so that the number of pairs correctly (resp., incorrectly) classified with respect to the labels is maximized (resp., minimized). The two complementary problems are called MaxAgree and MinDisagree, respectively, and have been studied on complete graphs, where every edge is labeled, and on general graphs, where some edges may be unlabeled. Natural edge-weighted versions of both problems have been studied as well. Let S-MaxAgree denote the weighted problem where all weights are taken from the set $S$. We show that S-MaxAgree with weights bounded by $O(|V|^{1/2-\delta})$ essentially belongs to the same hardness class in the following sense: if there is a polynomial-time algorithm that approximates S-MaxAgree within a factor of $\lambda = O(\log|V|)$ with high probability, then for any choice of $S'$, $S'$-MaxAgree can be approximated in polynomial time within a factor of $(\lambda+\epsilon)$, where $\epsilon > 0$ can be arbitrarily small, with high probability. A similar statement also holds for S-MinDisagree. This result implies that it is hard (assuming $\mathrm{NP} \neq \mathrm{RP}$) to approximate unweighted MaxAgree within a factor of $80/79-\epsilon$, improving upon the previously known factor of $116/115-\epsilon$ by Charikar et al. [M. Charikar, V. Guruswami, A. Wirth, Clustering with qualitative information, Journal of Computer and System Sciences 71 (2005) 360-383].
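
To make the two objectives concrete, the following is a minimal sketch of how agreements and disagreements could be counted for a given clustering in the unweighted setting. It is an illustration of the problem definition only, not an algorithm from the paper; the function name `count_agreements` and the dictionary-based representation of labels and clusters are assumptions made for this example.

```python
def count_agreements(labels, cluster_of):
    """Count agreements and disagreements of a clustering.

    labels:     dict mapping a vertex pair (u, v) to "+" or "-"
                (hypothetical representation; unlabeled pairs are omitted).
    cluster_of: dict mapping each vertex to its cluster identifier.

    A "+" edge agrees when both endpoints share a cluster;
    a "-" edge agrees when the endpoints lie in different clusters.
    """
    agree = disagree = 0
    for (u, v), sign in labels.items():
        same_cluster = cluster_of[u] == cluster_of[v]
        if (sign == "+") == same_cluster:
            agree += 1
        else:
            disagree += 1
    return agree, disagree


# A triangle whose labels cannot all be satisfied at once.
labels = {("a", "b"): "+", ("b", "c"): "+", ("a", "c"): "-"}

one_cluster = {"a": 0, "b": 0, "c": 0}
split = {"a": 0, "b": 0, "c": 1}
singletons = {"a": 0, "b": 1, "c": 2}

for name, clustering in [("one cluster", one_cluster),
                         ("split", split),
                         ("singletons", singletons)]:
    print(name, count_agreements(labels, clustering))
```

MaxAgree asks for a clustering that maximizes the first count, and MinDisagree for one that minimizes the second. On a fixed instance the two counts sum to the number of labeled edges, so the optimal clusterings coincide even though the approximation guarantees for the two problems differ.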