Explicit inapproximability bounds for the shortest superstring problem

  • Authors: Virginia Vassilevska
  • Affiliations: Carnegie Mellon University, Pittsburgh, PA
  • Venue: MFCS'05 Proceedings of the 30th international conference on Mathematical Foundations of Computer Science
  • Year: 2005

Abstract

Given a set of strings S = {s_1, ..., s_n}, the Shortest Superstring problem asks for the shortest string s that contains each s_i as a substring. We consider two measures of success in this problem: the length measure, which is the length of s, and the compression measure, which is the difference between the sum of the lengths of the s_i and the length of s. Both the length and the compression versions of the problem are known to be MAX-SNP-hard. The only explicit lower bounds on the approximation ratio are due to Ott: 1.000057 for the length measure and 1.000089 for the compression measure. Using a natural construction, we improve these lower bounds to 1.00082 for the length measure and 1.00093 for the compression measure. Our lower bounds hold even for instances in which the strings are over a binary alphabet and have equal lengths. In fact, we show the somewhat surprising result that the Shortest Superstring problem (with respect to both measures) is as hard to approximate on instances over a binary alphabet as it is over any alphabet.
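
To make the two measures in the abstract concrete, the following is a minimal Python sketch that is not taken from the paper. It finds a shortest superstring by brute force over all orderings (exponential time, so suitable only for tiny instances) and then evaluates the length and compression measures. The function names `overlap`, `shortest_superstring`, `length_measure`, and `compression_measure` are hypothetical, chosen here for illustration. The sketch relies on the standard fact that, once strings contained in other strings are discarded, some ordering merged with maximal consecutive overlaps yields an optimal superstring.

```python
from itertools import permutations


def overlap(a, b):
    """Length of the longest proper suffix of a that is also a proper prefix of b."""
    for k in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def shortest_superstring(strings):
    """Brute-force shortest superstring; exponential, for tiny inputs only."""
    # Drop duplicates and any string that is already contained in another one.
    uniq = list(dict.fromkeys(strings))
    core = [s for s in uniq if not any(s != t and s in t for t in uniq)]
    if not core:
        return ""
    best = None
    for order in permutations(core):
        merged = order[0]
        # Glue consecutive strings together using their maximal overlap.
        for prev, nxt in zip(order, order[1:]):
            merged += nxt[overlap(prev, nxt):]
        if best is None or len(merged) < len(best):
            best = merged
    return best


def length_measure(superstring):
    """Length measure: the length of the superstring s."""
    return len(superstring)


def compression_measure(strings, superstring):
    """Compression measure: sum of the lengths of the s_i minus the length of s."""
    return sum(len(s) for s in strings) - len(superstring)


if __name__ == "__main__":
    # Equal-length strings over a binary alphabet, as in the restricted instances.
    S = ["001", "010", "100"]
    s = shortest_superstring(S)
    print(s, length_measure(s), compression_measure(S, s))
```

On this small instance the superstring has length 5 and the compression is 9 - 5 = 4. The two measures always sum to the total input length, so an optimal solution for one is optimal for the other, although their approximation ratios differ.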