Minimum common string partition problem: hardness and approximations

  • Authors:
  • Avraham Goldstein; Petr Kolman; Jie Zheng

  • Affiliations:
  • No Institute Given; Institute for Theoretical Computer Science, Charles University, Praha 1, Czech Republic; Department of Computer Science, University of California, Riverside, CA

  • Venue:
  • ISAAC'04 Proceedings of the 15th international conference on Algorithms and Computation
  • Year:
  • 2004

Abstract

String comparison is a fundamental problem in computer science, with applications in areas such as computational biology, text processing, and compression. In this paper we address the minimum common string partition problem, a string comparison problem with a tight connection to the problem of sorting by reversals with duplicates, a key problem in genome rearrangement. A partition of a string A is a sequence $\mathcal{P}=(P_{1},P_{2},\ldots,P_{m})$ of strings, called the blocks, whose concatenation is equal to A. Given a partition $\mathcal{P}$ of a string A and a partition $\mathcal{Q}$ of a string B, we say that the pair $\langle\mathcal{P},\mathcal{Q}\rangle$ is a common partition of A and B if $\mathcal{Q}$ is a permutation of $\mathcal{P}$. The minimum common string partition problem (MCSP) is to find a common partition of two strings A and B with the minimum number of blocks. The restricted version of MCSP, in which each letter occurs at most k times in each input string, is denoted by k-MCSP. In this paper, we show that 2-MCSP (and therefore MCSP) is NP-hard and, moreover, even APX-hard. We describe a 1.1037-approximation for 2-MCSP and a linear time 4-approximation algorithm for 3-MCSP. We are not aware of any better approximations.
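
The definitions above can be made concrete with a small sketch. The Python code below is not the approximation algorithm from the paper; it is an illustrative exhaustive search, exponential in the string length and suitable only for tiny inputs. The function names `is_common_partition`, `partitions_with_blocks`, and `brute_force_mcsp` are hypothetical and chosen for this example; blocks are assumed to be non-empty.

```python
from collections import Counter
from itertools import combinations


def is_common_partition(p, q, a, b):
    """Return True if p partitions a, q partitions b, and q is a permutation of p."""
    return (
        all(p) and all(q)               # assumption: blocks are non-empty
        and "".join(p) == a             # concatenation of p equals a
        and "".join(q) == b             # concatenation of q equals b
        and Counter(p) == Counter(q)    # same multiset of blocks
    )


def partitions_with_blocks(s, m):
    """Yield every way to cut s into exactly m non-empty blocks."""
    for cuts in combinations(range(1, len(s)), m - 1):
        bounds = (0, *cuts, len(s))
        yield [s[i:j] for i, j in zip(bounds, bounds[1:])]


def brute_force_mcsp(a, b):
    """Exhaustive search for a common partition with the fewest blocks.

    Illustrative only: runs in exponential time. Returns a pair (p, q),
    or None if a and b do not contain the same letters with the same counts.
    """
    if Counter(a) != Counter(b):
        return None  # no common partition can exist
    for m in range(1, len(a) + 1):      # try block counts in increasing order
        for p in partitions_with_blocks(a, m):
            target = Counter(p)
            for q in partitions_with_blocks(b, m):
                if Counter(q) == target:
                    return p, q
    return None
```

For instance, with A = "abcab" and B = "ababc", the search finds the two-block common partition ("abc", "ab") of A and ("ab", "abc") of B. Both strings are instances of 2-MCSP, since each letter occurs at most twice.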