Inapproximability of maximal strip recovery

  • Authors: Minghui Jiang
  • Venue: Theoretical Computer Science
  • Year: 2011

Abstract

In comparative genomics, the first step of sequence analysis is usually to decompose two or more genomes into syntenic blocks that are segments of homologous chromosomes. For the reliable recovery of syntenic blocks, noise and ambiguities in the genomic maps need to be removed first. Maximal Strip Recovery (MSR) is an optimization problem proposed by Zheng, Zhu, and Sankoff for reliably recovering syntenic blocks from genomic maps in the midst of noise and ambiguities. Given d genomic maps as sequences of gene markers, the objective of MSR-d is to find d subsequences, one subsequence of each genomic map, such that the total length of syntenic blocks in these subsequences is maximized. For any constant d ≥ 2, a polynomial-time 2d-approximation for MSR-d was previously known. In this paper, we show that for any d ≥ 2, MSR-d is APX-hard, even for the most basic version of the problem in which all gene markers are distinct and appear in positive orientation in each genomic map. Moreover, we provide the first explicit lower bounds on approximating MSR-d for all d ≥ 2. In particular, we show that MSR-d is NP-hard to approximate within Ω(d/log d). From the other direction, we show that the previous 2d-approximation for MSR-d can be optimized into a polynomial-time algorithm even if d is not a constant but is part of the input. We then extend our inapproximability results to several related problems including CMSR-d, δ-gap-MSR-d, and δ-gap-CMSR-d.
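
To make the MSR-d objective concrete, the following is a minimal brute-force sketch for tiny instances of the basic version (all markers distinct and in positive orientation). It is an illustration only, not the paper's approximation algorithm or reduction. Two points are assumptions that the abstract does not spell out: a syntenic block (strip) is taken to be a run of at least two markers that appear consecutively and in the same order in every map, and a selection of markers counts as feasible only if every retained marker lies in such a strip. The function names `strip_lengths` and `msr_bruteforce` are hypothetical.

```python
from itertools import combinations


def strip_lengths(maps):
    """Lengths of the maximal blocks of maps[0] that also appear
    consecutively, in the same order, in every other map."""
    first = maps[0]
    if not first:
        return []
    # Position of each marker in every map other than the first.
    positions = [{m: i for i, m in enumerate(g)} for g in maps[1:]]
    lengths, run = [], 1
    for x, y in zip(first, first[1:]):
        if all(pos[y] == pos[x] + 1 for pos in positions):
            run += 1          # y extends the current block in every map
        else:
            lengths.append(run)
            run = 1           # y starts a new block
    lengths.append(run)
    return lengths


def msr_bruteforce(maps):
    """Exhaustive search over marker subsets (exponential time).

    Assumes every map is a permutation of the same distinct markers.
    Returns (total strip length, list of d subsequences).
    """
    markers = set(maps[0])
    assert all(set(g) == markers and len(g) == len(markers) for g in maps)
    # Try the largest subsets first; a strip needs at least two markers.
    for k in range(len(markers), 1, -1):
        for kept in combinations(maps[0], k):
            keep = set(kept)
            subs = [[m for m in g if m in keep] for g in maps]
            # Assumed feasibility rule: no retained marker is isolated.
            if all(n >= 2 for n in strip_lengths(subs)):
                return k, subs
    return 0, [[] for _ in maps]


if __name__ == "__main__":
    # Deleting marker 4 leaves the single strip 1 2 3 5 in both maps.
    print(msr_bruteforce([[1, 2, 3, 4, 5], [1, 4, 2, 3, 5]]))
```

The search enumerates every subset of markers, so it is only usable on toy inputs; the hardness results stated in the abstract are the reason polynomial-time approaches settle for approximation.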