An Improved Approximation Algorithm for Scaffold Filling to Maximize the Common Adjacencies

  • Authors: Nan Liu; Haitao Jiang; Daming Zhu; Binhai Zhu

  • Affiliations: Shandong University, Jinan; Shandong University, Jinan; Shandong University, Jinan; Montana State University, Bozeman

  • Venue: IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
  • Year: 2013

Abstract

Scaffold filling is a new combinatorial optimization problem in genome sequencing. The one-sided scaffold filling problem can be described as follows: given an incomplete genome $(I)$ and a complete (reference) genome $(G)$, fill the missing genes into $(I)$ such that the number of common (string) adjacencies between the resulting genome $(I^{\prime })$ and $(G)$ is maximized. This problem is NP-complete for genomes with duplicated genes, and the best known approximation factor is 1.33, achieved by a greedy strategy. In this paper, we prove a better lower bound on the optimal solution and devise a new algorithm, based on the maximum matching method and a local improvement technique, which improves the approximation factor to 1.25. For genomes with gene repetitions, this is the only known NP-complete problem that admits an approximation with a small constant factor (less than 1.5).
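
To make the objective concrete, the following is a minimal Python sketch of the quantity being maximized, not the paper's 1.25-approximation algorithm. It assumes that an adjacency is an unordered pair of consecutive genes and that common adjacencies are counted as a multiset intersection; the abstract does not spell out these details. The function names (`adjacencies`, `common_adjacencies`, `brute_force_fill`) and the toy genomes are hypothetical, and the exhaustive search is exponential, so it is only meant for tiny instances.

```python
from collections import Counter


def adjacencies(genome):
    """Multiset of unordered pairs of consecutive genes (assumed definition)."""
    return Counter(tuple(sorted(pair)) for pair in zip(genome, genome[1:]))


def common_adjacencies(a, b):
    """Size of the multiset intersection of the two adjacency multisets."""
    return sum((adjacencies(a) & adjacencies(b)).values())


def brute_force_fill(incomplete, missing, reference):
    """Try every way of inserting the missing genes into the incomplete genome.

    Returns (best_score, best_filled_genome). This is exhaustive search for
    illustration only; it is not the approximation algorithm from the paper.
    """
    if not missing:
        return common_adjacencies(incomplete, reference), list(incomplete)
    gene, rest = missing[0], missing[1:]
    best = (-1, None)
    for pos in range(len(incomplete) + 1):
        # Insert the next missing gene at every possible position.
        candidate = incomplete[:pos] + [gene] + incomplete[pos:]
        result = brute_force_fill(candidate, rest, reference)
        if result[0] > best[0]:
            best = result
    return best


# Toy instance (hypothetical data): G has a duplicated gene 'a'.
G = ["a", "b", "c", "a", "d"]
I = ["a", "c", "d"]
missing = ["a", "b"]  # genes of G that are absent from I

score, filled = brute_force_fill(I, missing, G)
print(score, filled)
```

In this toy instance the reference genome itself can be recovered by inserting the two missing genes, so all four of its adjacencies are matched. On realistic inputs the search space grows too quickly for this approach, which is why approximation algorithms such as the one described in the abstract are of interest.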