More Efficient Algorithms for Closest String and Substring Problems

  • Authors:
  • Bin Ma;Xiaoming Sun

  • Affiliations:
  • binma@uwaterloo.ca;xiaomings@tsinghua.edu.cn

  • Venue:
  • SIAM Journal on Computing
  • Year:
  • 2009

Quantified Score

Hi-index 0.00

Visualization

Abstract

The closest string problem and the closest substring problem are all natural theoretical computer science problems and find important applications in computational biology. Given $n$ input strings, the closest string (substring) problem finds a new string within distance $d$ to (a substring of) each input string and such that $d$ is minimized. Both problems are NP-complete. In this paper we propose new algorithms for these two problems. For the closest string problem, we developed an exact algorithm with time complexity $O(n|\Sigma|^{O(d)})$, where $\Sigma$ is the alphabet. This improves the previously best known result $O(nd^{O(d)})$ and results into a polynomial time algorithm when $d=O(\log n)$. By using this algorithm, a polynomial time approximation scheme (PTAS) for the closest string problem is also given with time complexity $O(n^{O(\epsilon^{-2})})$, improving the previously best known $O(n^{O(\epsilon^{-2}\log\frac{1}{\epsilon})})$ PTAS. A new algorithm for the closest substring problem is also proposed. Finally, we prove that a restricted version of the closest substring problem has the same parameterized complexity as the closest substring, answering an open question in the literature.