A PTAS for Distinguishing (Sub)string Selection

  • Authors:
  • Xiaotie Deng;Guojun Li;Zimao Li;Bin Ma;Lusheng Wang

  • Venue:
  • ICALP '02 Proceedings of the 29th International Colloquium on Automata, Languages and Programming
  • Year:
  • 2002

Abstract

Consider two sets of strings, B (bad genes) and G (good genes), as well as two integers db and dg (db ≤ dg). A frequently occurring problem in computational biology (and other fields) is to find a (distinguishing) substring s of length L that distinguishes the bad strings from the good strings, i.e., for each string si ∈ B there exists a length-L substring ti of si with d(s, ti) ≤ db (close to bad strings), and for every substring ui of length L of every string gi ∈ G, d(s, ui) ≥ dg (far from good strings). We present a polynomial time approximation scheme to settle the problem, i.e., for any constant ε > 0, the algorithm finds a string s of length L such that for every si ∈ B, there is a length-L substring ti of si with d(ti, s) ≤ (1 + ε)db, and for every substring ui of length L of every gi ∈ G, d(ui, s) ≥ (1 - ε)dg, if a solution to the original pair (db ≤ dg) exists.
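
The two conditions in the abstract can be made concrete with a small checker. The sketch below is illustrative only and is not the paper's approximation scheme: it merely verifies whether a given candidate s satisfies the (relaxed) conditions. It assumes that d is the Hamming distance, which the abstract does not state, and the names hamming, windows, is_distinguishing, d_b, d_g and eps are hypothetical, chosen for this example (d_b and d_g stand for db and dg).

```python
from typing import Iterable, List


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions (assumed meaning of d)."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def windows(text: str, length: int) -> List[str]:
    """All substrings of the given length; empty if text is too short."""
    return [text[i:i + length] for i in range(len(text) - length + 1)]


def is_distinguishing(s: str, bad: Iterable[str], good: Iterable[str],
                      d_b: int, d_g: int, eps: float = 0.0) -> bool:
    """Check the conditions from the abstract for a candidate s.

    With eps = 0 this tests the exact conditions; with eps > 0 it tests
    the relaxed bounds (1 + eps) * d_b and (1 - eps) * d_g.
    """
    length = len(s)
    close_limit = (1 + eps) * d_b
    far_limit = (1 - eps) * d_g
    # Every bad string needs at least one window close to s.
    for b in bad:
        if not any(hamming(s, t) <= close_limit for t in windows(b, length)):
            return False
    # Every window of every good string must be far from s.
    for g in good:
        if any(hamming(s, u) < far_limit for u in windows(g, length)):
            return False
    return True
```

For instance, with s = "ACGT", bad strings "TTACGTTT" and "ACGAAAAA", and the good string "TTTTTTTT", the candidate passes for d_b = 1 and d_g = 3. It fails the exact test for d_g = 4 but passes the relaxed one with eps = 0.25, since the far bound then drops to 3.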