Algorithms for computing the length-constrained max-score segments with applications to DNA copy number data analysis

  • Authors:
  • Hsiao-Fei Liu;Peng-An Chen;Kun-Mao Chao

  • Affiliations:
  • Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan;Department of Computer Science and Information Engineering, National Taiwan University, Taipei, Taiwan;Dept. of Computer Sci. and Information Eng., National Taiwan Univ., Taipei, Taiwan and Graduate Institute of Biomedical Electronics and Bioinformatics and Graduate Institute of Networking and Mult ...

  • Venue:
  • ISAAC'07 Proceedings of the 18th international conference on Algorithms and computation
  • Year:
  • 2007

Abstract

Given a sequence of $n$ real numbers $A = (a_1, a_2, \ldots, a_n)$, two integers $L$ and $U$ with $1 \le L \le U \le n$, and a score function $f : \mathbb{R}^+ \times \mathbb{R} \to \mathbb{R}$, the LENGTH-CONSTRAINED MAX-SCORE SEGMENT PROBLEM is to find a segment $A[i, j] = (a_i, a_{i+1}, \ldots, a_j)$ maximizing $f(j - i + 1, \sum_{h=i}^{j} a_h)$ subject to $j - i + 1 \in [L, U]$. In this paper, we solve the LENGTH-CONSTRAINED MAX-SCORE SEGMENT PROBLEM for the case where the score function is $f(l, w) = w/\sqrt[r]{l}$ for any constant $r > 1$. Our algorithm runs in $O(n \cdot T(L^{1/2})/L^{1/2})$ time, where $T(n')$ is the time required to solve the all-pairs shortest paths problem on a graph of $n'$ nodes. By the latest result of Chan [7], $T(n') = O(n'^3 (\log\log n')^3/(\log n')^2)$, so our algorithm runs in subquadratic time $O(nL (\log\log L)^3/(\log L)^2)$. Lipson et al. [21] studied a more restricted case where the score function is $f(l, w) = w/\sqrt{l}$ and there are no length constraints, i.e., $L = 1$ and $U = n$. They also showed how to apply their algorithm to analyzing DNA copy number data. However, their algorithm takes $\Omega(n^2)$ time in the worst case. Since the length lower bound $L$ for the case considered by Lipson et al. is a constant, our algorithm solves it in $O(n)$ time.
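To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the algorithm of the paper: it simply enumerates every segment whose length lies in [L, U] and evaluates the score from prefix sums, which takes O(n(U - L + 1)) time. The names `max_score_segment_bruteforce` and `root_score` are hypothetical, chosen only for this illustration, and the returned indices are 0-based, whereas the abstract uses 1-based indices.

```python
from itertools import accumulate
from typing import Callable, Sequence, Tuple


def root_score(r: float) -> Callable[[int, float], float]:
    """Return the score function f(l, w) = w / l**(1/r) (hypothetical helper)."""
    return lambda l, w: w / l ** (1.0 / r)


def max_score_segment_bruteforce(
    a: Sequence[float],
    L: int,
    U: int,
    f: Callable[[int, float], float],
) -> Tuple[float, int, int]:
    """Brute-force baseline, NOT the paper's subquadratic algorithm.

    Returns (best_score, i, j) with 0-based inclusive indices i <= j
    such that L <= j - i + 1 <= U and f(length, sum) is maximized.
    """
    n = len(a)
    if not (1 <= L <= U <= n):
        raise ValueError("require 1 <= L <= U <= n")

    # prefix[k] holds a[0] + ... + a[k-1], so sum(a[i..j]) = prefix[j+1] - prefix[i].
    prefix = [0.0] + list(accumulate(a))

    best = None
    for i in range(n):
        # Only lengths that satisfy the constraint and fit inside the sequence.
        for length in range(L, min(U, n - i) + 1):
            j = i + length - 1
            score = f(length, prefix[j + 1] - prefix[i])
            if best is None or score > best[0]:
                best = (score, i, j)
    return best


if __name__ == "__main__":
    data = [1, -2, 3, 4, -1]
    # Unconstrained square-root case studied by Lipson et al.: L = 1, U = n, r = 2.
    print(max_score_segment_bruteforce(data, 1, len(data), root_score(2)))
```

Setting L = 1, U = n, and r = 2 reproduces the unconstrained square-root scoring described above; a baseline like this is mainly useful for checking a faster implementation on small inputs.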