An Optimal Algorithm for the Maximum-Density Segment Problem

  • Authors: Kai-min Chung; Hsueh-I Lu
  • Venue: SIAM Journal on Computing
  • Year: 2005

Abstract

We address a fundamental problem arising from the analysis of biomolecular sequences. The input consists of two numbers $w_{\min}$ and $w_{\max}$ and a sequence $S$ of $n$ number pairs $(a_i, w_i)$ with $w_i > 0$. Let segment $S(i,j)$ of $S$ be the consecutive subsequence of $S$ between indices $i$ and $j$. The density of $S(i,j)$ is $d(i,j) = (a_i + a_{i+1} + \cdots + a_j)/(w_i + w_{i+1} + \cdots + w_j)$. The maximum-density segment problem is to find a maximum-density segment over all segments $S(i,j)$ with $w_{\min} \leq w_i + w_{i+1} + \cdots + w_j \leq w_{\max}$. The best previously known algorithm for the problem, due to Goldwasser, Kao, and Lu [Proceedings of the Second International Workshop on Algorithms in Bioinformatics, R. Guigó and D. Gusfield, eds., Lecture Notes in Comput. Sci. 2452, Springer-Verlag, New York, 2002, pp. 157--171], runs in $O(n \log(w_{\max} - w_{\min} + 1))$ time. In the present paper, we solve the problem in $O(n)$ time. Our approach bypasses the complicated right-skew decomposition introduced by Lin, Jiang, and Chao [J. Comput. System Sci., 65 (2002), pp. 570--586]. As a result, our algorithm can process the input sequence in an online manner, which is an important feature for dealing with genome-scale sequences. Moreover, for a class of input sequences $S$ representable in $O(m)$ space, we show how to exploit the sparsity of $S$ and solve the maximum-density segment problem for $S$ in $O(m)$ time.
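
To make the problem statement concrete, the following is a minimal brute-force sketch in Python. It is not the linear-time algorithm of the paper, only a quadratic-time reference that checks every feasible segment using prefix sums. The function name `max_density_segment`, the 0-based inclusive indices, and the restriction to integer inputs (so that densities can be compared exactly with `Fraction`) are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import accumulate


def max_density_segment(pairs, w_min, w_max):
    """Brute-force reference for the maximum-density segment problem.

    pairs: list of (a, w) integer pairs with every w > 0 (assumed here).
    Returns (density, i, j) with 0-based inclusive indices, or None if
    no segment has total width in [w_min, w_max].
    """
    # Prefix sums let us read off any segment sum in O(1).
    a_prefix = [0] + list(accumulate(a for a, _ in pairs))
    w_prefix = [0] + list(accumulate(w for _, w in pairs))

    best = None
    n = len(pairs)
    for i in range(n):
        for j in range(i, n):
            width = w_prefix[j + 1] - w_prefix[i]
            if width > w_max:
                break  # widths only grow with j because every w > 0
            if width < w_min:
                continue
            density = Fraction(a_prefix[j + 1] - a_prefix[i], width)
            # Strict comparison keeps the first segment found on ties.
            if best is None or density > best[0]:
                best = (density, i, j)
    return best
```

For example, with `pairs = [(1, 1), (4, 1), (-2, 1), (5, 1), (3, 1)]`, `w_min = 2`, and `w_max = 3`, the feasible segment of highest density is the last two pairs, with density (5 + 3)/(1 + 1) = 4. A reference of this kind is mainly useful for checking a faster implementation on small inputs.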