An efficient motif discovery algorithm with unknown motif length and number of binding sites

  • Authors:
  • Henry C. M. Leung; Francis Y. L. Chin

  • Affiliations:
  • Department of Computer Science, The University of Hong Kong, Hong Kong, China; Department of Computer Science, The University of Hong Kong, Hong Kong, China

  • Venue:
  • International Journal of Data Mining and Bioinformatics
  • Year:
  • 2006

Abstract

Most algorithms for discovering motifs in DNA sequences require the motif's length as input. Styczynski et al. introduced the Extended (l,d)-Motif Problem (EMP), in which the motif's length is not an input parameter. Unfortunately, their algorithm takes an unacceptably long time to run, e.g. over 3 months to discover a length-14 motif. Since the best motif is not necessarily the longest one, nor the one with the largest number of binding sites, in this paper we eliminate a further input parameter, the minimum number of binding sites, in order to provide more realistic and robust results. We also develop an efficient algorithm that solves both EMP and this redefined problem.
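
For readers unfamiliar with the notation, the following minimal Python sketch illustrates the classical (l,d)-motif setting as it is commonly stated: a length-l string counts as a motif if every input sequence contains a substring within Hamming distance d of it. This is not the authors' algorithm, and it does not implement EMP. The function names (`hamming`, `occurs_within`, `brute_force_motifs`) and the toy sequences are hypothetical, chosen only for illustration. The exhaustive search enumerates all 4^l candidates, so it is usable only for very small l.

```python
from itertools import product


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def occurs_within(motif, seq, d):
    """True if some substring of seq of len(motif) has at most d mismatches."""
    l = len(motif)
    return any(hamming(motif, seq[i:i + l]) <= d
               for i in range(len(seq) - l + 1))


def brute_force_motifs(seqs, l, d, alphabet="ACGT"):
    """All length-l strings with a d-mismatch occurrence in every sequence.

    Illustrative only: enumerates len(alphabet)**l candidates.
    """
    return ["".join(p) for p in product(alphabet, repeat=l)
            if all(occurs_within("".join(p), s, d) for s in seqs)]


if __name__ == "__main__":
    # Toy input: each sequence holds a variant of ACGT with at most 1 mismatch.
    seqs = ["TTACGTTT", "GGACCTGG", "CCAGGTCC"]
    print("ACGT" in brute_force_motifs(seqs, l=4, d=1))
```

In this fixed-length setting both l and d must be supplied by the user. The abstract's point is that l, and in the redefined problem also the minimum number of binding sites, should not have to be known in advance.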