Efficient algorithms for model-based motif discovery from multiple sequences

  • Authors:
  • Bin Fu;Ming-Yang Kao;Lusheng Wang

  • Affiliations:
  • Dept. of Computer Science, University of Texas-Pan American, TX;Department of Electrical Engineering and Computer Science, Northwestern University, Evanston, IL;Department of Computer Science, The City University of Hong Kong, Kowloon, Hong Kong

  • Venue:
  • TAMC'08 Proceedings of the 5th international conference on Theory and applications of models of computation
  • Year:
  • 2008

Abstract

We study a natural probabilistic model for motif discovery that has been used to experimentally test the quality of motif discovery programs. In this model, there are k background sequences, and each character in a background sequence is a random character from an alphabet Σ. A motif G = g₁g₂...gₘ is a string of m characters. A randomly generated approximate copy of G is implanted into each background sequence. In an approximate copy b₁b₂...bₘ of G, each character bᵢ is generated randomly so that the probability that bᵢ ≠ gᵢ is at most α. In this paper, we give the first analytical proof that multiple background sequences do help for finding subtle and faint motifs.
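
The generative model described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, only a way to produce test inputs of the kind the model describes. Several details are assumptions the abstract does not state: background characters are drawn uniformly from Σ, every sequence has the same length n, the implant position is chosen uniformly at random, the copy overwrites background characters rather than being inserted, and each character mutates with probability exactly α (which satisfies the "at most α" condition). The names `approximate_copy` and `planted_sequences` are hypothetical.

```python
import random


def approximate_copy(motif, alpha, alphabet, rng):
    """Return a copy of motif in which each character differs with probability alpha.

    Assumption: a mutated character is drawn uniformly from the other characters.
    """
    copy = []
    for g in motif:
        if rng.random() < alpha:
            copy.append(rng.choice([c for c in alphabet if c != g]))
        else:
            copy.append(g)
    return "".join(copy)


def planted_sequences(motif, k, n, alpha, alphabet="ACGT", seed=0):
    """Generate k background sequences of length n, each with one implanted copy.

    Returns a list of (sequence, implant_position) pairs.
    Assumptions: uniform background, uniform implant position, overwrite implant.
    """
    m = len(motif)
    if n < m:
        raise ValueError("sequence length n must be at least the motif length")
    rng = random.Random(seed)
    result = []
    for _ in range(k):
        background = [rng.choice(alphabet) for _ in range(n)]
        start = rng.randrange(n - m + 1)
        # Overwrite m background characters with the approximate copy.
        background[start:start + m] = approximate_copy(motif, alpha, alphabet, rng)
        result.append(("".join(background), start))
    return result


if __name__ == "__main__":
    for seq, pos in planted_sequences("ACGTACGT", k=3, n=40, alpha=0.2):
        print(pos, seq)
```

With `alpha=0` every sequence contains an exact copy of the motif at the returned position. Raising `alpha` makes the implanted copies fainter, which is the regime where, according to the abstract, having more background sequences helps.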