Assessing the significance of sets of words

  • Authors:
  • Valentina Boeva;Julien Clément;Mireille Régnier;Mathias Vandenbogaert

  • Affiliations:
  • Moscow State University, Vorob'evy Gory, Russia;Igm, Université de Marne-la-Vallée, France;Inria, Le Chesnay, France;Biozentrum, Basel Universitat, Switzerland

  • Venue:
  • CPM'05 Proceedings of the 16th annual conference on Combinatorial Pattern Matching
  • Year:
  • 2005

Abstract

Various criteria have been defined to evaluate the significance of sets of words, but computing them is often difficult. We provide explicit expressions for the waiting time in this context and, in order to assess the significance of a cluster of potential binding sites, extend them to the co-occurrence problem. We point out that the values of these criteria depend on a few fundamental parameters, and we provide efficient algorithms to compute them that rely on a combinatorial interpretation of the formulae. We show that our results are very tight in the so-called twilight zone and improve on previous rough approximations. The text is assumed to be generated by a stationary Markov process. These results are developed for an extended model of consensus.
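
To give a concrete feel for what a waiting-time expression looks like, the sketch below computes the expected waiting time of a single word in the simplest setting. It is not the method of the paper, which treats sets of words under a Markov model; it is the classical single-word formula for independent, identically distributed letters, in which each prefix of the word that is also a suffix contributes the inverse of that prefix's probability. The function name `expected_waiting_time` and the dictionary `uniform_dna` are hypothetical names chosen for this illustration.

```python
from fractions import Fraction


def expected_waiting_time(word, probs):
    """Expected position at which `word` first ends in an i.i.d. random text.

    Assumption: letters are independent with probabilities `probs`
    (a Bernoulli model, simpler than the Markov model of the abstract).
    Classical result: sum, over every length k for which the length-k
    prefix of `word` equals its length-k suffix, of 1 / P(prefix).
    """
    m = len(word)
    total = Fraction(0)
    for k in range(1, m + 1):
        # The word overlaps itself when its length-k prefix equals its length-k suffix.
        if word[:k] == word[m - k:]:
            prefix_prob = Fraction(1)
            for letter in word[:k]:
                prefix_prob *= probs[letter]
            total += 1 / prefix_prob
    return total


# Hypothetical example alphabet: uniform DNA letters.
uniform_dna = {letter: Fraction(1, 4) for letter in "ACGT"}

print(expected_waiting_time("ACGT", uniform_dna))  # no self-overlap
print(expected_waiting_time("AAAA", uniform_dna))  # maximal self-overlap
```

Under these assumptions, a word that overlaps itself strongly, such as `AAAA`, has a longer expected waiting time than a non-overlapping word of the same length and probability, such as `ACGT`. This dependence on overlap structure is one reason significance criteria for words cannot be read off from word probability alone.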