This paper provides exact probability results for waiting times associated with occurrences of two types of motifs in a random sequence. First, we provide an explicit expression for the probability generating function of the interarrival time between two clumps of a pattern. In particular, this expression makes it possible to assess the quality of the Poisson approximation that is currently used to evaluate the distribution of the number of clumps of a pattern. Second, we provide explicit expressions for the probability generating functions of both the waiting time until the first occurrence of a structured motif and the interarrival time between its consecutive occurrences. Distributional results for structured motifs are of interest in genome analysis because such motifs are promoter candidates. As an application, we determine significant structured motifs in a data set of DNA regulatory sequences.
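To make the notion of a waiting time concrete, the following minimal sketch treats only the simplest case: a single word in an i.i.d. sequence. It is not the paper's method and does not handle clumps or structured motifs. It assumes independent letters with known probabilities, computes the mean waiting time from the word's self-overlaps (the classical overlap formula), and obtains the exact distribution by stepping through a prefix-matching automaton. The function names `expected_waiting_time` and `waiting_time_pmf` are hypothetical and chosen for illustration only.

```python
from fractions import Fraction


def expected_waiting_time(word, probs):
    """Mean waiting time until `word` first appears in an i.i.d. sequence.

    Assumption: letters are independent with probabilities `probs`.
    Uses the overlap formula: sum 1 / P(word[:k]) over every k for which
    the length-k prefix of `word` equals its length-k suffix.
    """
    total = Fraction(0)
    for k in range(1, len(word) + 1):
        if word[:k] == word[-k:]:
            p = Fraction(1)
            for letter in word[:k]:
                p *= probs[letter]
            total += 1 / p
    return total


def waiting_time_pmf(word, probs, n_max):
    """Return [P(T = n) for n = 0..n_max], where T is the position at which
    the first occurrence of `word` ends (illustrative sketch, i.i.d. model)."""
    m = len(word)

    def step(matched, letter):
        # Longest prefix of `word` that is a suffix of the text matched so far.
        text = word[:matched] + letter
        for k in range(min(m, len(text)), 0, -1):
            if text.endswith(word[:k]):
                return k
        return 0

    trans = [{c: step(s, c) for c in probs} for s in range(m)]
    state = [Fraction(0)] * m  # state[s]: probability of s matched letters, no hit yet
    state[0] = Fraction(1)
    pmf = [Fraction(0)] * (n_max + 1)
    for n in range(1, n_max + 1):
        new_state = [Fraction(0)] * m
        for s, mass in enumerate(state):
            if mass == 0:
                continue
            for c, p in probs.items():
                t = trans[s][c]
                if t == m:
                    pmf[n] += mass * p  # word completed for the first time at n
                else:
                    new_state[t] += mass * p
        state = new_state
    return pmf


# Usage: uniform DNA alphabet (an assumption made only for this example).
uniform = {c: Fraction(1, 4) for c in "ACGT"}
mean_aa = expected_waiting_time("AA", uniform)
mean_ac = expected_waiting_time("AC", uniform)
pmf_aa = waiting_time_pmf("AA", uniform, 10)
```

Under the uniform model, the self-overlapping word `AA` has a longer mean waiting time than `AC`, although both have the same occurrence probability at any fixed position. Overlap effects of this kind are the reason occurrences of a pattern tend to arrive in clumps.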