We present two novel approaches for computing the exact distribution of pattern occurrences in a long sequence. Both approaches exploit the sparse structure of the problem, and both are two-part algorithms. The first approach relies on a partial recursion performed after a fast computation of the second largest eigenvalue of the transition matrix of a Markov chain embedding. The second approach uses fast Taylor expansions of an exact bivariate rational reconstruction of the distribution. We illustrate the value of both approaches on a simple toy example and on two biological applications: the transcription factors of human chromosome 10 and the PROSITE signatures of functional motifs in proteins. On these examples the two methods prove complementary, and they extend the range of pattern problems for which exact computation is feasible.
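For readers unfamiliar with the Markov chain embedding mentioned above, the following is a minimal sketch of the underlying idea, not of either approach presented here. It assumes a single word, an i.i.d. letter model, and overlapping occurrences; it uses a plain full recursion over (automaton state, count) pairs, with none of the sparse, eigenvalue, or rational-reconstruction techniques. The function names `build_automaton` and `count_distribution` are hypothetical and chosen only for illustration.

```python
from fractions import Fraction


def build_automaton(word, alphabet):
    """Pattern-matching DFA for a single word (illustrative helper).

    State k means: the longest prefix of `word` that is a suffix of the
    text read so far has length k. State len(word) marks an occurrence.
    """
    m = len(word)
    delta = [{} for _ in range(m + 1)]
    for state in range(m + 1):
        for a in alphabet:
            s = word[:state] + a
            k = min(m, len(s))
            # Shrink k until word[:k] is a suffix of s (k = 0 always works).
            while k > 0 and s[-k:] != word[:k]:
                k -= 1
            delta[state][a] = k
    return delta


def count_distribution(word, probs, n):
    """Exact distribution of the number of (overlapping) occurrences of
    `word` in an i.i.d. sequence of length n with letter law `probs`.

    Returns a list dist with dist[c] = P(exactly c occurrences).
    """
    alphabet = list(probs)
    m = len(word)
    delta = build_automaton(word, alphabet)
    max_count = max(n - m + 1, 0)
    # current[state][c] = P(automaton in `state` with c occurrences so far)
    current = [[Fraction(0)] * (max_count + 1) for _ in range(m + 1)]
    current[0][0] = Fraction(1)
    for _ in range(n):
        nxt = [[Fraction(0)] * (max_count + 1) for _ in range(m + 1)]
        for state in range(m + 1):
            for c, p in enumerate(current[state]):
                if p == 0:
                    continue
                for a in alphabet:
                    t = delta[state][a]
                    # Entering the final state counts one new occurrence.
                    nxt[t][c + (1 if t == m else 0)] += p * probs[a]
        current = nxt
    return [sum(current[s][c] for s in range(m + 1))
            for c in range(max_count + 1)]


if __name__ == "__main__":
    law = {"a": Fraction(1, 2), "b": Fraction(1, 2)}
    print(count_distribution("aa", law, 3))
```

This recursion costs time proportional to the sequence length times the number of states times the maximum count, which is the kind of cost that becomes prohibitive for long sequences and large pattern sets.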