We present a comprehensive review of probabilistic arithmetic automata (PAAs), a general model for describing chains of operations whose operands depend on chance, along with two algorithms to numerically compute the distribution of the results of such probabilistic calculations. PAAs provide a unifying framework for many problems arising in computational biology and elsewhere. We present five applications: 1) pattern matching statistics on random texts, including the computation of the distribution of occurrence counts, waiting times, and clump sizes under hidden Markov background models; 2) exact analysis of window-based pattern matching algorithms; 3) sensitivity of filtration seeds used to detect candidate sequence alignments; 4) length and mass statistics of peptide fragments resulting from enzymatic cleavage reactions; and 5) read length statistics of 454 and IonTorrent sequencing reads. The diversity of these applications indicates the flexibility and unifying character of the framework. While the construction of a PAA depends on the particular application, we single out a frequently applicable construction method: we introduce deterministic arithmetic automata (DAAs) to model deterministic calculations on sequences, and demonstrate how to construct a PAA from a given DAA and a finite-memory random text model. This procedure is used for all five applications and greatly simplifies the construction of PAAs. Implementations are available as part of the MoSDi package, whose application programming interface facilitates the rapid development of new applications based on the PAA framework.
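To make the DAA-to-PAA construction concrete, the following Python sketch computes the distribution of occurrence counts of a single pattern in a random text. It is an illustration under simplifying assumptions, not the MoSDi API: the function names `build_match_automaton` and `occurrence_count_distribution` are hypothetical, the text model is i.i.d. (the simplest finite-memory model) rather than a hidden Markov model, and the distribution is obtained by a plain step-by-step update, which is not necessarily either of the two algorithms mentioned above. The deterministic part is a pattern-matching automaton that emits 1 whenever it reaches its final state; combining it with the character probabilities yields a joint distribution over (automaton state, accumulated count).

```python
from collections import defaultdict


def build_match_automaton(pattern, alphabet):
    """Deterministic automaton whose state q is the length of the longest
    prefix of `pattern` that is a suffix of the text read so far."""
    m = len(pattern)
    delta = {}
    for q in range(m + 1):
        for c in alphabet:
            s = pattern[:q] + c
            k = min(m, len(s))
            # Shrink k until pattern[:k] is a suffix of s (k = 0 always fits).
            while k > 0 and not s.endswith(pattern[:k]):
                k -= 1
            delta[(q, c)] = k
    return delta


def occurrence_count_distribution(pattern, char_probs, n):
    """Distribution of the number of (possibly overlapping) occurrences of
    `pattern` in an i.i.d. random text of length n.

    `char_probs` maps each character to its probability (assumed to sum to 1).
    """
    if not pattern:
        raise ValueError("pattern must be non-empty")
    alphabet = sorted(char_probs)
    delta = build_match_automaton(pattern, alphabet)
    m = len(pattern)

    # joint[(q, v)] = P(automaton is in state q and v occurrences were counted)
    joint = {(0, 0): 1.0}
    for _ in range(n):
        nxt = defaultdict(float)
        for (q, v), p in joint.items():
            for c in alphabet:
                q2 = delta[(q, c)]
                # Emission of the new state: 1 if a match ends here, else 0;
                # the arithmetic operation is addition.
                v2 = v + (1 if q2 == m else 0)
                nxt[(q2, v2)] += p * char_probs[c]
        joint = nxt

    # Marginalize over automaton states to get the count distribution.
    dist = defaultdict(float)
    for (_, v), p in joint.items():
        dist[v] += p
    return dict(dist)


if __name__ == "__main__":
    uniform = {"a": 0.5, "b": 0.5}
    print(occurrence_count_distribution("aa", uniform, 3))
```

The same pattern of computation carries over to other quantities by changing the emissions and the operation (for example, accumulating masses instead of counts); for a text model with memory, the joint state would additionally track the model's context.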