The distribution of subword counts is usually normal
European Journal of Combinatorics
An introduction to the analysis of algorithms
Average Case Analysis of Algorithms on Sequences
Size and path length of Patricia tries: dynamical sources context
Random Structures & Algorithms - Special issue on analysis of algorithms dedicated to Don Knuth on the occasion of his (100)8th birthday
Theoretical Computer Science
On the Approximate Pattern Occurrences in a Text
SEQUENCES '97 Proceedings of the Compression and Complexity of Sequences 1997
Journal of the ACM (JACM)
Counting occurrences for a finite set of words: Combinatorial methods
ACM Transactions on Algorithms (TALG)
In pattern matching algorithms, two characteristic parameters play an important role: the number of occurrences of a given pattern, and the number of positions where a pattern occurrence ends. Since several occurrences may end at the same position, these two parameters can differ significantly. Here, we consider a general framework where the text is produced by a probabilistic source that can be built from a dynamical system. Such "dynamical sources" encompass the classical sources (memoryless sources and Markov chains) and may possess a high degree of correlation. We are mainly interested in two situations: (i) the pattern is specified by a regular expression, and we study the number of occurrence positions; (ii) the pattern is a finite set of strings, and we study the number of occurrences. In both cases, we determine the mean and the variance of the parameter, and prove that its distribution is asymptotically Gaussian. In this way, we extend to this general "dynamical" framework methods and results that have already been obtained for classical sources (for instance in [9] and in [6]). As in previous works, our methods rely on formal languages and generating functions. However, in this correlated model a direct transfer into generating functions is not possible, and we mainly deal with generating operators, which in turn generate generating functions.
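To make the distinction between the two parameters concrete, here is a minimal sketch that computes both for a fixed text and a finite set of non-empty words. The function names `count_occurrences` and `count_end_positions` are hypothetical, chosen for this illustration only. The sketch uses naive scanning; it is not the generating-operator method of the paper, and it does not model a probabilistic source.

```python
def count_occurrences(text, patterns):
    """Count every (word, end position) pair where a word of `patterns`
    occurs in `text` ending at that position. Words are assumed non-empty."""
    return sum(
        1
        for end in range(1, len(text) + 1)
        for word in patterns
        if text[:end].endswith(word)
    )


def count_end_positions(text, patterns):
    """Count the positions of `text` at which at least one word of
    `patterns` ends. Words are assumed non-empty."""
    return sum(
        1
        for end in range(1, len(text) + 1)
        if any(text[:end].endswith(word) for word in patterns)
    )


if __name__ == "__main__":
    text = "abab"
    patterns = {"ab", "b"}
    # Both words end at positions 2 and 4, so the two counts differ.
    print(count_occurrences(text, patterns))    # 4 occurrences
    print(count_end_positions(text, patterns))  # 2 distinct end positions
```

In this example both words end at positions 2 and 4, giving four occurrences but only two occurrence positions. When the pattern is a single word, the two counts coincide.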