The distribution of subword counts is usually normal
European Journal of Combinatorics
Autocorrelation on words and its applications: analysis of suffix trees by string-ruler approach
Journal of Combinatorial Theory Series A
An introduction to the analysis of algorithms
A unified approach to word occurrence probabilities
Discrete Applied Mathematics - Special volume on combinatorial molecular biology
The Goulden-Jackson Cluster Method for Cyclic Words
Advances in Applied Mathematics
Efficient string matching: an aid to bibliographic search
Communications of the ACM
Average Case Analysis of Algorithms on Sequences
Theoretical Computer Science
On the Approximate Pattern Occurrences in a Text
SEQUENCES '97 Proceedings of the Compression and Complexity of Sequences 1997
Regexpcount, a symbolic package for counting problems on regular expressions and words
Fundamenta Informaticae - Special issue on computing patterns in strings
Applied Combinatorics on Words (Encyclopedia of Mathematics and its Applications)
Algorithms on Strings
Analytic Combinatorics
Pattern matching statistics on correlated sources
LATIN'06 Proceedings of the 7th Latin American conference on Theoretical Informatics
In this article, we derive the multivariate generating function that counts texts according to their length and to the number of occurrences of words from a finite set. The derivation applies the inclusion-exclusion principle to word counting, following Goulden and Jackson [1979, 1983]. Unlike some other techniques, which assume that the set of words is reduced (i.e., that no word is a factor of another), our approach allows the finite set to be chosen arbitrarily. Noonan and Zeilberger [1999] already provided a Maple package treating the nonreduced case, but without giving an expression for the generating function or a detailed proof. We provide a complete proof validating the use of the inclusion-exclusion principle. As an application of our methodology, we give formulæ for the expected value, variance, and covariance of the number of occurrences when considering two arbitrary finite sets of words.
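To make concrete what such a generating function counts, the following minimal sketch tabulates its coefficients by brute-force enumeration. It is not the inclusion-exclusion (Goulden-Jackson) derivation used in the article, only a reference computation for very small cases. The function names `count_occurrences` and `occurrence_distribution` are hypothetical, and the sketch assumes one counting variable per word and that overlapping occurrences are all counted.

```python
from collections import Counter
from itertools import product


def count_occurrences(text, word):
    """Number of (possibly overlapping) occurrences of word in text."""
    return sum(
        1
        for i in range(len(text) - len(word) + 1)
        if text.startswith(word, i)
    )


def occurrence_distribution(alphabet, words, n):
    """Map each tuple (k_1, ..., k_r) to the number of length-n texts
    in which words[i] occurs exactly k_i times.

    Brute force over all len(alphabet)**n texts, so only usable for tiny n.
    The word set need not be reduced: one word may be a factor of another.
    """
    dist = Counter()
    for letters in product(alphabet, repeat=n):
        text = "".join(letters)
        key = tuple(count_occurrences(text, w) for w in words)
        dist[key] += 1
    return dict(dist)


if __name__ == "__main__":
    # Nonreduced set: "a" is a factor of "aa".
    print(occurrence_distribution("ab", ["a", "aa"], 2))
```

Under these assumptions, the value stored at key `(k_1, ..., k_r)` for length `n` corresponds to the coefficient of `z^n x_1^{k_1} ... x_r^{k_r}` in the generating function, which gives a simple way to sanity-check a closed-form expression on short texts.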