Counting occurrences for a finite set of words: Combinatorial methods

Authors:
Frédérique Bassino;Julien Clément;Pierre Nicodème
Affiliations:
Université de Paris 13, France;Université de Caen, France;École Polytechnique, France
Venue:
ACM Transactions on Algorithms (TALG)
Year:
2012

Citing 14
Cited 0

The distribution of subword counts is usually normal. European Journal of Combinatorics.
Autocorrelation on words and its applications: analysis of suffix trees by string-ruler approach. Journal of Combinatorial Theory Series A.
An introduction to the analysis of algorithms.
A unified approach to word occurrence probabilities. Discrete Applied Mathematics - Special volume on combinatorial molecular biology.
The Goulden-Jackson Cluster Method for Cyclic Words. Advances in Applied Mathematics.
Efficient string matching: an aid to bibliographic search. Communications of the ACM.
Average Case Analysis of Algorithms on Sequences.
Motif statistics. Theoretical Computer Science.
On the Approximate Pattern Occurrences in a Text. SEQUENCES '97 Proceedings of the Compression and Complexity of Sequences 1997.
Regexpcount, a symbolic package for counting problems on regular expressions and words. Fundamenta Informaticae - Special issue on computing patterns in strings.
Applied Combinatorics on Words (Encyclopedia of Mathematics and its Applications).
Algorithms on Strings.
Analytic Combinatorics.
Pattern matching statistics on correlated sources. LATIN'06 Proceedings of the 7th Latin American conference on Theoretical Informatics.

Abstract

In this article, we provide the multivariate generating function counting texts according to their length and to the number of occurrences of words from a finite set. The result is derived by applying the inclusion-exclusion principle to word counting, following Goulden and Jackson [1979, 1983]. Unlike some other techniques, which assume that the set of words is reduced (i.e., that no word of the set is a factor of another), our approach allows the finite set to be chosen arbitrarily. Noonan and Zeilberger [1999] already provided a Maple package treating the nonreduced case, without giving an expression of the generating function or a detailed proof. We provide a complete proof validating the use of the inclusion-exclusion principle. As an application of our methodology, we give some formulæ for the expected value, variance, and covariance of the number of occurrences when considering two arbitrary finite sets of words.
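
To make the counted object concrete, the following is a minimal brute-force sketch in Python, not the inclusion-exclusion method of the article. It enumerates all texts of a given length and tabulates how many texts realize each tuple of occurrence counts; each table entry corresponds to one coefficient of the multivariate generating function (texts of length n with k_i occurrences of the i-th word). The function names `count_occurrences` and `occurrence_table`, the binary alphabet, and the example set {ab, aba} are assumptions chosen for illustration; the set is deliberately nonreduced, since ab is a factor of aba.

```python
from collections import Counter
from itertools import product


def count_occurrences(text, word):
    """Count occurrences of `word` in `text`, overlapping ones included."""
    return sum(
        1
        for i in range(len(text) - len(word) + 1)
        if text.startswith(word, i)
    )


def occurrence_table(alphabet, words, n):
    """Map each tuple of occurrence counts to the number of length-n texts.

    Brute force over all len(alphabet)**n texts, so only usable for small n.
    """
    table = Counter()
    for letters in product(alphabet, repeat=n):
        text = "".join(letters)
        counts = tuple(count_occurrences(text, u) for u in words)
        table[counts] += 1
    return table


# Hypothetical example: nonreduced set {ab, aba} over the alphabet {a, b}.
table = occurrence_table("ab", ["ab", "aba"], 4)
for counts, number_of_texts in sorted(table.items()):
    print(counts, number_of_texts)
```

The enumeration grows exponentially with the text length, so it is only useful as a check on small cases against a closed-form generating function.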