Frequency of symbol occurrences in simple non-primitive stochastic models

Authors:
Diego de Falco; Massimiliano Goldwurm; Violetta Lonati
Affiliations:
Università degli Studi di Milano, Dipartimento di Scienze dell'Informazione, Milano, Italy (all authors)
Venue:
DLT'03: Proceedings of the 7th International Conference on Developments in Language Theory
Year:
2003

Abstract

We study the random variable Y_n, the number of occurrences of a given symbol in a word of length n generated at random. The stochastic model is a simple non-ergodic one defined by the product of two primitive rational formal series, which form two distinct ergodic components. We obtain asymptotic evaluations of the mean and the variance of Y_n, together with its limit distribution. Two main cases arise: if one component is dominant and nondegenerate, the limit distribution is Gaussian; if the two components are equipotent and the leading terms of their means differ, the limit distribution is uniform. Other limit distributions arise when the dominant component is degenerate, and in the equipotent case when the leading terms of the means coincide.
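
The two main cases of the abstract can be illustrated with a small simulation. The sketch below is not the authors' construction: it assumes the simplest possible instance, in which each component is a one-state series that emits the symbol a with probability p1 (respectively p2) and has growth rate lam1 (respectively lam2). Under that assumption, the product series gives a word of length n a weight summed over all split points k, so a random word can be drawn by choosing k with probability proportional to lam1^k * lam2^(n-k) and then drawing the prefix and suffix letters independently. The names sample_ratio and simulate, and all parameter values, are hypothetical.

```python
import random
import statistics


def sample_ratio(n, p1, p2, lam1, lam2, rng):
    """Draw one word of length n and return Y_n / n (the fraction of 'a')."""
    # Weight of splitting the word after k letters: lam1**k * lam2**(n - k).
    # (Assumption: one-state components; very large n could underflow.)
    weights = [lam1 ** k * lam2 ** (n - k) for k in range(n + 1)]
    k = rng.choices(range(n + 1), weights=weights)[0]
    # Prefix letters come from component 1, suffix letters from component 2.
    count_a = sum(rng.random() < p1 for _ in range(k))
    count_a += sum(rng.random() < p2 for _ in range(n - k))
    return count_a / n


def simulate(n, p1, p2, lam1, lam2, samples=2000, seed=0):
    """Return the empirical mean and standard deviation of Y_n / n."""
    rng = random.Random(seed)
    ratios = [sample_ratio(n, p1, p2, lam1, lam2, rng) for _ in range(samples)]
    return statistics.mean(ratios), statistics.pstdev(ratios)


if __name__ == "__main__":
    # Equipotent components (lam1 == lam2) with different symbol frequencies.
    print("equipotent:", simulate(400, 0.75, 0.25, 1.0, 1.0))
    # Component 1 dominant (lam1 > lam2).
    print("dominant:  ", simulate(400, 0.75, 0.25, 1.0, 0.5))
```

In the equipotent run the split point is uniform, so Y_n/n spreads across the interval between p2 and p1 and its standard deviation does not shrink as n grows, which is consistent with a uniform limit. In the dominant run almost the whole word comes from component 1, so Y_n/n concentrates near p1 with fluctuations of order 1/sqrt(n), as expected for a Gaussian limit.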