Repetition Complexity of Words

Authors:
Lucian Ilie;Sheng Yu;Kaizhong Zhang
Affiliations:
-;-;-
Venue:
COCOON '02 Proceedings of the 8th Annual International Conference on Computing and Combinatorics
Year:
2002

Abstract

With ideas from data compression and combinatorics on words, we introduce a complexity measure for words, called repetition complexity, which quantifies the amount of repetition in a word. The repetition complexity of w, r(w), is defined as the smallest amount of space needed to store w when reduced by repeatedly applying the following procedure: n consecutive occurrences uu...u of the same subword u of w are stored as (u, n). The repetition complexity has interesting relations with well-known complexity measures, such as subword complexity, sub, and Lempel-Ziv complexity, lz. We always have r(w) ≥ lz(w), and it may even be that the former is linear while the latter is only logarithmic; e.g., this happens for prefixes of certain infinite words obtained by iterated morphisms. An infinite word α being ultimately periodic is equivalent to: (i) sub(pref_n(α)) = O(n), (ii) lz(pref_n(α)) = O(1), and (iii) r(pref_n(α)) = lg n + O(1). De Bruijn words, well known for their high subword complexity, are shown to have almost the highest repetition complexity; the precise complexity remains open. r(w) can be computed in time O(n^3 (log n)^2); finding substantially faster algorithms is open and probably very difficult.
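
The reduction procedure in the abstract can be illustrated with a small dynamic program over subwords. This is only a sketch under an assumed cost model, since the abstract does not say how "space" is counted: here each letter costs 1 and each exponent n costs its number of decimal digits. The function name repetition_cost is hypothetical, and this is not the authors' O(n^3 (log n)^2) algorithm; it favors clarity over speed and is meant for short words.

```python
from functools import lru_cache


def repetition_cost(w: str) -> int:
    """Smallest size of w after replacing runs uu...u by (u, n).

    Assumed cost model (not taken from the abstract): one unit per
    letter, plus the number of decimal digits of each exponent n.
    """
    if not w:
        return 0

    @lru_cache(maxsize=None)
    def cost(i: int, j: int) -> int:
        # Cheapest representation of the subword w[i:j].
        length = j - i
        if length == 1:
            return 1
        best = length  # fallback: store every letter verbatim
        # Option 1: split into two parts and reduce each independently.
        for k in range(i + 1, j):
            best = min(best, cost(i, k) + cost(k, j))
        # Option 2: w[i:j] is u repeated n times; store (u, n),
        # where u itself may be reduced further.
        for p in range(1, length // 2 + 1):
            if length % p == 0:
                n = length // p
                if w[i:j] == w[i:i + p] * n:
                    best = min(best, cost(i, i + p) + len(str(n)))
        return best

    return cost(0, len(w))


print(repetition_cost("abababab"))  # (ab, 4): two letters plus one digit -> 3
print(repetition_cost("abc"))       # no repetition, stored verbatim -> 3
```

Under this assumed model a run of a single letter costs one unit plus the digit count of the run length, which grows logarithmically in n, in the spirit of the lg n + O(1) bound stated for prefixes of ultimately periodic words.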