Distinct squares in run-length encoded strings

  • Authors:
  • J. J. Liu

  • Affiliations:
  • -

  • Venue:
  • Theoretical Computer Science
  • Year:
  • 2010

Quantified Score

Hi-index 5.23

Visualization

Abstract

Squares are strings of the form ww where w is any nonempty string. Two squares ww and w^'w^' are of different types if and only if ww^'. Fraenkel and Simpson [Avieri S. Fraenkel, Jamie Simpson, How many squares can a string contain? Journal of Combinatorial Theory, Series A 82 (1998) 112-120] proved that the number of square types contained in a string of length n is bounded by O(n). The set of all different square types contained in a string is called the vocabulary of the string. If a square can be obtained by a series of successive right-rotations from another square, then we say the latter covers the former. A square is called a c-square if no square with a smaller index can cover it and it is not a trivial square. The set containing all c-squares is called the covering set. Note that every string has a unique covering set. Furthermore, the vocabulary of the covering set are called c-vocabulary. In this paper, we prove that the cardinality of c-vocabulary in a string is less than 143N, where N is the number of runs in this string.