A note on the estimation of string complexity for short strings

  • Authors: Ulrich Speidel

  • Affiliations: Department of Computer Science, The University of Auckland, Auckland, New Zealand

  • Venue: ICICS'09 Proceedings of the 7th international conference on Information, communications and signal processing
  • Year: 2009

Abstract

While Kolmogorov's well-known results show that absolute string complexity is not computable, its estimation is nevertheless important in fields such as randomness testing, data compression evaluation, similarity measurement, and event detection. Several complexity estimators have been developed over the years, with the Lempel-Ziv parsers being the most prominent. The estimators generally agree in their asymptotic behaviour for long strings, but they differ considerably for short strings. Short strings are, however, exactly the kind of data encountered in many practical application areas. This paper proposes a method for evaluating the comparative performance of such estimators and presents experimental results for Lempel-Ziv parsers and a more recent estimator, the T-complexity.
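
To illustrate what a parser-based complexity estimator can look like, the following is a minimal sketch of one member of the Lempel-Ziv family: LZ78-style incremental parsing, with the number of parsed phrases taken as the complexity estimate. The abstract does not say which Lempel-Ziv variants the paper evaluates, so the choice of LZ78 is an assumption, and the function name `lz78_phrase_count` is hypothetical. T-complexity is not shown.

```python
def lz78_phrase_count(s: str) -> int:
    """Estimate complexity as the number of phrases in an LZ78-style parse.

    Assumption: this is an illustrative LZ78 incremental parse, not
    necessarily the parser variant evaluated in the paper.
    """
    phrases = set()   # phrases seen so far (the parser's dictionary)
    current = ""      # phrase currently being extended
    count = 0

    for ch in s:
        current += ch
        if current not in phrases:
            # Shortest prefix not yet in the dictionary: close the phrase.
            phrases.add(current)
            count += 1
            current = ""

    if current:
        # Trailing fragment that duplicates an existing phrase still
        # counts as one (incomplete) phrase.
        count += 1
    return count


if __name__ == "__main__":
    # Short strings of equal length can yield quite different counts.
    for text in ("aaaaaaaa", "abababab", "abcdefgh"):
        print(text, lz78_phrase_count(text))
```

For these eight-character inputs the phrase counts are 4, 5, and 8 respectively. With so few phrases to count, small differences in parsing rules can change the estimate noticeably, which is the short-string regime the abstract refers to.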