Indexing text using the Ziv-Lempel trie

  • Authors:
  • Gonzalo Navarro

  • Affiliations:
  • Department of Computer Science, Univ. of Chile, Blanco Encalada 2120, Santiago, Chile

  • Venue:
  • Journal of Discrete Algorithms - SPIRE 2002
  • Year:
  • 2004

Quantified Score

Hi-index 0.00

Visualization

Abstract

Let a text of u characters over an alphabet of size σ be compressible to n phrases by the LZ78 algorithm. We show how to build a data structure based on the Ziv-Lempel trie, called the LZ-index, that takes 4n log2 n (1 + o(1)) bits of space (that is, 4 times the entropy of the text for ergodic sources) and reports the R occurrences of a pattern of length m in worst case time O (m3 log σ + (m + R) log n). We present a practical implementation of the LZ-index, which is faster than current alternatives when we take into consideration the time to report the positions or text contexts of the occurrences found.