Word-Based Compression Methods and Indexing for Text Retrieval Systems

  • Authors:
  • Jiri Dvorský;Jaroslav Pokorný;Václav Snásel

  • Venue:
  • ADBIS '99 Proceedings of the Third East European Conference on Advances in Databases and Information Systems
  • Year:
  • 1999

Abstract

In this article we present a new compression method, called WLZW, which is a word-based modification of classic LZW. The modification is similar to the approach used in the HuffWord compression algorithm. The algorithm is two-phase, and the compression ratio achieved is fairly good, averaging 20%-22% (see [2],[3]). Moreover, the table of words, which is a side product of compression, can be used to create a full-text index, especially for dynamic text databases. The overhead of the index is acceptably low.
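
The abstract does not give the details of WLZW, so the following is only a minimal sketch of the general idea of a two-phase, word-based LZW, not the authors' implementation. It assumes that phase one splits the text into alternating word and non-word tokens and builds a table of distinct tokens, and that phase two runs classic LZW over the token ids instead of over characters. All function names (tokenize, build_word_table, lzw_encode, lzw_decode, compress, decompress) are hypothetical and chosen for illustration.

```python
import re


def tokenize(text):
    # Assumption: tokens are maximal runs of word characters or of non-word characters.
    return re.findall(r"\w+|\W+", text)


def build_word_table(tokens):
    # Phase 1: give each distinct token an integer id in order of first appearance.
    table = {}
    for tok in tokens:
        if tok not in table:
            table[tok] = len(table)
    return table


def lzw_encode(symbols, alphabet_size):
    # Phase 2: classic LZW where the alphabet is the set of token ids.
    phrases = {(s,): s for s in range(alphabet_size)}
    out = []
    current = ()
    for s in symbols:
        candidate = current + (s,)
        if candidate in phrases:
            current = candidate
        else:
            out.append(phrases[current])
            phrases[candidate] = len(phrases)
            current = (s,)
    if current:
        out.append(phrases[current])
    return out


def lzw_decode(codes, alphabet_size):
    # Inverse of lzw_encode; rebuilds the phrase dictionary while decoding.
    phrases = {s: (s,) for s in range(alphabet_size)}
    if not codes:
        return []
    prev = phrases[codes[0]]
    result = list(prev)
    for code in codes[1:]:
        if code in phrases:
            entry = phrases[code]
        else:
            # The code refers to the phrase that is being defined right now.
            entry = prev + (prev[0],)
        result.extend(entry)
        phrases[len(phrases)] = prev + (entry[0],)
        prev = entry
    return result


def compress(text):
    tokens = tokenize(text)
    table = build_word_table(tokens)
    codes = lzw_encode([table[t] for t in tokens], len(table))
    return table, codes


def decompress(table, codes):
    # Invert the word table (token -> id) into a list indexed by id.
    words = [None] * len(table)
    for tok, idx in table.items():
        words[idx] = tok
    return "".join(words[i] for i in lzw_decode(codes, len(table)))
```

In this sketch the word table returned by compress is the "side product" mentioned in the abstract: because it already maps every distinct word to an id, it could plausibly serve as the vocabulary of a full-text index, although how WLZW actually builds its index is not described here.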