Binary Codes for Non-Uniform Sources

  • Authors:
  • Alistair Moffat; Vo Ngoc Anh

  • Affiliations:
  • The University of Melbourne, Victoria, Australia;The University of Melbourne, Victoria, Australia

  • Venue:
  • DCC '05 Proceedings of the Data Compression Conference
  • Year:
  • 2005

Abstract

In many applications of compression, decoding speed is at least as important as compression effectiveness. For example, the large inverted indexes associated with text retrieval mechanisms are best stored compressed, but a working system must also process queries at high speed. Here we present two coding methods that make use of fixed binary representations. They have all of the consequent benefits in terms of decoding performance, but are also sensitive to localized variations in the source data, and in practice give excellent compression. The methods are validated by applying them to various test data, including the index of an 18 GB document collection.
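To make the idea of a fixed binary representation that adapts to local variation concrete, here is a minimal sketch in Python. It is not the method of the paper, whose details the abstract does not give; it is one generic block-wise scheme, assumed purely for illustration. The function names, the block size of 4, and the 5-bit width selector are all hypothetical choices. Each block of positive integers (for example, gaps between document numbers in an inverted index) is stored with a single bit width chosen for that block, so decoding needs only fixed-width reads.

```python
def encode_blocks(values, block_size=4, selector_bits=5):
    """Encode positive integers as a bit string, one fixed width per block.

    Illustrative only: block_size and selector_bits are assumed parameters,
    not values taken from the paper.
    """
    if any(v < 1 for v in values):
        raise ValueError("values must be positive integers")
    out = []
    for start in range(0, len(values), block_size):
        block = values[start:start + block_size]
        # The widest value in the block sets the width for the whole block.
        width = max(v.bit_length() for v in block)
        if width >= (1 << selector_bits):
            raise ValueError("value too large for the selector field")
        # Selector: the width itself, in a fixed number of bits.
        out.append(format(width, f"0{selector_bits}b"))
        # Payload: every value in the block, each in exactly `width` bits.
        out.extend(format(v, f"0{width}b") for v in block)
    return "".join(out)


def decode_blocks(bits, count, block_size=4, selector_bits=5):
    """Invert encode_blocks; `count` is the number of encoded values."""
    values, pos = [], 0
    while len(values) < count:
        width = int(bits[pos:pos + selector_bits], 2)
        pos += selector_bits
        for _ in range(min(block_size, count - len(values))):
            values.append(int(bits[pos:pos + width], 2))
            pos += width
    return values


# A run of small values followed by a run of large ones.
gaps = [3, 1, 2, 1, 120, 95, 130, 101]
bits = encode_blocks(gaps)
assert decode_blocks(bits, len(gaps)) == gaps
```

With these assumed parameters the eight values occupy 50 bits, because the first block needs only 2 bits per value. Forcing a single width over the whole list (`block_size=8`) takes 69 bits. The decoder never inspects individual bits to find codeword boundaries, which is the kind of property that makes fixed binary codes fast to decode.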