Index compression using fixed binary codewords

  • Authors: Vo Ngoc Anh; Alistair Moffat

  • Affiliations: The University of Melbourne, Victoria, Australia; The University of Melbourne, Victoria, Australia

  • Venue: ADC '04: Proceedings of the 15th Australasian Database Conference, Volume 27
  • Year: 2004

Abstract

Document retrieval and web search engines index large quantities of text. The static costs associated with storing the index can be traded against dynamic costs associated with using it during query evaluation. Typically, index representations that are effective and obtain good compression tend not to be efficient, in that they require more operations during query processing. In this paper we describe a scheme for compressing lists of integers as sequences of fixed binary codewords that has the twin benefits of being both effective and efficient. Experimental results are given on several large text collections to validate these claims.
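To make the idea of fixed binary codewords concrete, the following is a minimal sketch and not the codeword layout used in the paper, which the abstract does not specify. It assumes an inverted list of strictly increasing positive document numbers, converts it to gaps, and stores each block of gaps in binary at a single fixed width preceded by a small width selector. The names `to_gaps`, `encode`, `decode`, `BLOCK` and `WIDTH_BITS` are hypothetical, and a string of '0'/'1' characters stands in for a real bit buffer.

```python
BLOCK = 4        # assumed number of gaps sharing one codeword width
WIDTH_BITS = 5   # assumed selector size: stores width - 1, so widths 1..32


def to_gaps(doc_ids):
    """Turn strictly increasing positive document numbers into gaps (all >= 1)."""
    gaps = []
    prev = 0
    for d in doc_ids:
        if d <= prev:
            raise ValueError("document numbers must be positive and strictly increasing")
        gaps.append(d - prev)
        prev = d
    return gaps


def encode(doc_ids):
    """Encode each block as a width selector followed by fixed-width binary codewords."""
    gaps = to_gaps(doc_ids)
    out = []
    for i in range(0, len(gaps), BLOCK):
        block = gaps[i:i + BLOCK]
        # Each gap g is stored as g - 1; the block width fits its largest value.
        width = max(1, max(g - 1 for g in block).bit_length())
        if width > 2 ** WIDTH_BITS:
            raise ValueError("gap too large for the assumed selector size")
        out.append(format(width - 1, "0{}b".format(WIDTH_BITS)))
        for g in block:
            out.append(format(g - 1, "0{}b".format(width)))
    return "".join(out)


def decode(bits, count):
    """Rebuild `count` document numbers from the bit string produced by encode."""
    doc_ids = []
    pos = 0
    prev = 0
    while len(doc_ids) < count:
        width = int(bits[pos:pos + WIDTH_BITS], 2) + 1
        pos += WIDTH_BITS
        for _ in range(min(BLOCK, count - len(doc_ids))):
            prev += int(bits[pos:pos + width], 2) + 1
            pos += width
            doc_ids.append(prev)
    return doc_ids


ids = [3, 4, 9, 12, 40, 41, 45, 100]
packed = encode(ids)
restored = decode(packed, len(ids))
```

Decoding a block needs only one selector read followed by fixed-width extractions, which illustrates why binary codewords can be cheap to process at query time. The block size and selector width trade compression against that simplicity; the values here are arbitrary choices for illustration.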