Fast construction of the HYB index

  • Authors:
  • Hannah Bast;Marjan Celikik

  • Affiliations:
  • Albert Ludwigs University;Albert Ludwigs University

  • Venue:
  • ACM Transactions on Information Systems (TOIS)
  • Year:
  • 2011

Abstract

As shown in a series of recent works, the HYB index is an alternative to the inverted index (INV) that enables very fast prefix searches, which in turn are the basis for fast processing of many other types of advanced queries, including autocompletion, faceted search, error-tolerant search, database-style select and join, and semantic search. In this work we show that HYB can be constructed at least as fast as INV, and often up to twice as fast. This is because HYB, by its nature, requires only a half-inversion of the data and allows efficient in-place index construction instead of the traditional merge-based construction. We also pay particular attention to the cache efficiency of the in-memory posting accumulation, an issue that has not been addressed in previous work, and show that our simple multilevel posting accumulation scheme yields far fewer cache misses than related approaches. Finally, we show that HYB supports fast dynamic index updates more easily than INV.
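
To make the notion of a "half-inversion" more concrete, the following is a minimal sketch of a HYB-style index. It is not the construction algorithm of this paper. It assumes the general layout described in the earlier HYB works: the sorted vocabulary is cut into ranges of words (blocks), and each block stores a single list of (document id, word id) pairs ordered by document id. The names `build_hyb`, `prefix_search`, and `WORDS_PER_BLOCK`, the fixed block size, and the whitespace tokenization are all illustrative assumptions.

```python
from bisect import bisect_left
from collections import defaultdict

# Hypothetical fixed block size; how block boundaries are really chosen
# is not stated in the abstract.
WORDS_PER_BLOCK = 2


def build_hyb(docs):
    """Build a toy HYB-style index from a list of document strings.

    Returns the sorted vocabulary and a dict mapping a block number to a
    list of (doc_id, word_id) pairs ordered by doc_id.
    """
    vocab = sorted({w for text in docs for w in text.split()})
    word_id = {w: i for i, w in enumerate(vocab)}
    blocks = defaultdict(list)
    # Documents are visited in id order, so every block list ends up
    # sorted by doc_id without a separate per-word inversion step.
    for doc_id, text in enumerate(docs):
        for wid in sorted({word_id[w] for w in text.split()}):
            blocks[wid // WORDS_PER_BLOCK].append((doc_id, wid))
    return vocab, dict(blocks)


def prefix_search(vocab, blocks, prefix):
    """Return the sorted ids of documents containing a word with the prefix."""
    # Words sharing a prefix form a contiguous id range [lo, hi).
    lo = bisect_left(vocab, prefix)
    hi = lo
    while hi < len(vocab) and vocab[hi].startswith(prefix):
        hi += 1
    if lo == hi:
        return []
    hits = set()
    # Scan only the blocks that overlap the word range and filter by word id.
    for b in range(lo // WORDS_PER_BLOCK, (hi - 1) // WORDS_PER_BLOCK + 1):
        for doc_id, wid in blocks.get(b, []):
            if lo <= wid < hi:
                hits.add(doc_id)
    return sorted(hits)


docs = [
    "fast index construction",
    "inverted index",
    "prefix search is fast",
    "semantic search",
]
vocab, blocks = build_hyb(docs)
print(prefix_search(vocab, blocks, "in"))  # documents with "index" or "inverted"
print(prefix_search(vocab, blocks, "se"))  # documents with "search" or "semantic"
```

In this toy layout a prefix query touches a few block lists and filters them by word id range, whereas an inverted index would have to merge one posting list per matching word. The sketch keeps everything in memory and does not model the in-place construction, the multilevel posting accumulation, or the dynamic updates discussed in the abstract.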