Out of the Box Phrase Indexing

  • Authors:
  • Frederik Transier; Peter Sanders

  • Affiliations:
  • SAP NetWeaver EIM TREX, SAP AG, Walldorf, Germany and University of Karlsruhe, Karlsruhe, Germany; University of Karlsruhe, Karlsruhe, Germany

  • Venue:
  • SPIRE '08 Proceedings of the 15th International Symposium on String Processing and Information Retrieval
  • Year:
  • 2008

Abstract

We present a method for optimizing phrase search based on inverted indexes. Our approach adds selected two-term phrases to an existing index. Whereas competing approaches are often based on the analysis of query logs, our approach works out of the box and uses only the information contained in the index. Our method is competitive in terms of query performance and can even improve on other approaches for difficult queries. Moreover, it gives performance guarantees for arbitrary queries. We further propose using a phrase index as a substitute for the positional index of an in-memory search engine working with short documents. We support our conclusions with experiments using a high-performance main-memory search engine, and we give evidence that classical disk-based systems can also profit from our approach.
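
The general idea of adding two-term phrases to an existing inverted index can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not say how phrases are selected, so the document-frequency threshold `min_df` used here is purely an assumption, as are all function names. The sketch only shows that a phrase index can be derived from the information already in a positional index, and that queries for unselected phrases can fall back to intersecting positional postings.

```python
from collections import defaultdict


def build_positional_index(docs):
    """Map each term to {doc_id: [positions]} (hypothetical helper)."""
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, text in enumerate(docs):
        for pos, term in enumerate(text.lower().split()):
            index[term][doc_id].append(pos)
    return index


def build_phrase_index(index, min_df):
    """Derive postings for adjacent term pairs from the index alone.

    Assumption: a pair is selected when both of its terms occur in at
    least min_df documents. The paper's actual criterion may differ.
    """
    frequent = {t for t, postings in index.items() if len(postings) >= min_df}
    # Recover which frequent term sits at each (doc_id, position).
    term_at = {}
    for term in frequent:
        for doc_id, positions in index[term].items():
            for pos in positions:
                term_at[(doc_id, pos)] = term
    phrase_index = defaultdict(set)
    for (doc_id, pos), first in term_at.items():
        second = term_at.get((doc_id, pos + 1))
        if second is not None:
            phrase_index[(first, second)].add(doc_id)
    return phrase_index


def phrase_search(index, phrase_index, first, second):
    """Return sorted doc ids containing `first` immediately before `second`."""
    key = (first, second)
    if key in phrase_index:
        # Selected phrase: a single lookup, no position checks needed.
        return sorted(phrase_index[key])
    # Fallback: intersect the two posting lists and compare positions.
    a = index.get(first, {})
    b = index.get(second, {})
    hits = []
    for doc_id in a.keys() & b.keys():
        following = set(b[doc_id])
        if any(p + 1 in following for p in a[doc_id]):
            hits.append(doc_id)
    return sorted(hits)


docs = ["new york city", "york new city", "new york times", "city of new york"]
index = build_positional_index(docs)
phrases = build_phrase_index(index, min_df=3)
print(phrase_search(index, phrases, "new", "york"))    # answered from the phrase index
print(phrase_search(index, phrases, "york", "times"))  # answered by the positional fallback
```

In this toy setup, frequent term pairs are exactly the ones whose positional intersection would touch the longest posting lists, which is why precomputing them is attractive. A phrase index that replaces the positional index entirely, as proposed for short documents, would need a different fallback than the one shown.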