Generating suggestions for queries in the long tail with an inverted index

  • Authors:
  • Daniele Broccolo;Lorenzo Marcon;Franco Maria Nardini;Raffaele Perego;Fabrizio Silvestri

  • Affiliations:
  • Istituto di Scienza e Tecnologie dell'Informazione "A. Faedo", CNR, Pisa, Italy and Università "Ca' Foscari" Venezia, Italy;Istituto di Scienza e Tecnologie dell'Informazione "A. Faedo", CNR, Pisa, Italy;Istituto di Scienza e Tecnologie dell'Informazione "A. Faedo", CNR, Pisa, Italy;Istituto di Scienza e Tecnologie dell'Informazione "A. Faedo", CNR, Pisa, Italy;Istituto di Scienza e Tecnologie dell'Informazione "A. Faedo", CNR, Pisa, Italy

  • Venue:
  • Information Processing and Management: an International Journal
  • Year:
  • 2012

Abstract

This paper proposes an efficient and effective solution to the problem of choosing which queries to suggest to web search engine users to help them rapidly satisfy their information needs. By exploiting a weak function for assessing the similarity between the current query and the knowledge base built from historical users' sessions, we reduce the suggestion generation phase to the processing of a full-text query over an inverted index. The resulting query recommendation technique is very efficient and scalable, and is less affected by the data-sparsity problem than most state-of-the-art proposals. Thus, it is particularly effective in generating suggestions for rare queries occurring in the long tail of the query popularity distribution. The quality of the generated suggestions is assessed by evaluating their effectiveness in forecasting the users' behavior recorded in historical query logs, and on the basis of the results of a reproducible user study conducted on publicly available, human-assessed data. The experimental evaluation shows that our proposal remarkably outperforms two other state-of-the-art solutions, and that it can generate useful suggestions even for rare and never-seen queries.
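
The idea of treating suggestion generation as a full-text query over an inverted index can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify how the knowledge base is built or how similarity is scored. The sketch assumes that the last query of each historical session is a candidate suggestion, that the terms of all queries in that session form its "virtual document", and that candidates are ranked by summed term frequency. The names `build_index` and `suggest` are hypothetical.

```python
from collections import Counter, defaultdict


def build_index(sessions):
    """Map each term to the candidate suggestions whose sessions contain it.

    Assumption (not from the paper's abstract): the last query of a session
    is the candidate suggestion, and the terms of every query in the session
    form the candidate's "virtual document".
    """
    index = defaultdict(Counter)
    for session in sessions:
        candidate = session[-1]
        for query in session:
            for term in query.lower().split():
                index[term][candidate] += 1
    return index


def suggest(index, query, k=3):
    """Rank candidates by summed term frequency over the query's terms.

    This plain term-overlap score stands in for the "weak" similarity
    function; the real scoring function is an assumption here.
    """
    scores = Counter()
    for term in set(query.lower().split()):
        for candidate, tf in index.get(term, {}).items():
            if candidate != query:  # do not suggest the query itself
                scores[candidate] += tf
    # Highest score first; ties broken alphabetically for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [candidate for candidate, _ in ranked[:k]]


# Toy historical sessions (invented for illustration only).
sessions = [
    ["cheap flights", "cheap flights pisa", "ryanair pisa"],
    ["pisa tower", "leaning tower tickets"],
    ["flights rome", "alitalia booking"],
]
index = build_index(sessions)
print(suggest(index, "flights pisa airport"))
```

Because matching happens at the term level, the query "flights pisa airport" still retrieves candidates even though it never appears in the toy log, which is the property that makes this style of approach attractive for long-tail queries.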