Asking what no one has asked before: using phrase similarities to generate synthetic web search queries

  • Authors: Marius Pasca
  • Affiliations: Google Inc., Mountain View, CA, USA
  • Venue: Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year: 2011

Abstract

This paper introduces a method for automatically inferring meaningful, not-yet-submitted queries. The inferred queries fill some of the knowledge gaps between documents, on one hand, and known (i.e., already-submitted) queries, on the other hand. Thus, the inferred queries expand query logs and increase their coverage. New candidate queries are over-generated from known queries via phrase similarity data, then filtered against the set of known queries. The accuracy of the generated queries is computed using open-domain questions from standard question answering evaluation sets. Over the ranked lists of questions inferred for each of the evaluation questions, the precision reaches 0.9 at rank 50. The set of inferred queries is more than twice as large as the set of input queries.
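The over-generate-then-filter idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the function names (generate_candidates, infer_queries, precision_at_k), the toy similarity table, and the choice to substitute single whitespace-separated tokens. The abstract does not say how phrases are segmented, where the similarity data comes from, or how candidates are ranked.

```python
from itertools import product


def generate_candidates(query, similar_phrases):
    """Over-generate variants of a query by swapping tokens for similar phrases.

    Assumption: phrases are single whitespace-separated tokens, and
    `similar_phrases` maps a token to a list of substitutes.
    """
    tokens = query.split()
    # Each position may keep its original token or take any listed substitute.
    options = [[tok] + list(similar_phrases.get(tok, [])) for tok in tokens]
    return {" ".join(combo) for combo in product(*options)}


def infer_queries(known_queries, similar_phrases):
    """Return generated candidates that are not already in the known set."""
    known = set(known_queries)
    inferred = set()
    for query in known:
        # Filtering step: discard anything already submitted.
        inferred |= generate_candidates(query, similar_phrases) - known
    return inferred


def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are judged relevant."""
    top = ranked[:k]
    if not top:
        return 0.0
    return sum(1 for item in top if item in relevant) / len(top)


if __name__ == "__main__":
    # Toy data, hypothetical and for illustration only.
    known = ["who invented dynamite", "who discovered penicillin"]
    similar = {"invented": ["discovered", "created"], "dynamite": ["penicillin"]}
    for query in sorted(infer_queries(known, similar)):
        print(query)
```

In this sketch the filter is a plain set difference, so a candidate survives only if no user has already submitted it verbatim. The precision helper mirrors the kind of rank-based measurement the abstract reports (precision at rank 50), though the relevance judgments would have to come from an evaluation set such as the question answering data the abstract mentions.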