Quantifying query ambiguity

  • Authors: Steve Cronen-Townsend; W. Bruce Croft

  • Affiliations: University of Massachusetts, Amherst, MA; University of Massachusetts, Amherst, MA

  • Venue: HLT '02 Proceedings of the second international conference on Human Language Technology Research
  • Year: 2002

Abstract

We develop a measure of a query with respect to a collection of documents with the aim of quantifying the query's ambiguity with respect to those documents. This measure, the clarity score, is the relative entropy between a query language model and the corresponding collection language model. We substantiate that the clarity score measures the coherence and specificity of the language used in documents likely to satisfy the query. We also argue that it provides a suitable quantification of the (lack of) ambiguity of a query with respect to a collection of documents and has potential applications throughout the field of information retrieval. In particular, the clarity score is shown to correlate positively with average precision in evaluations using TREC test collections. Hence, as one example, the clarity score could serve as a predictor of query performance. Systems would then be able to identify vague information requests and respond differently than they would to clear and specific requests.
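
As an illustration of the definition above, the following is a minimal sketch of a relative-entropy computation between two unigram distributions. It rests on assumptions the abstract does not state: both language models are passed in as plain dictionaries mapping terms to probabilities, the logarithm is base 2 (so the score is in bits), and the collection model gives nonzero probability to every term the query model uses. The function name `clarity_score` and the toy vocabulary are hypothetical, and the sketch does not cover how the query language model would be estimated.

```python
import math


def clarity_score(query_model, collection_model):
    """Relative entropy (KL divergence), in bits, of a query language
    model from a collection language model.

    Both arguments are hypothetical inputs: dicts mapping each term to
    its probability. Terms with zero query probability contribute
    nothing to the sum.
    """
    score = 0.0
    for term, p_query in query_model.items():
        if p_query == 0.0:
            continue
        # Assumes the collection model is nonzero for this term.
        p_collection = collection_model[term]
        score += p_query * math.log2(p_query / p_collection)
    return score


# Toy data, invented for illustration only.
collection = {"bank": 0.25, "river": 0.25, "loan": 0.25, "water": 0.25}

# A query model identical to the collection model: no distinguishing signal.
vague = dict(collection)

# A query model concentrated on two related terms.
specific = {"bank": 0.5, "loan": 0.5, "river": 0.0, "water": 0.0}

vague_score = clarity_score(vague, collection)
specific_score = clarity_score(specific, collection)
```

In this toy setting the query model that matches the collection model scores zero, while the concentrated one scores higher, which matches the abstract's reading of the score as a measure of the (lack of) ambiguity.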