Improving verbose queries using subset distribution

  • Authors:
  • Xiaobing Xue;Samuel Huston;W. Bruce Croft

  • Affiliations:
  • University of Massachusetts, Amherst, Amherst, MA, USA;University of Massachusetts, Amherst, Amherst, MA, USA;University of Massachusetts, Amherst, Amherst, MA, USA

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Dealing with verbose (or long) queries poses a new challenge for information retrieval. Selecting a subset of the original query (a "sub-query") has been shown to be an effective method for improving these queries. In this paper, the distribution of sub-queries ("subset distribution") is formally modeled within a well-grounded framework. Specifically, sub-query selection is considered as a sequential labeling problem, where each query word in a verbose query is assigned a label of "keep" or "don't keep". A novel Conditional Random Field model is proposed to generate the distribution of sub-queries. This model captures the local and global dependencies between query words and directly optimizes the expected retrieval performance on a training set. The experiments, based on different retrieval models and performance measures, show that the proposed model can generate high-quality sub-query distributions and can significantly outperform state-of-the-art techniques.