Parsimonious language models for information retrieval

  • Authors:
  • Djoerd Hiemstra;Stephen Robertson;Hugo Zaragoza

  • Affiliations:
  • University of Twente, Enschede, The Netherlands;Microsoft Research, Cambridge, U.K.;Microsoft Research, Cambridge, U.K.

  • Venue:
  • Proceedings of the 27th annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2004

Abstract

We systematically investigate a new approach to estimating the parameters of language models for information retrieval, called parsimonious language models. Parsimonious language models explicitly address the relation between levels of language models that are typically used for smoothing. As such, they need fewer (non-zero) parameters to describe the data. We apply parsimonious models at three stages of the retrieval process: 1) at indexing time; 2) at search time; 3) at feedback time. Experimental results show that we are able to build models that are significantly smaller than standard models, but that still perform at least as well as the standard approaches.
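The core of the approach is an EM-style re-estimation that mixes the document model with a general background (collection) model and then prunes terms whose probability mass is effectively explained by the background. The sketch below illustrates this idea only in outline; the mixing weight `lam`, the pruning `threshold`, the iteration count, and all function and variable names are assumptions for illustration, not the paper's reference implementation.

```python
from collections import Counter

def parsimonious_model(doc_terms, background, lam=0.9, n_iter=50, threshold=1e-4):
    """Sketch of parsimonious document-model estimation via EM.

    doc_terms:  list of tokens in the document
    background: dict mapping term -> P(t|C), the collection language model
    lam:        weight of the document model in the mixture (assumed value)
    Returns a sparse dict term -> P(t|D) with near-zero entries pruned.
    """
    tf = Counter(doc_terms)
    doc_len = sum(tf.values())
    # initialise with the maximum-likelihood document model
    p_doc = {t: c / doc_len for t, c in tf.items()}

    for _ in range(n_iter):
        # E-step: expected term counts attributed to the document model,
        # the rest being explained by the background model
        e = {}
        for t, count in tf.items():
            p_d = p_doc.get(t, 0.0)
            p_c = background.get(t, 1e-12)
            e[t] = count * (lam * p_d) / (lam * p_d + (1 - lam) * p_c)

        # M-step: renormalise and drop terms whose probability falls
        # below the threshold, yielding a parsimonious (sparse) model
        total = sum(e.values()) or 1.0
        p_doc = {t: v / total for t, v in e.items() if v / total > threshold}

    return p_doc
```

In this sketch the same routine could in principle be applied at indexing time (per document), at search time (per query), or at feedback time (over a set of feedback documents), which is how the abstract describes the three stages of use.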