Using parsimonious language models on web data

  • Authors:
  • Rianne Kaptein; Rongmei Li; Djoerd Hiemstra; Jaap Kamps

  • Affiliations:
  • University of Amsterdam, Amsterdam, Netherlands; University of Twente, Enschede, Netherlands; University of Twente, Enschede, Netherlands; University of Amsterdam, Amsterdam, Netherlands

  • Venue:
  • Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2008

Abstract

In this paper we explore the use of parsimonious language models for web retrieval. These models are smaller, and thus more efficient, than standard language models, which makes them well suited to large-scale web retrieval. We conducted experiments on four TREC topic sets and found that the parsimonious language model improves retrieval effectiveness over the standard language model on all data sets and measures. In all cases the improvement is significant, and it is more substantial than in earlier experiments on newspaper/newswire data.