Modeling term associations for ad-hoc retrieval performance within language modeling framework

  • Authors:
  • Xing Wei; W. Bruce Croft

  • Affiliations:
  • Center for Intelligent Information Retrieval, University of Massachusetts Amherst, Amherst, MA;Center for Intelligent Information Retrieval, University of Massachusetts Amherst, Amherst, MA

  • Venue:
  • ECIR'07 Proceedings of the 29th European conference on IR research
  • Year:
  • 2007

Abstract

Previous research has shown that using term associations can improve the effectiveness of information retrieval (IR) systems. However, most existing approaches focus on query reformulation; document reformulation has only recently begun to be studied. In this paper, we study how to use term association measures for document modeling, and which types of measures are effective in document language models. We propose a probabilistic term association measure, compare it with traditional methods, such as the similarity coefficient and window-based methods, in the language modeling (LM) framework, and show that significant improvements over query likelihood (QL) retrieval can be obtained. We also compare the method with state-of-the-art document modeling techniques based on latent mixture models.
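
To make the idea of association-based document modeling concrete, the following is a minimal sketch, not the measure proposed in the paper. It assumes a simple document-level co-occurrence estimate of P(w | t) as a stand-in association measure, and it assumes linear interpolation (weights `beta` and `lam`, both hypothetical) to combine the maximum-likelihood document model, the association-expanded model, and the collection model before scoring by query likelihood. All function names are illustrative.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_associations(docs):
    """Estimate P(w | t) from document-level co-occurrence counts.

    This is an assumed stand-in measure, not the paper's proposal.
    """
    co = defaultdict(Counter)
    for doc in docs:
        terms = set(doc)
        for t in terms:
            for w in terms:
                co[t][w] += 1
    assoc = {}
    for t, row in co.items():
        total = sum(row.values())
        assoc[t] = {w: c / total for w, c in row.items()}
    return assoc


def expanded_doc_model(doc, assoc):
    """P_assoc(w | d) = sum over t in d of P(w | t) * P_ml(t | d)."""
    tf = Counter(doc)
    n = len(doc)  # assumes a non-empty document
    model = Counter()
    for t, c in tf.items():
        for w, p in assoc.get(t, {}).items():
            model[w] += p * c / n
    return model


def log_query_likelihood(query, doc, docs, assoc, beta=0.3, lam=0.2):
    """Log query likelihood under an association-expanded document model.

    beta mixes the ML document model with the association model;
    lam smooths the result with the collection model (Jelinek-Mercer style).
    Setting beta=0 reduces this to a plain smoothed query-likelihood score.
    """
    coll = Counter(w for d in docs for w in d)
    coll_len = sum(coll.values())
    tf = Counter(doc)
    assoc_model = expanded_doc_model(doc, assoc)
    score = 0.0
    for q in query:
        if coll[q] == 0:
            continue  # skip query terms unseen in the collection
        p_doc = (1 - beta) * tf[q] / len(doc) + beta * assoc_model[q]
        p = (1 - lam) * p_doc + lam * coll[q] / coll_len
        score += math.log(p)
    return score


docs = [
    ["car", "engine", "repair"],
    ["automobile", "engine", "repair"],
    ["banana", "fruit", "recipe"],
]
assoc = cooccurrence_associations(docs)
query = ["automobile"]
scores = [log_query_likelihood(query, d, docs, assoc) for d in docs]
baseline = [log_query_likelihood(query, d, docs, assoc, beta=0.0) for d in docs]
```

In this toy collection, the first document never mentions "automobile", so the baseline (`beta=0.0`) cannot distinguish it from the unrelated third document. With the association model switched on, the first document receives some probability mass for "automobile" through its shared terms "engine" and "repair", and so ranks above the third.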