Language model information retrieval with document expansion

  • Authors:
  • Tao Tao;Xuanhui Wang;Qiaozhu Mei;ChengXiang Zhai

  • Affiliations:
  • University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign;University of Illinois at Urbana-Champaign

  • Venue:
  • HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Abstract

Language model information retrieval depends on accurate estimation of document models. In this paper, we propose a document expansion technique to deal with the problem of insufficient sampling of documents. We construct a probabilistic neighborhood for each document, and expand the document with its neighborhood information. The expanded document provides a more accurate estimation of the document model, and thus improves retrieval accuracy. Moreover, since document expansion and pseudo feedback exploit different corpus structures, they can be combined to further improve performance. Experimental results on several different data sets demonstrate the effectiveness of the proposed document expansion method.
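
The abstract does not give formulas, so the following is only a minimal sketch of what neighborhood-based document expansion could look like, not the authors' exact method. It assumes that neighbors are weighted by normalized cosine similarity, that the original counts are interpolated with the neighbors' counts using a hypothetical parameter `alpha`, and that the expanded counts are scored with Dirichlet-smoothed query likelihood using a hypothetical parameter `mu`. All function names are illustrative.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counters)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand(docs, i, alpha=0.5):
    """Pseudo-counts for docs[i] mixed with its similarity-weighted neighbors.

    Assumption: alpha is the weight kept on the original document; the
    remaining 1 - alpha is spread over the other documents in proportion
    to their cosine similarity to docs[i].
    """
    doc = docs[i]
    sims = {j: cosine(doc, other) for j, other in enumerate(docs) if j != i}
    total = sum(sims.values())
    expanded = Counter({w: alpha * c for w, c in doc.items()})
    if total > 0:
        for j, sim in sims.items():
            weight = (1 - alpha) * sim / total
            for w, c in docs[j].items():
                expanded[w] += weight * c
    return expanded


def query_log_likelihood(query, counts, collection, mu=10.0):
    """Dirichlet-smoothed query log-likelihood.

    Assumption: every query term occurs at least once in the collection,
    otherwise the smoothed probability is zero and log() fails.
    """
    doc_len = sum(counts.values())
    coll_len = sum(collection.values())
    score = 0.0
    for w in query:
        p_coll = collection[w] / coll_len
        score += math.log((counts[w] + mu * p_coll) / (doc_len + mu))
    return score


if __name__ == "__main__":
    docs = [
        Counter("car engine repair".split()),
        Counter("car engine automobile".split()),
        Counter("banana fruit".split()),
    ]
    collection = docs[0] + docs[1] + docs[2]
    expanded = expand(docs, 0)
    # The expanded model can match "automobile" even though docs[0] never uses it.
    print(query_log_likelihood(["automobile"], docs[0], collection))
    print(query_log_likelihood(["automobile"], expanded, collection))
```

In this toy setup the short document about car repair borrows mass for "automobile" from its similar neighbor, which illustrates the insufficient-sampling problem the abstract refers to. Pseudo feedback, which the abstract says can be combined with expansion, is not covered by this sketch.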