Discriminative probabilistic models for passage based retrieval

  • Authors: Mengqiu Wang; Luo Si

  • Affiliations: Stanford University, Stanford, CA, USA; Purdue University, West Lafayette, IN, USA

  • Venue: Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
  • Year: 2008

Abstract

Using passage-level evidence for document retrieval has shown mixed results when applied to test beds with different characteristics. One main reason for this inconsistent performance is that there exists no unified framework to model the evidence of individual passages within a document. This paper proposes two probabilistic models that formally model the evidence of a set of top-ranked passages in a document. The first model follows the retrieval criterion that a document is relevant if any passage in the document is relevant, and models each passage independently. The second model goes a step further and incorporates the similarity correlations among the passages. Both models are trained in a discriminative manner. We also present an approach that combines the ranked lists of document retrieval and passage-based retrieval. An extensive set of experiments has been conducted on four different TREC test beds to show the effectiveness of the proposed discriminative probabilistic models for passage-based retrieval. The proposed algorithms are compared with a state-of-the-art document retrieval algorithm and a language model approach for passage-based retrieval. The combined approach is shown to provide better results than both the document retrieval and the passage-based retrieval approaches.
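
The criterion behind the first model ("a document is relevant if any passage in the document is relevant", with passages treated independently) can be illustrated with a noisy-OR combination. The sketch below is only an illustration under stated assumptions: the abstract does not give the paper's actual parameterization, so the logistic form of the passage score, the feature vectors, and the names `passage_relevance` and `document_relevance` are hypothetical and not taken from the paper.

```python
import math
from typing import Sequence


def passage_relevance(features: Sequence[float],
                      weights: Sequence[float],
                      bias: float = 0.0) -> float:
    """Hypothetical discriminative passage scorer: a logistic function of a
    weighted feature sum. The features and weights are assumptions made for
    illustration; the abstract does not specify them."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def document_relevance(passage_probs: Sequence[float]) -> float:
    """Noisy-OR over independent passages: the document is relevant unless
    every one of its top-ranked passages is non-relevant."""
    prob_none_relevant = 1.0
    for p in passage_probs:
        prob_none_relevant *= (1.0 - p)
    return 1.0 - prob_none_relevant


if __name__ == "__main__":
    # Two top-ranked passages, each with two made-up features.
    weights = [1.0, -0.5]
    passages = [[2.0, 1.0], [0.0, 0.0]]
    probs = [passage_relevance(f, weights) for f in passages]
    print(document_relevance(probs))
```

Under this form, adding a passage can only raise the document score, and a single strongly relevant passage is enough to make the document score high. The second model described in the abstract, which accounts for similarity correlations among passages, would not factorize this way.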