CRTER: using cross terms to enhance probabilistic information retrieval

  • Authors:
  • Jiashu Zhao;Jimmy Xiangji Huang;Ben He

  • Affiliations:
  • York University, Toronto, ON, Canada;York University, Toronto, ON, Canada;York University, Toronto, ON, Canada

  • Venue:
  • Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Term proximity retrieval rewards a document where the matched query terms occur close to each other. Although term proximity is known to be effective in many Information Retrieval (IR) applications, the within-document distribution of each individual query term and how the query terms associate with each other, are not fully considered. In this paper, we introduce a pseudo term, namely Cross Term, to model term proximity for boosting retrieval performance. An occurrence of a query term is assumed to have an impact towards its neighboring text, which gradually weakens with the increase of the distance to the place of occurrence. We use a shape function to characterize such an impact. A Cross Term occurs when two query terms appear close to each other and their impact shape functions have an intersection. We propose a Cross Term Retrieval (CRTER) model that combines the Cross Terms' information with basic probabilistic weighting models to rank the retrieved documents. Extensive experiments on standard TREC collections illustrate the effectiveness of our proposed CRTER model.