Proximity-based rocchio's model for pseudo relevance

  • Authors:
  • Jun Miao;Jimmy Xiangji Huang;Zheng Ye

  • Affiliations:
  • York University, Toronto, ON, Canada;York University, Toronto, ON, Canada;York University, Toronto, ON, Canada

  • Venue:
  • SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

Rocchio's relevance feedback model is a classic query expansion method and it has been shown to be effective in boosting information retrieval performance. The selection of expansion terms in this method, however, does not take into account the relationship between the candidate terms and the query terms (e.g., term proximity). Intuitively, the proximity between candidate expansion terms and query terms can be exploited in the process of query expansion, since terms closer to query terms are more likely to be related to the query topic. In this paper, we study how to incorporate proximity information into the Rocchio's model, and propose a proximity-based Rocchio's model, called PRoc, with three variants. In our PRoc models, a new concept (proximity-based term frequency, ptf) is introduced to model the proximity information in the pseudo relevant documents, which is then used in three kinds of proximity measures. Experimental results on TREC collections show that our proposed PRoc models are effective and generally superior to the state-of-the-art relevance feedback models with optimal parameters.A direct comparison with positional relevance model (PRM) on the GOV2 collection also indicates our proposed model is at least competitive to the most recent progress.