On improving pseudo-relevance feedback using pseudo-irrelevant documents

  • Authors:
  • Karthik Raman;Raghavendra Udupa;Pushpak Bhattacharya;Abhijit Bhole

  • Affiliations:
  • Indian Institute of Technology Bombay, Mumbai;Microsoft Research India, Bangalore;Indian Institute of Technology Bombay, Mumbai;Microsoft Research India, Bangalore

  • Venue:
  • ECIR'2010 Proceedings of the 32nd European conference on Advances in Information Retrieval
  • Year:
  • 2010

Quantified Score

Hi-index 0.01

Abstract

Pseudo-Relevance Feedback (PRF) assumes that the top-ranking n documents of the initial retrieval are relevant and extracts expansion terms from them. In this work, we introduce the notion of pseudo-irrelevant documents, i.e., high-scoring documents outside the top n that are highly unlikely to be relevant. We show how pseudo-irrelevant documents can be used to extract better expansion terms from the top-ranking n documents: good expansion terms are those that discriminate the top-ranking n documents from the pseudo-irrelevant documents. Our approach gives substantial improvements in retrieval performance over Model-based Feedback on several test collections.
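
The idea of choosing expansion terms that discriminate the top-ranking n documents from the pseudo-irrelevant ones can be illustrated with a minimal sketch. The abstract does not specify how terms are scored or how pseudo-irrelevant documents are identified, so everything here is an assumption made for illustration: the function name `discriminative_terms`, the smoothed log-odds score over document frequencies, and the toy documents are hypothetical and are not the authors' method. Both document sets are taken as given inputs, each document being a list of tokens.

```python
import math
from collections import Counter


def discriminative_terms(pseudo_relevant, pseudo_irrelevant, k=5):
    """Return the k terms that best separate the two document sets.

    Assumption: terms are scored by a smoothed log-odds ratio of their
    document frequencies. This is an illustrative stand-in, not the
    scoring used in the paper.
    """
    # Document frequency of each term in each set (one count per document).
    rel_df = Counter(t for doc in pseudo_relevant for t in set(doc))
    irr_df = Counter(t for doc in pseudo_irrelevant for t in set(doc))
    n_rel, n_irr = len(pseudo_relevant), len(pseudo_irrelevant)

    scores = {}
    for term, df in rel_df.items():
        # Add-0.5 smoothing keeps both probabilities strictly inside (0, 1).
        p_rel = (df + 0.5) / (n_rel + 1.0)
        p_irr = (irr_df[term] + 0.5) / (n_irr + 1.0)
        scores[term] = (math.log(p_rel / (1 - p_rel))
                        - math.log(p_irr / (1 - p_irr)))

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda t: (-scores[t], t))
    return ranked[:k]


# Toy example for the ambiguous query "jaguar" (hypothetical data).
top_docs = [
    ["jaguar", "car", "engine", "speed"],
    ["jaguar", "car", "dealer", "price"],
    ["jaguar", "engine", "car", "review"],
]
irrelevant_docs = [
    ["jaguar", "cat", "jungle", "speed"],
    ["jaguar", "cat", "habitat", "prey"],
]
expansion = discriminative_terms(top_docs, irrelevant_docs, k=2)
```

In this toy example, "jaguar" is frequent in both sets and therefore scores low, while "car" and "engine" occur only in the top-ranking documents and are selected. A plain frequency-based selection over the top documents alone would have ranked "jaguar" at the top.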