On bias problem in relevance feedback

  • Authors:
  • Qianli Xing;Yi Zhang;Lanbo Zhang

  • Affiliations:
  • Tsinghua University, Beijing, China;University of California, Santa Cruz, Santa Cruz, CA, USA;University of California, Santa Cruz, Santa Cruz, CA, USA

  • Venue:
  • Proceedings of the 20th ACM international conference on Information and knowledge management
  • Year:
  • 2011

Quantified Score

Hi-index 0.00

Visualization

Abstract

Relevance feedback is an effective approach to improve retrieval quality over the initial query. Typical relevance feedback methods usually select top-ranked documents for relevance judgments, then query expansion or model updating are carried out based on the feedback documents. However, the number of feedback documents is usually limited due to expensive human labeling. Thus relevant documents in the feedback set are hardly representative of all relevant documents and the feedback set is actually biased. As a result, the performance of relevance feedback will get hurt. In this paper, we first show how and where the bias problem exists through experiments. Then we study how the bias can be reduced by utilizing the unlabeled documents. After analyzing the usefulness of a document to relevance feedback, we propose an approach that extends the feedback set with carefully selected unlabeled documents by heuristics. Our experiment results show that the extended feedback set has less bias than the original feedback set and better performance can be achieved when the extended feedback set is used for relevance feedback.