Content-based relevance estimation on the web using inter-document similarities

  • Authors:
  • Fiana Raiber;Oren Kurland;Moshe Tennenholtz

  • Affiliations:
  • Technion, Haifa, Israel;Technion, Haifa, Israel;Microsoft Research and Technion, Haifa, Israel

  • Venue:
  • Proceedings of the 21st ACM international conference on Information and knowledge management
  • Year:
  • 2012

Quantified Score

Hi-index 0.00

Visualization

Abstract

In adversarial and noisy search settings as the Web, the document-query surface level similarity can be a highly misleading relevance signal. Thus, devising content-based relevance estimation (ranking) approaches becomes highly challenging. We address this challenge using two methods that utilize inter-document similarities in an initially retrieved list. The first removes documents from the list that exhibit high query similarity, but for which there is insufficient additional support for relevance that is based on inter-document similarities. The method is based on a probabilistic model that decouples document-query similarities from relevance estimation. The second method re-ranks the list by "rewarding" documents that exhibit high similarity both to the query and to other documents in the list. Both methods incorporate, in addition, at the model level, query-independent document quality estimates. Extensive empirical evaluation demonstrates the merits of our methods.