On identifying representative relevant documents

  • Authors:
  • Fiana Raiber;Oren Kurland

  • Affiliations:
  • Technion - Israel Institute of Technology, Haifa, Israel;Technion - Israel Institute of Technology, Haifa, Israel

  • Venue:
  • CIKM '10 Proceedings of the 19th ACM international conference on Information and knowledge management
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

Using relevance feedback can significantly improve the effectiveness of ad hoc (query-based) retrieval. However, retrieval performance can significantly vary with respect to the given set of relevant documents. Our goal is to establish a quantitative analysis of what makes a relevant document a good representative of the relevant-documents set regardless of the retrieval approach employed. That is, we would like to estimate the extent to which a relevant document can effectively help in finding (other) relevant documents using some relevance-feedback method employed over the corpus. We present various representativeness estimates; some of which treat documents independently and some utilize inter-document similarities. Empirical evaluation shows that relevant documents that are centrally located within the similarity space of the relevant-documents set tend to be good representatives. In addition, we show that there exist highly representative clusters of similar relevant documents, and devise methods for ranking clusters based on their presumed representativeness. Finally, we study the connection between representativeness and TREC's gradual relevance judgments.