Creating a test collection: relevance judgements of cited & non-cited papers

  • Authors: Anna Ritchie; Stephen Robertson; Simone Teufel

  • Affiliations: University of Cambridge, Cambridge; Microsoft Research Ltd, Cambridge; University of Cambridge, Cambridge

  • Venue: Large Scale Semantic Access to Content (Text, Image, Video, and Sound)
  • Year: 2007

Abstract

We investigate how different sources of relevant documents affect the creation of a test collection in the scientific domain. Following the Cranfield 2 design, paper authors are first asked to judge the relevance of the papers they cite. In a second stage, documents outside the reference list are judged. In this paper, we run standard IR engines on the test collection to compare the information contained in the first-stage and second-stage judgements. Using several correlation studies, we find that the judgements of the cited papers do not predict those of the non-cited papers, which means that combining the two sources results in a higher-quality collection.
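To make the idea of comparing the two judgement stages concrete, the sketch below shows one common form of correlation study in IR evaluation: score several systems by mean average precision (MAP) under each judgement set, then measure how well the two sets agree on the system ordering with Kendall's tau. The abstract does not say which measures or engines were used, so the choice of MAP and Kendall's tau, the function names, and the toy data (`runs`, `qrels_cited`, `qrels_noncited`) are all illustrative assumptions rather than the authors' method.

```python
from itertools import combinations


def average_precision(ranking, relevant):
    """Average precision of one ranked list against a set of relevant doc ids."""
    if not relevant:
        return 0.0
    hits, total = 0, 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(run, qrels):
    """MAP of one system; `run` maps query id -> ranking, `qrels` maps query id -> relevant set."""
    return sum(average_precision(run[q], qrels[q]) for q in qrels) / len(qrels)


def kendall_tau(xs, ys):
    """Kendall's tau-a: (concordant - discordant) pairs over all pairs."""
    concordant = discordant = 0
    for i, j in combinations(range(len(xs)), 2):
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
    n_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / n_pairs


# Hypothetical toy data: one query, three systems, two judgement sources.
qrels_cited = {"q1": {"d1", "d2"}}      # stage 1: cited papers judged relevant
qrels_noncited = {"q1": {"d7"}}         # stage 2: non-cited papers judged relevant
runs = {
    "sysA": {"q1": ["d1", "d2", "d7"]},
    "sysB": {"q1": ["d7", "d1", "d2"]},
    "sysC": {"q1": ["d1", "d7", "d2"]},
}

systems = sorted(runs)
map_cited = [mean_average_precision(runs[s], qrels_cited) for s in systems]
map_noncited = [mean_average_precision(runs[s], qrels_noncited) for s in systems]
tau = kendall_tau(map_cited, map_noncited)
print(f"Kendall's tau between the two system orderings: {tau:.2f}")
```

A tau near 1 would indicate that the two judgement sets order the systems in much the same way. A low or negative tau, as this toy data is built to produce, would indicate that one set does not predict the other.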