Sampling precision to depth 10000 at CLEF 2008

  • Authors:
  • Stephen Tomlinson

  • Affiliations:
  • Open Text Corporation, Ottawa, Ontario, Canada

  • Venue:
  • CLEF'08 Proceedings of the 9th Cross-language evaluation forum conference on Evaluating systems for multilingual and multimodal information access
  • Year:
  • 2008

Abstract

We conducted an experiment to test the completeness of the relevance judgments for the monolingual German, French, English and Persian (Farsi) information retrieval tasks of the Ad Hoc Track of the Cross-Language Evaluation Forum (CLEF) 2008. In the ad hoc retrieval tasks, the system was given 50 natural language queries, and the goal was to find all of the relevant documents (with high precision) in a particular document set. For each language, we submitted a sample of the first 10000 retrieved items to investigate the frequency of relevant items at deeper ranks than the official judging depth (of 60). The results suggest that, on average, the percentage of relevant items assessed was less than 55% for German, French and English and less than 25% for Persian.
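As a rough illustration of how a sample of deeper ranks can yield an estimate of judging completeness, the following sketch scales the relevant items found in each sampled rank range up to the size of that range. It is not the paper's actual estimator: the function name, the stratified layout, and all numbers are hypothetical assumptions chosen only to make the arithmetic concrete.

```python
def estimate_judged_fraction(relevant_in_pool, strata):
    """Estimate the fraction of all relevant items that the official pool covered.

    relevant_in_pool: relevant items found within the official judging depth.
    strata: list of (stratum_size, num_sampled, num_relevant_in_sample) tuples
            describing rank ranges beyond the official depth (hypothetical layout).
    """
    estimated_deep = 0.0
    for stratum_size, num_sampled, num_relevant in strata:
        if num_sampled == 0:
            continue  # nothing sampled here, so this stratum contributes no estimate
        # Scale the sampled relevance rate up to the full stratum.
        estimated_deep += stratum_size * num_relevant / num_sampled

    estimated_total = relevant_in_pool + estimated_deep
    if estimated_total == 0:
        return 1.0  # no relevant items anywhere; the pool is trivially complete
    return relevant_in_pool / estimated_total


# Hypothetical single-topic example (made-up counts, not results from the paper):
# 20 relevant items judged within depth 60, then three sampled rank ranges
# covering ranks 61-100, 101-1000 and 1001-10000.
example_strata = [(40, 10, 2), (900, 9, 1), (9000, 9, 0)]
fraction = estimate_judged_fraction(20, example_strata)
print(f"Estimated fraction of relevant items judged: {fraction:.3f}")
```

With these made-up counts, the sample suggests about 108 additional relevant items below the judging depth, so the pool would hold only a small share of the estimated total. Averaging such per-topic fractions over the 50 topics would give a per-language figure comparable in kind to the percentages reported above.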