If I Had a Million Queries

  • Authors:
  • Ben Carterette;Virgil Pavlu;Evangelos Kanoulas;Javed A. Aslam;James Allan

  • Affiliations:
  • Dept. of Computer and Info. Sciences, University of Delaware, Newark,;College of Computer and Info. Science, Northeastern University, Boston,;College of Computer and Info. Science, Northeastern University, Boston,;College of Computer and Info. Science, Northeastern University, Boston,;Dept. of Computer Science, University of Massachusetts Amherst, Amherst,

  • Venue:
  • ECIR '09 Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval
  • Year:
  • 2009

Abstract

As document collections grow larger, the information needs and relevance judgments in a test collection must be well chosen within a limited budget to give the most reliable and robust evaluation results. In this work, we analyze a sample of queries categorized by length and corpus-appropriateness to determine the right proportion needed to distinguish between systems. We also analyze the appropriate division of labor between developing topics and making relevance judgments, and show that only a small, biased sample of queries with sparse judgments is needed to produce the same results as a much larger sample of queries.