Research methodology in studies of assessor effort for information retrieval evaluation

  • Authors:
  • Ben Carterette; James Allan

  • Affiliations:
  • University of Massachusetts Amherst, Amherst, MA; University of Massachusetts Amherst, Amherst, MA

  • Venue:
  • Large Scale Semantic Access to Content (Text, Image, Video, and Sound)
  • Year:
  • 2007

Abstract

As evaluation is an important but difficult part of information retrieval system design and experimentation, evaluation questions have been the subject of much research. An "evaluation study" is an investigation into some aspect of evaluation. Such studies typically experiment on ranked results from actual retrieval systems, most often those submitted to TREC tracks. We argue that the standard of evidence in these studies should be raised to the level required of text retrieval studies: testing on multiple data sets and multiple subsets of the data, and comparing to baselines using hypothesis testing. We demonstrate that baseline performance on the standard data sets is quite high, necessitating strong evidence to support claims.
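To make the recommended protocol concrete, the following is a minimal sketch (not taken from the paper) of comparing a proposed evaluation method against a baseline across multiple random topic subsets, with a paired significance test on per-topic scores. The score values and the subset sizes here are hypothetical placeholders for whatever measure an evaluation study actually reports.

```python
"""Sketch: compare a proposed method to a baseline on per-topic scores,
on the full topic set and on repeated random subsets, using a paired t-test.
All data below are synthetic; real studies would plug in measured scores."""

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)


def paired_comparison(method_scores, baseline_scores):
    """Paired t-test on per-topic scores; returns (mean difference, p-value)."""
    t, p = stats.ttest_rel(method_scores, baseline_scores)
    return float(np.mean(np.asarray(method_scores) - np.asarray(baseline_scores))), p


def subset_trials(method_scores, baseline_scores, n_trials=100, subset_frac=0.5):
    """Repeat the comparison on random topic subsets to check robustness."""
    method_scores = np.asarray(method_scores)
    baseline_scores = np.asarray(baseline_scores)
    n = len(method_scores)
    k = max(2, int(subset_frac * n))
    results = []
    for _ in range(n_trials):
        idx = rng.choice(n, size=k, replace=False)
        results.append(paired_comparison(method_scores[idx], baseline_scores[idx]))
    return results


# Toy per-topic scores on one hypothetical 50-topic test collection.
method = rng.normal(0.62, 0.05, size=50)
baseline = rng.normal(0.60, 0.05, size=50)

diff, p = paired_comparison(method, baseline)
print(f"full topic set: mean diff = {diff:.3f}, p = {p:.3f}")

trials = subset_trials(method, baseline)
significant = sum(1 for _, p_sub in trials if p_sub < 0.05)
print(f"significant (p < 0.05) in {significant}/{len(trials)} random subsets")
```

Reporting results over many subsets, rather than a single full-collection comparison, is one way to gauge whether an apparent improvement over a strong baseline is stable or an artifact of one particular data set.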