The good and the bad system: does the test collection predict users' effectiveness?

  • Authors:
  • Azzah Al-Maskari;Mark Sanderson;Paul Clough;Eija Airio

  • Affiliations:
  • University of Sheffield, Sheffield, United Kingdom;University of Sheffield, Sheffield, United Kingdom;University of Sheffield, Sheffield, United Kingdom;University of Tampere, Tampere, Finland

  • Venue:
  • Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
  • Year:
  • 2008

Abstract

Test collections are extensively used in the evaluation of information retrieval systems. Crucial to their use is the degree to which results from them predict user effectiveness. At first, past studies did not substantiate a relationship between system and user effectiveness; more recently, however, correlations have begun to emerge. The results of this paper strengthen and extend those findings. We introduce a novel methodology for investigating the relationship, which shows great success in establishing a significant correlation between system and user effectiveness. It is shown that users behave differently and discern differences between pairs of systems that have a very small absolute difference in test collection effectiveness. Our results strengthen the use of test collections in IR evaluation, confirming that users' effectiveness can be predicted successfully.