Evaluation by comparing result sets in context

  • Authors:
  • Paul Thomas; David Hawking

  • Affiliations:
  • Australian National University, Canberra, Australia; CSIRO ICT Centre, Canberra, Australia

  • Venue:
  • CIKM '06: Proceedings of the 15th ACM International Conference on Information and Knowledge Management

  • Year:
  • 2006

Abstract

Familiar evaluation methodologies for information retrieval (IR) are not well suited to the task of comparing systems in many real settings. These systems, and the methods used to evaluate them, must support contextual, interactive retrieval over changing, heterogeneous data collections, including private and confidential information.

We have implemented a comparison tool which can be inserted into the natural IR process. It provides a familiar search interface, presents a small number of result sets in side-by-side panels, elicits searcher judgments, and logs interaction events. The tool permits study of real information needs as they occur, uses the documents actually available at the time of the search, and records judgments that take into account the instantaneous needs of the searcher.

We have validated our proposed evaluation approach and explored potential biases by comparing different whole-of-Web search facilities using a Web-based version of the tool. In four experiments, one with supplied queries in the laboratory and three with real queries in the workplace, subjects showed no discernible left-right bias and were able to reliably distinguish between high- and low-quality result sets. We found that judgments were strongly predicted by simple implicit measures.

Following validation, we undertook a case study comparing two leading whole-of-Web search engines. The approach is now being used in several ongoing investigations.
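To make the methodology concrete, the sketch below illustrates the core mechanics the abstract describes: randomizing which system's result set appears in the left or right panel (so any positional preference averages out across sessions), and logging timestamped interaction events alongside the searcher's side-by-side judgment. This is a minimal illustrative sketch, not the authors' implementation; all names (`assign_panels`, `log_event`, the JSON log format, and the session identifiers) are hypothetical.

```python
import json
import random
import time

def assign_panels(results_a, results_b):
    """Randomly assign each system's result set to the left or right
    panel, so left-right bias can be measured and averaged out."""
    if random.random() < 0.5:
        return {"left": ("A", results_a), "right": ("B", results_b)}
    return {"left": ("B", results_b), "right": ("A", results_a)}

def log_event(log_file, session_id, event, detail=None):
    """Append one timestamped interaction event (assignment, click,
    judgment) as a JSON line."""
    record = {"session": session_id, "time": time.time(),
              "event": event, "detail": detail}
    log_file.write(json.dumps(record) + "\n")

# Example session: assign panels, then log a click and a final judgment.
if __name__ == "__main__":
    panels = assign_panels(["doc1", "doc2"], ["doc3", "doc4"])
    with open("events.log", "a") as log_file:
        log_event(log_file, "s01", "panels_assigned",
                  {side: system for side, (system, _) in panels.items()})
        log_event(log_file, "s01", "click", {"side": "left", "rank": 1})
        # The judgment is stored with the underlying system, not just the
        # panel side, so preferences can be analyzed independently of
        # screen position.
        preferred_side = "left"
        log_event(log_file, "s01", "judgment",
                  {"side": preferred_side,
                   "system": panels[preferred_side][0]})
```

Logging clicks and other implicit events in the same stream as explicit judgments is what allows the kind of analysis the abstract reports, where simple implicit measures are tested as predictors of the searcher's stated preference.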