Evaluation for operational IR applications: generalizability and automation

  • Authors:
  • Melanie Imhof;Martin Braschler;Preben Hansen;Stefan Rietberger

  • Affiliations:
  • Zurich University of Applied Science, Winterthur, Switzerland;Zurich University of Applied Science, Winterthur, Switzerland;Stockholm University, Stockholm, Sweden;Zurich University of Applied Science, Winterthur, Switzerland

  • Venue:
  • Proceedings of the 2013 workshop on Living labs for information retrieval evaluation
  • Year:
  • 2013

Abstract

Black box evaluation of information retrieval (IR) applications allows practitioners to measure the quality of their IR application. Instead of evaluating specific components, such as the search engine alone, the methodology evaluates the complete IR application, including the user's perspective. It is designed to be applicable to operational IR applications. The black box evaluation methodology could be packaged into an evaluation and monitoring tool, making it usable for industry stakeholders. Such a tool should lead practitioners through the evaluation process and maintain the results of both manual and automatic tests. This paper shows that the methodology is generalizable despite the high diversity of IR applications. The challenges in automating tests are the simulation of tasks that require intellectual effort and the handling of different visualizations of the same concept.