Evaluation for operational IR applications: generalizability and automation

Authors:
Melanie Imhof;Martin Braschler;Preben Hansen;Stefan Rietberger
Affiliations:
Zurich University of Applied Science, Winterthur, Switzerland;Zurich University of Applied Science, Winterthur, Switzerland;Stockholm University, Stockholm, Sweden;Zurich University of Applied Science, Winterthur, Switzerland
Venue:
Proceedings of the 2013 workshop on Living labs for information retrieval evaluation
Year:
2013

Abstract

Black-box evaluation of information retrieval (IR) applications allows practitioners to measure the quality of their IR application. Instead of evaluating specific components in isolation, e.g. only the search engine, it evaluates the complete IR application, including the user's perspective. The methodology is designed to be applicable to operational IR applications and could be packaged into an evaluation and monitoring tool, making it usable for industry stakeholders. Such a tool should lead practitioners through the evaluation process and maintain the results of both manual and automatic tests. This paper shows that the methodology is generalizable despite the high diversity of IR applications. The challenges in automating tests are simulating tasks that require intellectual effort and handling different visualizations of the same concept.