IR between science and engineering, and the role of experimentation

  • Authors: Norbert Fuhr
  • Affiliations: Department of Computer Science and Applied Cognitive Science, Faculty of Engineering, University of Duisburg-Essen, Duisburg, Germany
  • Venue: CLEF'10 Proceedings of the 2010 international conference on Multilingual and multimodal information access evaluation: cross-language evaluation forum
  • Year: 2010

Abstract

Evaluation has always played a major role in IR research, as a means of judging the quality of competing models. Lately, however, we have seen an overemphasis on experimental results, which favors engineering approaches aimed at tuning performance and neglects other scientific criteria. A recent study investigated the validity of experimental results published at major conferences, showing that for 95% of the papers using standard test collections, the claimed improvements were only relative, and the resulting quality was inferior to that of the top-performing systems [AMWZ09]. This talk argues that IR is still in its scientific infancy. Despite the extensive efforts in evaluation initiatives, the scientific insights gained are still very limited, partly due to shortcomings in the design of the testbeds. From a general scientific standpoint, using test collections only for evaluation is a waste of resources. Instead, experimentation should be used for hypothesis generation and testing in general, in order to build a better understanding of the retrieval process and to develop a broader theoretical foundation for the field.