Evaluation forums such as TREC allow systematic measurement and comparison of information retrieval techniques. The goal is consistent improvement, based on reliable comparison of the effectiveness of different approaches and systems. In this paper we report experiments to determine whether this goal has been achieved. We ran five publicly available search systems, in a total of seventeen different configurations, against nine TREC adhoc-style collections, spanning 1994 to 2005. These runsets were then used as a benchmark for reassessing the relative effectiveness of the original TREC runs for those collections. Surprisingly, there appears to have been no overall improvement in effectiveness for either median or top-end TREC submissions, even after allowing for several possible confounds. We therefore question whether the effectiveness of adhoc information retrieval has improved over the past decade and a half.
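To make the comparison methodology concrete, the following is a minimal sketch (not the authors' code) of how a benchmark run can be scored against the original TREC submissions for one collection: compute mean average precision for each run from standard TREC-format qrels and run files, then see where the benchmark falls relative to the original distribution. The file paths and run-file names are hypothetical placeholders.

    # Sketch: rank a benchmark run's MAP against original TREC submissions.
    # Assumes TREC formats: qrels lines "qid 0 docid rel",
    # run lines "qid Q0 docid rank score tag". Paths are hypothetical.
    from collections import defaultdict
    from statistics import median

    def load_qrels(path):
        relevant = defaultdict(set)
        with open(path) as f:
            for line in f:
                qid, _, docid, judgment = line.split()
                if int(judgment) > 0:
                    relevant[qid].add(docid)
        return relevant

    def load_run(path):
        run = defaultdict(list)
        with open(path) as f:
            for line in f:
                qid, _, docid, _, score, _ = line.split()
                run[qid].append((float(score), docid))
        # Order each query's documents by descending retrieval score.
        return {qid: [d for _, d in sorted(docs, reverse=True)]
                for qid, docs in run.items()}

    def mean_average_precision(run, qrels):
        ap_values = []
        for qid, rel in qrels.items():
            hits, precision_sum = 0, 0.0
            for rank, docid in enumerate(run.get(qid, []), start=1):
                if docid in rel:
                    hits += 1
                    precision_sum += hits / rank
            ap_values.append(precision_sum / len(rel) if rel else 0.0)
        return sum(ap_values) / len(ap_values)

    if __name__ == "__main__":
        qrels = load_qrels("qrels.collection.txt")               # hypothetical path
        trec_maps = [mean_average_precision(load_run(p), qrels)  # original submissions
                     for p in ["input.runA", "input.runB", "input.runC"]]
        bench_map = mean_average_precision(load_run("benchmark.run"), qrels)
        beaten = sum(m < bench_map for m in trec_maps)
        print(f"benchmark MAP={bench_map:.4f}, "
              f"TREC median={median(trec_maps):.4f}, "
              f"beats {beaten}/{len(trec_maps)} original runs")

Repeating this per collection gives the kind of median and top-end comparison the abstract describes; cross-collection aggregation would additionally require some form of score standardization.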