The results of information retrieval evaluations are often difficult to apply to practical problems. Recent research on the robustness of retrieval systems aims to make such results more applicable in practical environments. This paper analyzes a large number of evaluation experiments from the Cross Language Evaluation Forum (CLEF). Robustness can be interpreted as stressing the importance of difficult topics and is usually measured with the geometric mean of the per-topic results. Our analysis shows that a small decrease in the performance of bilingual and multilingual retrieval is accompanied by a large gap between the geometric mean and the arithmetic mean of the topic scores. Robustness is therefore an especially important issue in the evaluation of cross-language retrieval systems.
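To make the two aggregates concrete, here is a minimal sketch contrasting the arithmetic mean of average precision (MAP) with the geometric mean (GMAP). The per-topic AP scores are invented for illustration; the small epsilon floor added before taking logs follows the TREC robust track convention of 0.00001, which keeps zero-scoring topics from collapsing the geometric mean to zero.

```python
import math

# Hypothetical per-topic average precision (AP) scores for one run.
# Two of the topics are "difficult" and score near zero.
ap_scores = [0.45, 0.38, 0.02, 0.61, 0.004, 0.29]

# MAP: arithmetic mean, dominated by the well-performing topics.
map_score = sum(ap_scores) / len(ap_scores)

# GMAP: geometric mean, computed as exp of the mean of logs.
# EPS is the TREC robust track floor so that AP = 0 is still defined.
EPS = 1e-5
gmap_score = math.exp(
    sum(math.log(ap + EPS) for ap in ap_scores) / len(ap_scores)
)

print(f"MAP:  {map_score:.4f}")   # about 0.29
print(f"GMAP: {gmap_score:.4f}")  # noticeably lower, penalizing hard topics
```

Because the geometric mean multiplies the per-topic scores, a handful of near-zero topics drags GMAP far below MAP; this is exactly the gap the paper observes between the geometric and arithmetic means for bilingual and multilingual runs.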