Optimal IR: how far away?

Authors:
Xiangdong An; Xiangji Huang; Nick Cercone
Affiliations:
Department of Computer Science and Engineering, York University, Toronto, ON, Canada; School of Information Technology, York University, Toronto, ON, Canada; Department of Computer Science and Engineering, York University, Toronto, ON, Canada
Venue:
CICLing'10 Proceedings of the 11th international conference on Computational Linguistics and Intelligent Text Processing
Year:
2010

Citing 5
Cited 0

Beyond independent relevance: methods and evaluation metrics for subtopic retrieval
Proceedings of the 26th annual international ACM SIGIR conference on Research and development in information retrieval

Document re-ranking based on automatically acquired key terms in Chinese information retrieval
COLING '04 Proceedings of the 20th international conference on Computational Linguistics

Novelty and topicality in interactive information retrieval
Journal of the American Society for Information Science and Technology

Novelty and diversity in information retrieval evaluation
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

A reranking model for genomics aspect search
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval

Abstract

There is a gap between what a user has in mind and what he or she can obtain from an information retrieval (IR) system through queries. We call an IR system perfect if it always provides users with what they have in mind whenever it is available in the corpus, and optimal if it always presents what it finds in an optimal way. In this paper, we empirically study how far we still are from optimal and perfect IR, based on the runs submitted to the TREC 2007 Genomics track. We assume that perfect IR always achieves a score of 100% under a given evaluation method. Optimal IR is simulated by optimizing the submitted runs with respect to the evaluation methods provided by TREC. From these, we obtain the average performance difference between the submitted runs and the perfect or optimal runs. Given the average annual performance improvement achieved by reranking, as reported in the literature, we estimate how far we are from optimal and perfect IR. The study indicates that we are about 7 and 16 years away from optimal and perfect IR, respectively. These are by no means exact distances, but they give a partial perspective on where we stand on the IR development path. The study also provides the least upper bound on the performance improvement achievable by reranking.
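
The estimation described in the abstract can be illustrated with a small sketch. It is not the authors' procedure: it assumes binary document-level relevance, average precision as the evaluation measure, and a linear model in which the score grows by a fixed absolute amount each year. The function names, the toy run, and the annual gain of 0.05 are hypothetical and chosen only for illustration; they are not figures from the paper.

```python
from typing import List, Sequence


def average_precision(ranked_relevance: Sequence[bool], total_relevant: int) -> float:
    """Average precision of one ranked list with binary relevance labels.

    total_relevant counts every relevant document in the corpus,
    including those the run failed to retrieve.
    """
    if total_relevant == 0:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / total_relevant


def optimal_rerank(ranked_relevance: Sequence[bool]) -> List[bool]:
    """Simulated optimal run: the same retrieved documents, relevant ones first."""
    return sorted(ranked_relevance, reverse=True)


def years_away(current: float, target: float, annual_gain: float) -> float:
    """Years needed to reach target, assuming a constant absolute gain per year."""
    if annual_gain <= 0:
        raise ValueError("annual_gain must be positive")
    return max(0.0, (target - current) / annual_gain)


# Hypothetical toy run: four retrieved documents, two of them relevant,
# and one further relevant document that was never retrieved.
run = [False, True, False, True]
total_relevant = 3

submitted = average_precision(run, total_relevant)
optimal = average_precision(optimal_rerank(run), total_relevant)
perfect = 1.0  # the abstract assumes perfect IR always scores 100%

annual_gain = 0.05  # hypothetical value, not taken from the paper
print(f"submitted={submitted:.3f} optimal={optimal:.3f} perfect={perfect:.3f}")
print(f"years to optimal: {years_away(submitted, optimal, annual_gain):.1f}")
print(f"years to perfect: {years_away(submitted, perfect, annual_gain):.1f}")
```

In this sketch the optimal score stays below the perfect score because reranking can only reorder what was retrieved; it cannot recover the missing relevant document. That is the sense in which the optimal run bounds the improvement achievable by reranking alone.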