Optimal IR: how far away?

  • Authors:
  • Xiangdong An; Xiangji Huang; Nick Cercone

  • Affiliations:
  • Department of Computer Science and Engineering, York University, Toronto, ON, Canada; School of Information Technology, York University, Toronto, ON, Canada; Department of Computer Science and Engineering, York University, Toronto, ON, Canada

  • Venue:
  • CICLing'10: Proceedings of the 11th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year:
  • 2010

Abstract

There is a gap between what a user has in mind and what he or she can obtain from an information retrieval (IR) system through queries. We call an IR system perfect if it always provides users with what they have in mind whenever it is available in the corpus, and optimal if it presents what it finds to users in an optimal way. In this paper, we empirically study how far we still are from optimal and perfect IR, based on the runs submitted to the TREC Genomics track 2007. We assume that perfect IR always achieves a score of 100% under the given evaluation methods. Optimal IR is simulated by runs optimized with respect to the evaluation methods provided by TREC. The average performance difference between the submitted runs and the perfect or optimal runs can then be obtained. Given the average annual performance improvement achieved by reranking, as reported in the literature, we estimate how far we are from optimal and perfect IR. The study indicates that we are about 7 and 16 years away from optimal and perfect IR, respectively. These are by no means exact distances, but they give a partial perspective on where we stand on the IR development path. The study also provides the lowest upper bound on the IR performance improvement achievable by reranking.
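
The estimate described in the abstract can be illustrated with a minimal sketch. The abstract does not give the paper's exact procedure, so everything below is an assumption: the function names, the use of average precision over binary relevance as the evaluation measure, the toy ranking, the annual gain of 0.05, and the linear extrapolation of that gain are all hypothetical and are not taken from the paper or from TREC data.

```python
def average_precision(ranking, relevant):
    """Average precision of a ranked list against a set of relevant doc ids."""
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    # Normalise by all relevant docs, so unretrieved ones lower the score.
    return total / len(relevant) if relevant else 0.0


def optimal_rerank(ranking, relevant):
    """Reorder the same retrieved docs so that relevant ones come first.

    For binary relevance this maximises average precision without
    retrieving anything new (stable sort: False sorts before True).
    """
    return sorted(ranking, key=lambda doc: doc not in relevant)


def years_to_close(current, target, annual_gain):
    """Years to reach target, assuming a constant absolute gain per year."""
    if annual_gain <= 0:
        raise ValueError("annual_gain must be positive")
    return max(target - current, 0.0) / annual_gain


# Hypothetical toy run: four retrieved docs, three relevant docs in the corpus.
submitted = ["d3", "d1", "d7", "d2"]
relevant = {"d1", "d2", "d9"}

ap_submitted = average_precision(submitted, relevant)
ap_optimal = average_precision(optimal_rerank(submitted, relevant), relevant)
ap_perfect = 1.0  # perfect IR scores 100% by assumption

annual_gain = 0.05  # hypothetical yearly improvement, not a figure from the paper
years_to_optimal = years_to_close(ap_submitted, ap_optimal, annual_gain)
years_to_perfect = years_to_close(ap_submitted, ap_perfect, annual_gain)
```

In this sketch the optimal run is limited by what the submitted run retrieved (the relevant document `d9` is never found), which is why the optimal score stays below the perfect score and why the gap to the optimal run bounds what reranking alone can gain.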