Retrievability provides a different way to evaluate an Information Retrieval (IR) system as it focuses on how easily documents can be found. It is intrinsically related to retrieval performance because a document needs to be retrieved before it can be judged relevant. In this paper, we undertake an empirical investigation into the relationship between the retrievability of documents, the retrieval bias imposed by a retrieval system, and the retrieval performance, across different amounts of document length normalization. To this end, two standard IR models are used on three TREC test collections to show that there is a useful and practical link between retrievability and performance. Our findings show that minimizing the bias across the document collection leads to good performance (though not the best performance possible). We also show that past a certain amount of document length normalization the retrieval bias increases, and the retrieval performance significantly and rapidly decreases. These findings suggest that the relationship between retrievability and effectiveness may offer a way to automatically tune systems.
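
To make the two quantities in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes a cumulative retrievability score (how many queries return a document within a rank cutoff) and summarizes retrieval bias with a Gini coefficient over those scores, a common choice in the retrievability literature that the abstract itself does not name. The function names, the `rankings` input format, and the `cutoff` parameter are all hypothetical.

```python
from collections import Counter


def retrievability(rankings, cutoff):
    """Count, per document, how many queries return it in the top `cutoff`.

    `rankings` maps a query id to a list of document ids in rank order
    (an assumed input format, chosen for illustration).
    """
    scores = Counter()
    for ranked_docs in rankings.values():
        for doc_id in ranked_docs[:cutoff]:
            scores[doc_id] += 1
    return scores


def gini(values):
    """Gini coefficient of non-negative values: 0 means perfectly even."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Standard rank-weighted form over values sorted in ascending order.
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Toy data: two queries over a four-document collection.
rankings = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d1", "d3", "d2"],
}
collection = ["d1", "d2", "d3", "d4"]

r = retrievability(rankings, cutoff=2)
# Documents that are never retrieved must be counted with a score of zero.
bias = gini([r[doc_id] for doc_id in collection])
print(dict(r), bias)
```

Under these assumptions, repeating the computation once per length normalization setting and comparing the resulting bias values against a performance measure would give the kind of bias-versus-effectiveness comparison the abstract describes.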