Relating retrievability, performance and length

Authors:
Colin Wilkie; Leif Azzopardi
Affiliations:
University of Glasgow, Glasgow, Scotland, UK; University of Glasgow, Glasgow, Scotland, UK
Venue:
Proceedings of the 36th international ACM SIGIR conference on Research and development in information retrieval
Year:
2013

Abstract

Retrievability provides a different way to evaluate an Information Retrieval (IR) system as it focuses on how easily documents can be found. It is intrinsically related to retrieval performance because a document needs to be retrieved before it can be judged relevant. In this paper, we undertake an empirical investigation into the relationship between the retrievability of documents, the retrieval bias imposed by a retrieval system, and the retrieval performance, across different amounts of document length normalization. To this end, two standard IR models are used on three TREC test collections to show that there is a useful and practical link between retrievability and performance. Our findings show that minimizing the bias across the document collection leads to good performance (though not the best performance possible). We also show that past a certain amount of document length normalization the retrieval bias increases, and the retrieval performance significantly and rapidly decreases. These findings suggest that the relationship between retrievability and effectiveness may offer a way to automatically tune systems.
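The abstract does not give formulas, so the following is only a minimal sketch of one common way to make "retrievability" and "retrieval bias" concrete. It assumes a cumulative retrievability score (how many queries return a document within a rank cutoff) and uses the Gini coefficient over those scores as the bias measure. The function names, the toy rankings, and the cutoff value are all hypothetical and are not taken from the paper.

```python
def retrievability(rankings, doc_ids, cutoff):
    """Count, for each document, the queries that rank it within `cutoff`.

    rankings: dict mapping a query string to its ranked list of doc ids.
    doc_ids:  every document in the collection, so that documents that
              are never retrieved still receive a score of 0.
    """
    scores = {doc_id: 0 for doc_id in doc_ids}
    for ranked_docs in rankings.values():
        for doc_id in ranked_docs[:cutoff]:
            scores[doc_id] += 1
    return scores


def gini(values):
    """Gini coefficient of non-negative values.

    0 means every document is equally retrievable; values closer to 1
    mean retrievability is concentrated in a few documents.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    weighted = sum((2 * i - n - 1) * x for i, x in enumerate(xs, start=1))
    return weighted / (n * total)


# Toy data (hypothetical): three queries over a four-document collection.
rankings = {
    "q1": ["d1", "d2", "d3"],
    "q2": ["d1", "d3", "d2"],
    "q3": ["d1", "d2", "d4"],
}
scores = retrievability(rankings, ["d1", "d2", "d3", "d4"], cutoff=2)
bias = gini(scores.values())
```

Under these assumptions, repeating the computation for each length normalization setting of a retrieval model and comparing the resulting bias values against a performance measure would give the kind of comparison the abstract describes.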