On Collection Size and Retrieval Effectiveness

Authors:
David Hawking; Stephen Robertson
Affiliations:
CSIRO Mathematical and Information Sciences, Canberra, Australia. david.hawking@csiro.au; Microsoft Research, Cambridge, UK. ser@microsoft.com
Venue:
Information Retrieval
Year:
2003

Citing:
0
Cited by:
26

Coverage, relevance, and ranking: The impact of query operators on Web search engine results. ACM Transactions on Information Systems (TOIS)
Recommended reading for IR research students. ACM SIGIR Forum
Extreme value theory applied to document retrieval from large collections. Information Retrieval
The TREC 2005 robust track. ACM SIGIR Forum
Estimating average precision with incomplete and imperfect judgments. CIKM '06 Proceedings of the 15th ACM international conference on Information and knowledge management
A machine learning based approach to evaluating retrieval systems. HLT-NAACL '06 Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics
Efficient query expansion with auxiliary data structures. Information Systems
Result merging methods in distributed information retrieval with overlapping databases. Information Retrieval
A syntactically-based query reformulation technique for information retrieval. Information Processing and Management: an International Journal
Tagging and searching: Search retrieval effectiveness of folksonomies on the World Wide Web. Information Processing and Management: an International Journal
Search effectiveness with a breadth-first crawl. Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval
Estimating average precision when judgments are incomplete. Knowledge and Information Systems
Local search: A guide for the information retrieval practitioner. Information Processing and Management: an International Journal
Measuring the Search Effectiveness of a Breadth-First Crawl. ECIR '09 Proceedings of the 31st European Conference on IR Research on Advances in Information Retrieval
SUSHI: scoring scaled samples for server selection. Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
A signal-to-noise approach to score normalization. Proceedings of the 18th ACM conference on Information and knowledge management
On score distributions and relevance. ECIR'07 Proceedings of the 29th European conference on IR research
The importance of anchor text for ad hoc search revisited. Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
The impact of collection size on relevance and diversity. Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
Modeling score distributions in information retrieval. Information Retrieval
Efficient and effective spam filtering and re-ranking for large web datasets. Information Retrieval
On effectiveness measures and relevance functions in ranking INEX systems. AIRS'05 Proceedings of the Second Asia conference on Asia Information Retrieval Technology
Scalability influence on retrieval models: an experimental methodology. ECIR'05 Proceedings of the 27th European conference on Advances in Information Retrieval Research
On per-topic variance in IR evaluation. SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Modelling Score Distributions Without Actual Scores. Proceedings of the 2013 Conference on the Theory of Information Retrieval
Document Score Distribution Models for Query Performance Inference and Prediction. ACM Transactions on Information Systems (TOIS)

Abstract

The relationship between collection size and retrieval effectiveness is particularly important in the context of Web search. We investigate it first analytically and then experimentally, using samples and subsets of test collections. Different retrieval systems vary in how the score assigned to an individual document in a sample collection relates to the score it receives in the full collection; we identify four cases.

We apply signal detection (SD) theory to retrieval from samples, taking into account the four cases and using a variety of shapes for relevant and irrelevant distributions. We note that the SD model subsumes several earlier hypotheses about the causes of the decreased precision in samples. We also discuss other models which contribute to an understanding of the phenomenon, particularly relating to the effects of discreteness. Different models provide complementary insights.

Extensive use is made of test data, some from official submissions to the TREC-6 VLC track and some new, to illustrate the effects and test hypotheses. We empirically confirm predictions, based on SD theory, that P@n should decline when moving to a sample collection and that average precision and R-precision should remain constant. SD theory suggests the use of recall-fallout plots as operating characteristic (OC) curves. We plot OC curves of this type for a real retrieval system and query set and show that curves for sample collections are similar but not identical to the curve for the full collection.
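
The SD prediction stated in the abstract (P@n falls in a sample while average precision stays roughly constant) can be illustrated with a small simulation. The sketch below is not the authors' experimental setup. It assumes, purely for illustration, that relevant and irrelevant scores follow unit-variance normal distributions with different means, that a document's score is the same in the sample as in the full collection (only one of the four cases the abstract mentions), and that the sample is a uniform 10% draw. All function names and parameter values are hypothetical.

```python
import random


def make_collection(n_rel, n_irr, rng, rel_mean=1.5):
    """Return (score, is_relevant) pairs.

    Assumption: scores are drawn from two unit-variance normals,
    with relevant documents shifted upward by rel_mean.
    """
    docs = [(rng.gauss(rel_mean, 1.0), True) for _ in range(n_rel)]
    docs += [(rng.gauss(0.0, 1.0), False) for _ in range(n_irr)]
    return docs


def rank(docs):
    """Sort documents by descending score."""
    return sorted(docs, key=lambda d: d[0], reverse=True)


def precision_at(docs, n):
    """Fraction of the top-n ranked documents that are relevant (P@n)."""
    return sum(rel for _, rel in rank(docs)[:n]) / n


def average_precision(docs):
    """Mean of the precision values at the rank of each relevant document."""
    hits, total = 0, 0.0
    for position, (_, rel) in enumerate(rank(docs), start=1):
        if rel:
            hits += 1
            total += hits / position
    return total / hits if hits else 0.0


def simulate(trials=40, n_rel=100, n_irr=10_000, fraction=0.1, n=10, seed=0):
    """Average P@n and AP over several simulated queries, for the full
    collection and for a uniform random sample of it."""
    rng = random.Random(seed)
    totals = {"p_full": 0.0, "p_sample": 0.0, "ap_full": 0.0, "ap_sample": 0.0}
    for _ in range(trials):
        full = make_collection(n_rel, n_irr, rng)
        # Assumption: sampled documents keep their full-collection scores.
        sample = rng.sample(full, int(len(full) * fraction))
        totals["p_full"] += precision_at(full, n)
        totals["p_sample"] += precision_at(sample, n)
        totals["ap_full"] += average_precision(full)
        totals["ap_sample"] += average_precision(sample)
    return {name: value / trials for name, value in totals.items()}


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.3f}")
```

Under these assumptions, the top n of the full collection sits further into the tail of the score distributions, where relevant documents are relatively more common, so one would expect `p_sample` to be lower than `p_full`. Uniform sampling leaves the ratio of relevant to irrelevant documents unchanged, so the gap between `ap_full` and `ap_sample` should be much smaller. Other score distributions, or systems whose scores change with collection statistics, may behave differently.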