Due to the popularity of Web search engines, a large proportion of real text retrieval queries are now processed over collections measured in tens or hundreds of gigabytes. A new Very Large test Collection (VLC) has been created to support qualification, measurement and comparison of systems operating at this level and to permit the study of the properties of very large collections. The VLC is an extension of the well-known TREC collection and has been distributed under the same conditions. A simple set of efficiency and effectiveness measures has been defined to encourage comparability of reporting. The 20 gigabyte first edition of the VLC and a representative 10% sample have been used in a special interest track of the 1997 Text Retrieval Conference (TREC-6). The unaffordable cost of obtaining complete relevance assessments over collections of this scale is avoided by concentrating on early precision and relying on the core TREC collection to support detailed effectiveness studies. Results obtained by TREC-6 VLC track participants are presented here. All groups observed a significant increase in early precision as collection size increased. Explanatory hypotheses are advanced for future empirical testing. A 100 gigabyte second edition VLC (VLC2) has recently been compiled and distributed for use in TREC-7 in 1998.