One way to reduce the computational load in a distributed IR system is to use document partitioning and to perform collection selection. With suitable training and/or modeling, the collection selection function can choose the most promising collections for each query with high confidence. Unfortunately, when the collections need to be updated, we must retrain the selection function, update its statistics, or accept some loss of result quality. This paper introduces a simple but very effective technique for adding new documents to collections in a system that uses collection selection. We show that the individual collections can be updated while guaranteeing the same selection performance, with no need to update or retrain the selection function.