Some simple effective approximations to the 2-Poisson model for probabilistic weighted retrieval
SIGIR '94 Proceedings of the 17th annual international ACM SIGIR conference on Research and development in information retrieval
Self-indexing inverted files for fast text retrieval
ACM Transactions on Information Systems (TOIS)
Vector-space ranking with effective early termination
Proceedings of the 24th annual international ACM SIGIR conference on Research and development in information retrieval
A Vector Space Model for Automatic Indexing
A Vector Space Model for Automatic Indexing
Multi-Tier Architecture for Web Search Engines
LA-WEB '03 Proceedings of the First Conference on Latin American Web Congress
Efficient query evaluation using a two-level retrieval process
CIKM '03 Proceedings of the twelfth international conference on Information and knowledge management
Pruned query evaluation using pre-computed impacts
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
Efficient document retrieval in main memory
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Pruning policies for two-tiered inverted index with correctness guarantee
SIGIR '07 Proceedings of the 30th annual international ACM SIGIR conference on Research and development in information retrieval
Modern Information Retrieval
Faster top-k document retrieval using block-max indexes
Proceedings of the 34th international ACM SIGIR conference on Research and development in Information Retrieval
Optimized top-k processing with global page scores on block-max indexes
Proceedings of the fifth ACM international conference on Web search and data mining
Hi-index | 0.00 |
In this paper we present two new algorithms designed to reduce the overall time required to process top-k queries. These algorithms are based on the document-at-a-time approach and modify the best baseline we found in the literature, Blockmax WAND (BMW), to take advantage of a two-tiered index, in which the first tier is a small index containing only the higher impact entries of each inverted list. This small index is used to pre-process the query before accessing a larger index in the second tier, resulting in considerable speeding up the whole process. The first algorithm we propose, named BMW-CS, achieves higher performance, but may result in small changes in the top results provided in the final ranking. The second algorithm, named BMW-t, preserves the top results and, while slower than BMW-CS, it is faster than BMW. In our experiments, BMW-CS was more than 40 times faster than BMW when computing top 10 results, and, while it does not guarantee preserving the top results, it preserved all ranking results evaluated at this level.