Query evaluation: strategies and optimizations
Information Processing and Management: an International Journal
Optimization strategies for complex queries
Proceedings of the 28th annual international ACM SIGIR conference on Research and development in information retrieval
Load balancing for term-distributed parallel retrieval
SIGIR '06 Proceedings of the 29th annual international ACM SIGIR conference on Research and development in information retrieval
A pipelined architecture for distributed text query evaluation
Information Retrieval
Mining query logs to optimize index partitioning in parallel web search engines
Proceedings of the 2nd international conference on Scalable information systems
Inverted index compression and query processing with optimized document ordering
Proceedings of the 18th international conference on World wide web
Information Retrieval: Implementing and Evaluating Search Engines
Information Retrieval: Implementing and Evaluating Search Engines
A combined semi-pipelined query processing architecture for distributed full-text retrieval
WISE'10 Proceedings of the 11th international conference on Web information systems engineering
Efficient compressed inverted index skipping for disjunctive text-queries
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
Space-Limited ranked query evaluation using adaptive pruning
WISE'05 Proceedings of the 6th international conference on Web Information Systems Engineering
Intra-query concurrent pipelined processing for distributed full-text retrieval
ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
A term-based inverted index partitioning model for efficient distributed query processing
ACM Transactions on the Web (TWEB)
Hi-index | 0.00 |
Web search engines need to provide high throughput and short query latency. Recent results show that pipelined query processing over a term-wise partitioned inverted index may have superior throughput. However, the query processing latency and scalability with respect to the collections size are the main challenges associated with this method. In this paper, we evaluate the effect of inverted index skipping on the performance of pipelined query processing. Further, we introduce a novel idea of using Max-Score pruning within pipelined query processing and a new term assignment heuristic, partitioning by Max-Score. Our current results indicate a significant improvement over the state-of-the-art approach and lead to several further optimizations, which include dynamic load balancing, intra-query concurrent processing and a hybrid combination between pipelined and non-pipelined execution.