Large web search engines now process billions of queries per day. Most of these queries are interactive, requiring a response within a fraction of a second. However, there are also a number of important scenarios in which large batches of queries are submitted for web mining and system optimization tasks that do not require an immediate response. Given the significant cost of executing search queries over billions of web pages, it is natural to ask whether such batches can be executed more efficiently than interactive queries. In this paper, we motivate and discuss the problem of batch query processing in search engines, identify basic mechanisms for improving the performance of such queries, and provide a preliminary experimental evaluation of the proposed techniques. We conclude that significant cost reductions are possible by using specialized mechanisms for executing batch queries in web search engines.