In this paper we introduce the problem of query covering as a means to cache query results efficiently. The general idea is to populate the cache with documents that contribute to the result pages of a large number of queries, rather than caching the top documents of each individual query. The problem turns out to be hard: solving it requires knowledge of the structure of the query and result spaces, as well as of the input query distribution. We formulate the problem in the framework of stochastic optimization; theoretically, it can be seen as a stochastic universal version of set multicover. Although the problem is NP-hard to solve exactly, we show that for any distribution it can be approximated using a simple greedy approach. Our theoretical findings are complemented by experiments on real datasets, which show the feasibility and potential interest of query-covering approaches in practice.
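To make the covering idea concrete, the following is a minimal illustrative sketch of a greedy selection of cache documents. It is not the algorithm analyzed in the paper, only a generic weighted greedy heuristic for a multicover-style objective. The function name `greedy_query_cover`, the parameters `results`, `freq`, `k`, and `budget`, and the rule that a query counts as covered once `k` of its result documents are cached are all assumptions made for this example.

```python
from collections import defaultdict


def greedy_query_cover(results, freq, k, budget):
    """Greedily pick up to `budget` documents to cache (illustrative only).

    results: dict mapping query -> list of result document ids (hypothetical input)
    freq:    dict mapping query -> estimated frequency or probability
    k:       number of cached result documents needed to call a query covered
    budget:  maximum number of documents the cache can hold
    """
    # Residual demand of each query: how many more of its documents are needed.
    need = {q: min(k, len(set(docs))) for q, docs in results.items()}

    # Inverted map: document -> set of queries whose results contain it.
    queries_of = defaultdict(set)
    for q, docs in results.items():
        for d in docs:
            queries_of[d].add(q)

    cache, chosen = [], set()
    while len(cache) < budget:
        best_doc, best_gain = None, 0.0
        # Sorted iteration makes tie-breaking deterministic.
        for d in sorted(queries_of):
            if d in chosen:
                continue
            # Gain: total weight of still-unsatisfied queries this document helps.
            gain = sum(freq.get(q, 0.0) for q in queries_of[d] if need[q] > 0)
            if gain > best_gain:
                best_doc, best_gain = d, gain
        if best_doc is None:  # no remaining document helps any query
            break
        cache.append(best_doc)
        chosen.add(best_doc)
        for q in queries_of[best_doc]:
            if need[q] > 0:
                need[q] -= 1

    covered = [q for q in results if need[q] == 0]
    return cache, covered


# Toy data (made up for illustration).
results = {"q1": ["a", "b"], "q2": ["b", "c"], "q3": ["c", "d"]}
freq = {"q1": 5, "q2": 3, "q3": 1}
print(greedy_query_cover(results, freq, k=1, budget=2))
# (['b', 'c'], ['q1', 'q2', 'q3'])
```

In this toy instance, two cached documents serve all three queries when one document per query suffices, whereas caching the top document of each query separately would need up to three slots. The frequencies in `freq` stand in for the input query distribution that the abstract says must be known.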