A Web search engine must update its index periodically to incorporate changes to the Web. We argue in this paper that index updates fundamentally impact the design of search engine result caches, a performance-critical component of modern search engines. Index updates lead to the problem of cache invalidation: invalidating cached entries of queries whose results have changed. Naive approaches, such as flushing the entire cache upon every index update, perform poorly and in fact render caching futile when updates are frequent. Solving the invalidation problem efficiently amounts to accurately predicting which queries would produce different results if re-evaluated, given the actual changes to the index. To this end, we propose a framework for developing invalidation predictors and define metrics to evaluate invalidation schemes. We describe concrete predictors built with this framework and compare them against a baseline cache invalidation scheme based on time-to-live (TTL). Evaluation over Wikipedia documents using a query log from the Yahoo! search engine shows that selective invalidation of cached search results can lower the number of unnecessary query evaluations by as much as 30% compared to the baseline scheme, while returning results of similar freshness. In general, our predictors yield fewer unnecessary invalidations and fewer stale results than a TTL-only scheme at similar freshness of results.
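The contrast between TTL-only expiry and selective invalidation can be illustrated with a minimal sketch. It is not the predictors described in the paper: the class names, the `invalidate_for_update` method, and the matching rule (flag a cached query if the updated document already appears in its results, or if the document contains every query term) are all assumptions chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class CacheEntry:
    results: list      # cached top-k document ids
    cached_at: float   # time at which the results were computed


@dataclass
class ResultCache:
    ttl: float                                   # TTL baseline: maximum entry age
    entries: dict = field(default_factory=dict)  # query string -> CacheEntry

    def put(self, query, results, now):
        self.entries[query] = CacheEntry(list(results), now)

    def get(self, query, now):
        """Return cached results, or None on a miss or an expired entry."""
        entry = self.entries.get(query)
        if entry is None:
            return None
        if now - entry.cached_at > self.ttl:
            # TTL expiry: drop the entry regardless of whether it changed.
            del self.entries[query]
            return None
        return entry.results

    def invalidate_for_update(self, doc_id, doc_terms):
        """Hypothetical predictor: drop only entries the update may affect.

        An entry is flagged if the updated document is already among its
        results, or if the document contains all of the query's terms.
        """
        doc_terms = set(doc_terms)
        flagged = [
            query for query, entry in self.entries.items()
            if doc_id in entry.results or set(query.split()) <= doc_terms
        ]
        for query in flagged:
            del self.entries[query]
        return flagged
```

Under this assumed rule, an index update touches only the entries it might plausibly change, while the TTL still bounds the age of everything else. A flagged entry whose results would not in fact change is an unnecessary invalidation, and an unflagged entry whose results would change is a stale result; these are the two kinds of error that the abstract says the predictors reduce.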