Caching for realtime search

Authors:
Edward Bortnikov;Ronny Lempel;Kolman Vornovitsky
Affiliations:
Yahoo! Labs, Haifa;Yahoo! Labs, Haifa;Technion CS, Haifa
Venue:
ECIR'11 Proceedings of the 33rd European conference on Advances in information retrieval
Year:
2011

Citing 10

The cache memory book
Principles of Optimal Page Replacement. Journal of the ACM (JACM)
Predictive caching and prefetching of query results in search engines. WWW '03 Proceedings of the 12th international conference on World Wide Web
Boosting the performance of Web search engines: Caching and prefetching query results by exploiting historical usage data. ACM Transactions on Information Systems (TOIS)
Improved techniques for result caching in web search engines. Proceedings of the 18th international conference on World wide web
Towards recency ranking in web search. Proceedings of the third ACM international conference on Web search and data mining
A refreshing perspective of search engine caching. Proceedings of the 19th international conference on World wide web
Modern Information Retrieval
Caching search engine results over incremental indices. Proceedings of the 33rd international ACM SIGIR conference on Research and development in information retrieval
On caching search engine query results. Computer Communications

Cited 5

Adaptive time-to-live strategies for query result caching in web search engines. ECIR'12 Proceedings of the 34th European conference on Advances in Information Retrieval
Prefetching query results and its impact on search engines. SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Online result cache invalidation for real-time web search. SIGIR '12 Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
Strategies for setting time-to-live values in result caches. Proceedings of the 22nd ACM international conference on Conference on information & knowledge management
Second Chance: A Hybrid Approach for Dynamic Result Caching and Prefetching in Search Engines. ACM Transactions on the Web (TWEB)

Abstract

Modern search engines feature real-time indices, which incorporate changes to content within seconds. Search engines also cache search results to reduce user latency and back-end load. Without careful real-time management of these results caches, an engine might return stale search results to users despite the effort invested in keeping the underlying index up to date. A recent paper proposed an architectural component called CIP - the cache invalidation predictor. CIPs invalidate supposedly stale cache entries upon index modifications. The initial evaluation showed that CIPs can keep the performance benefits of caching without sacrificing much of the freshness of the search results returned to users. However, that evaluation was conducted on a synthetic workload in a simplified setting, using many assumptions. We propose new CIP heuristics and evaluate them in an authentic environment - on the real evolving corpus and query stream of a large commercial news search engine. Our CIPs operate in conjunction with realistic cache settings, and we use standard metrics for evaluating cache performance. We show that a classical cache replacement policy, LRU, completely fails to guarantee freshness over time, whereas our CIPs serve 97% of the queries with fresh results. Our policies have a negligible impact on the baseline's cache hit rate, in contrast with traditional age-based invalidation, which must severely reduce cache performance in order to achieve the same freshness. We demonstrate that the computational overhead of our algorithms is minor, and that they even allow reducing the cache's memory footprint.
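
To make the three mechanisms named in the abstract concrete, the following is a minimal sketch, not the paper's implementation. It shows an LRU results cache, optional age-based (time-to-live) expiry, and a predictor that invalidates entries when the index changes. The names ResultsCache and term_overlap_cip, and the rule "invalidate a cached query if all of its terms occur in the modified document", are hypothetical choices made for illustration only; the abstract does not specify the actual CIP heuristics.

```python
from collections import OrderedDict


class ResultsCache:
    """LRU cache of query results with optional age-based (TTL) expiry."""

    def __init__(self, capacity, ttl=None):
        self.capacity = capacity
        self.ttl = ttl                  # None disables age-based invalidation
        self.entries = OrderedDict()    # query -> (results, insert_time)

    def get(self, query, now):
        entry = self.entries.get(query)
        if entry is None:
            return None
        results, inserted = entry
        if self.ttl is not None and now - inserted > self.ttl:
            del self.entries[query]     # too old: treat as a miss
            return None
        self.entries.move_to_end(query)  # mark as most recently used
        return results

    def put(self, query, results, now):
        self.entries[query] = (results, now)
        self.entries.move_to_end(query)
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used

    def invalidate(self, query):
        self.entries.pop(query, None)


def term_overlap_cip(cache, changed_doc_terms):
    """Hypothetical predictor (an assumption, not the paper's heuristic):
    invalidate every cached query whose terms all occur in a document
    that was just added to or modified in the index."""
    doc_terms = set(changed_doc_terms)
    stale = [q for q in cache.entries if set(q.split()) <= doc_terms]
    for q in stale:
        cache.invalidate(q)
    return stale
```

In this sketch, plain LRU (ttl=None, no predictor) never removes an entry because its results changed, which mirrors the freshness problem the abstract attributes to LRU. A small ttl restores freshness by discarding entries regardless of whether they are stale, at the cost of hit rate. The predictor removes only the entries it guesses were affected by an index update.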