Commercial Web search engines have to process user queries over huge Web indexes under tight latency constraints. In practice, to achieve low latency, large result caches are employed and a portion of the query traffic is served using previously computed results. Moreover, search engines need to update their indexes frequently to incorporate changes to the Web. After every index update, however, the content of cache entries may become stale, thus decreasing the freshness of served results. In this work, we first argue that the real problem in today's caching for large-scale search engines is not eviction policies, but the ability to cope with changes to the index, i.e., cache freshness. We then introduce a novel algorithm that uses a time-to-live value to set cache entries to expire and selectively refreshes cached results by issuing refresh queries to back-end search clusters. The algorithm prioritizes the entries to refresh according to a heuristic that combines the frequency of access with the age of an entry in the cache. In addition, for setting the rate at which refresh queries are issued, we present a mechanism that takes into account idle cycles of back-end servers. Evaluation using a real workload shows that our algorithm can achieve hit rate improvements as well as reduction in average hit ages. An implementation of this algorithm is currently in production use at Yahoo!.
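The refresh mechanism described above can be illustrated with a minimal sketch. It is not the production algorithm: the abstract states only that the priority heuristic combines access frequency with entry age and that the refresh rate follows back-end idle cycles. The names `RefreshingCache`, `refresh_candidates`, and `refresh_budget`, the product `hits * age` used as the score, and the linear mapping from idle fraction to refresh budget are all assumptions made for illustration.

```python
import heapq
from dataclasses import dataclass


@dataclass
class Entry:
    results: list
    stored_at: float  # time at which the results were last computed
    hits: int = 0     # number of times the entry has been served


class RefreshingCache:
    """Hypothetical result cache with TTL expiry and selective refresh."""

    def __init__(self, ttl):
        self.ttl = ttl
        self.entries = {}  # maps query string -> Entry

    def put(self, query, results, now):
        # Storing fresh results resets the age but keeps the access count.
        old = self.entries.get(query)
        hits = old.hits if old else 0
        self.entries[query] = Entry(results, now, hits)

    def get(self, query, now):
        entry = self.entries.get(query)
        if entry is None or now - entry.stored_at > self.ttl:
            return None  # miss, or entry older than its time-to-live
        entry.hits += 1
        return entry.results

    def refresh_candidates(self, now, budget):
        # Assumed scoring: frequently accessed, old entries come first.
        def score(item):
            _, entry = item
            return entry.hits * (now - entry.stored_at)

        top = heapq.nlargest(budget, self.entries.items(), key=score)
        return [query for query, _ in top]


def refresh_budget(idle_fraction, max_per_tick):
    # Assumed policy: scale refresh queries linearly with back-end idleness.
    clamped = max(0.0, min(1.0, idle_fraction))
    return int(clamped * max_per_tick)


cache = RefreshingCache(ttl=10.0)
cache.put("a", ["r1"], now=0.0)
cache.put("b", ["r2"], now=5.0)
cache.get("a", now=1.0)
cache.get("a", now=2.0)
cache.get("b", now=6.0)
to_refresh = cache.refresh_candidates(now=8.0, budget=refresh_budget(0.1, 10))
```

In this usage, the back end is assumed to be 10% idle, so a single refresh query is allowed, and it goes to the entry that is both older and more frequently accessed. A caller would then re-issue that query to the back end and store the new results with `put`, which resets the entry's age.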