Caching technologies are widely employed to boost the performance of Web search engines. Motivated by the correlation between terms observed in the query logs of a commercial search engine, we explore a caching scheme based on pairs of terms rather than individual terms (the typical approach used by search engines today). We propose an inverted list caching policy, based on the Least Recently Used (LRU) method, in which the co-occurrence correlation between terms in the query stream is taken into account when deciding which terms to keep in the cache. We consider not only term co-occurrence within the same query but also co-occurrence across separate queries. Experimental results show that the proposed approach improves not only the cache hit ratio but also the overall throughput of the system compared to existing list caching algorithms.
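The abstract's core idea — an LRU list cache that also exploits term co-occurrence — can be sketched as follows. This is a minimal illustrative implementation, not the paper's exact algorithm: the class name, the `fetch_list` callback, and the policy of additionally promoting the most strongly co-occurring cached partner on a hit are all assumptions made for the sketch.

```python
from collections import OrderedDict, Counter
from itertools import combinations

class CooccurrenceLRUCache:
    """Hypothetical sketch of a co-occurrence-aware LRU cache for
    inverted lists: on a hit, the term's most strongly co-occurring
    cached partner is also refreshed, so correlated terms tend to
    stay resident together. Details are illustrative only."""

    def __init__(self, capacity, fetch_list):
        self.capacity = capacity
        self.fetch_list = fetch_list   # loads a posting list on a miss
        self.cache = OrderedDict()     # term -> posting list, in LRU order
        self.pair_counts = Counter()   # (t1, t2) -> co-occurrence count

    def process_query(self, terms):
        # Record pairwise co-occurrence within this query.
        for a, b in combinations(sorted(set(terms)), 2):
            self.pair_counts[(a, b)] += 1
        hits = 0
        for t in terms:
            if t in self.cache:
                hits += 1
                self.cache.move_to_end(t)        # standard LRU promotion
                partner = self._best_partner(t)  # co-occurrence-aware twist
                if partner is not None:
                    self.cache.move_to_end(partner)
            else:
                if len(self.cache) >= self.capacity:
                    self.cache.popitem(last=False)  # evict LRU term
                self.cache[t] = self.fetch_list(t)
        return hits

    def _best_partner(self, term):
        # Cached term with the highest co-occurrence count with `term`.
        best, best_count = None, 0
        for other in self.cache:
            if other == term:
                continue
            c = self.pair_counts.get(tuple(sorted((term, other))), 0)
            if c > best_count:
                best, best_count = other, c
        return best
```

Under this sketch, a term that rarely appears alone but frequently accompanies a popular term is kept warm by its partner's hits, which is one plausible way pair-level correlation can raise the hit ratio relative to plain LRU.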