Exploiting query term correlation for list caching in web search engines

  • Authors:
  • Jiancong Tong; Gang Wang; Douglas S. Stones; Shizhao Sun; Xiaoguang Liu; Fan Zhang

  • Affiliations:
  • Nankai University, Tianjin, China; Nankai University, Tianjin, China; Monash University, Melbourne, Australia; Nankai University, Tianjin, China; Nankai University, Tianjin, China; Nankai University, Tianjin, China

  • Venue:
  • Proceedings of the 22nd ACM International Conference on Information & Knowledge Management
  • Year:
  • 2013

Abstract

Caching is widely used to improve the performance of Web search engines. Motivated by the correlation between terms observed in query logs from a commercial search engine, we explore a caching scheme based on pairs of terms rather than on individual terms, which is the typical approach in search engines today. We propose an inverted list caching policy, based on the Least Recently Used (LRU) method, that accounts for the co-occurrence correlation between terms in the query stream when deciding which terms to keep in the cache. We consider term co-occurrence not only within the same query but also across separate queries. Experimental results show that, compared with existing list caching algorithms, the proposed approach improves both the cache hit ratio and the overall throughput of the system.
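
The abstract does not give the details of the policy, so the following Python sketch is only an illustration of the general idea and not the authors' algorithm. It assumes a simple rule: when a term's inverted list is accessed, the list of its most frequent co-occurring partner (if already cached) is also marked as recently used, so that correlated lists tend to stay in the cache together. The names `build_pair_counts`, `top_partner`, `CorrelationAwareLRU`, and the `min_count` threshold are hypothetical, capacity is counted in lists rather than bytes, and only co-occurrence within a single query is modeled.

```python
from collections import Counter, OrderedDict
from itertools import combinations


def build_pair_counts(query_log):
    """Count how often each unordered pair of terms appears in the same query."""
    counts = Counter()
    for query in query_log:
        for a, b in combinations(sorted(set(query)), 2):
            counts[(a, b)] += 1
    return counts


def top_partner(pair_counts, min_count=2):
    """Map each term to its most frequent co-occurring partner.

    Pairs seen fewer than min_count times are ignored (assumed threshold).
    """
    best = {}
    for (a, b), n in pair_counts.items():
        if n < min_count:
            continue
        for term, other in ((a, b), (b, a)):
            if term not in best or n > best[term][1]:
                best[term] = (other, n)
    return {term: partner for term, (partner, _) in best.items()}


class CorrelationAwareLRU:
    """LRU cache of inverted lists that also refreshes a term's partner."""

    def __init__(self, capacity, partners):
        self.capacity = capacity    # maximum number of cached lists
        self.partners = partners    # term -> correlated partner term
        self.cache = OrderedDict()  # term -> list; last entry is most recent
        self.hits = 0
        self.misses = 0

    def get(self, term, load_list):
        """Return the inverted list for term, loading it on a miss."""
        if term in self.cache:
            self.hits += 1
            postings = self.cache[term]
        else:
            self.misses += 1
            postings = load_list(term)
            self.cache[term] = postings
        # Refresh the partner first so the requested term ends up most recent.
        partner = self.partners.get(term)
        if partner in self.cache:
            self.cache.move_to_end(partner)
        self.cache.move_to_end(term)
        # Evict least recently used lists until the cache fits again.
        while len(self.cache) > self.capacity:
            self.cache.popitem(last=False)
        return postings
```

With a log in which "new" and "york" co-occur twice, accessing "york", two unrelated terms, and then "new" in a cache of capacity 3 keeps the list for "york" and evicts the oldest unrelated list instead, whereas plain LRU would have evicted "york". Modeling co-occurrence across separate queries, which the abstract also mentions, would need a different way of building the pair counts.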