Boosting the performance of Web search engines: Caching and prefetching query results by exploiting historical usage data

  • Authors: Tiziano Fagni; Raffaele Perego; Fabrizio Silvestri; Salvatore Orlando

  • Affiliations: Consiglio Nazionale delle Ricerche (CNR), Pisa, Italy; Consiglio Nazionale delle Ricerche (CNR), Pisa, Italy; Consiglio Nazionale delle Ricerche (CNR), Pisa, Italy; Università Ca' Foscari di Venezia, Mestre (VE), Italy

  • Venue: ACM Transactions on Information Systems (TOIS)
  • Year: 2006

Abstract

This article discusses efficiency and effectiveness issues in caching the results of queries submitted to a Web search engine (WSE). We propose SDC (Static Dynamic Cache), a new caching strategy designed to exploit efficiently the temporal and spatial locality present in the stream of processed queries. SDC extracts from historical usage data the results of the most frequently submitted queries and stores them in a static, read-only portion of the cache. The remaining cache entries are managed dynamically according to a given replacement policy and serve those queries that the static portion cannot satisfy. Moreover, we improve the hit ratio of SDC with an adaptive prefetching strategy, which anticipates future requests while introducing only a limited overhead on the back-end WSE. We experimentally demonstrate the superiority of SDC over purely static and purely dynamic policies by measuring the hit ratio achieved on three large query logs while varying the cache parameters and the replacement policy used to manage the dynamic part of the cache. Finally, we deploy a concurrent version of our caching system and measure its throughput. Our tests show that the SDC cache can be exploited efficiently by many threads that concurrently serve the queries of different users.
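The static/dynamic split described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `SDCCache`, the `fetch` callback standing in for the back-end WSE, the `f_static` parameter (fraction of entries reserved for the static part), and the choice of LRU as the replacement policy are all assumptions made for illustration. Prefetching and concurrency are left out.

```python
from collections import Counter, OrderedDict


class SDCCache:
    """Illustrative static + dynamic (LRU) result cache; a sketch only."""

    def __init__(self, training_queries, fetch, size, f_static):
        # f_static (assumed name): fraction of entries given to the static part.
        n_static = int(size * f_static)
        self.fetch = fetch  # hypothetical back-end call: query -> results
        # Static part: results of the most frequent queries in the historical log.
        top = Counter(training_queries).most_common(n_static)
        self.static = {q: fetch(q) for q, _ in top}
        # Dynamic part: remaining entries, managed here with LRU replacement.
        self.dynamic = OrderedDict()
        self.dynamic_capacity = size - n_static
        self.hits = 0
        self.misses = 0

    def get(self, query):
        # 1. The read-only static portion is checked first.
        if query in self.static:
            self.hits += 1
            return self.static[query]
        # 2. Otherwise try the dynamic portion and refresh its recency.
        if query in self.dynamic:
            self.hits += 1
            self.dynamic.move_to_end(query)
            return self.dynamic[query]
        # 3. Miss: ask the back end, then cache the result dynamically.
        self.misses += 1
        results = self.fetch(query)
        if self.dynamic_capacity > 0:
            if len(self.dynamic) >= self.dynamic_capacity:
                self.dynamic.popitem(last=False)  # evict least recently used
            self.dynamic[query] = results
        return results

    def hit_ratio(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

In this sketch, setting `f_static` to 0 gives a purely dynamic cache and setting it to 1 gives a purely static one, so the same class can be replayed over a query log to compare the three configurations by `hit_ratio()`.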