Efficient query subscription processing for prospective search engines

  • Authors:
  • Utku Irmak;Svilen Mihaylov;Torsten Suel;Samrat Ganguly;Rauf Izmailov

  • Affiliations:
  • Polytechnic University;University of Pennsylvania;Polytechnic University;NEC Laboratories America;NEC Laboratories America

  • Venue:
  • ATEC '06 Proceedings of the annual conference on USENIX '06 Annual Technical Conference
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

Current web search engines are retrospective in that they limit users to searches against already existing pages. Prospective search engines, on the other hand, allow users to upload queries that will be applied to newly discovered pages in the future. Some examples of prospective search are the subscription features in Google News and in RSS-based blog search engines. In this paper, we study the problem of efficiently processing large numbers of keyword query subscriptions against a stream of newly discovered documents, and propose several query processing optimizations for prospective search. Our experimental evaluation shows that these techniques can improve the throughput of a well known algorithm by more than a factor of 20, and allow matching hundreds or thousands of incoming documents per second against millions of subscription queries per node.