Efficiency trade-offs in two-tier web search systems

  • Authors: Ricardo Baeza-Yates; Vanessa Murdock; Claudia Hauff

  • Affiliations: Yahoo!, Barcelona, Spain; Yahoo!, Barcelona, Spain; University of Twente, Enschede, Netherlands

  • Venue: Proceedings of the 32nd international ACM SIGIR conference on Research and development in information retrieval
  • Year: 2009

Abstract

Search engines rely on searching multiple partitioned corpora to return results to users in a reasonable amount of time. In this paper we analyze the standard two-tier architecture for Web search, with the difference that the corpus to be searched for a given query is predicted in advance. We show that any predictor better than random yields time savings, but that this decrease in processing time comes with an increase in infrastructure cost. We provide an analysis of this trade-off and investigate it in two different scenarios on real-world data. We demonstrate that, in general, the decrease in answer time is justified by a small increase in infrastructure cost.
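
The trade-off described in the abstract can be illustrated with a toy cost model. The sketch below is not taken from the paper; it assumes a simplified setting in which, without prediction, every query goes to the first tier and is forwarded to the second tier only when the first tier's answer is insufficient, while with prediction, queries flagged by the predictor are sent to both tiers in parallel. The function name `two_tier_costs` and all parameters (`p_need2`, `tpr`, `fpr`, `t1`, `t2`) are hypothetical names chosen for this illustration.

```python
def two_tier_costs(p_need2, tpr, fpr, t1, t2):
    """Toy model (an assumption, not the paper's analysis).

    p_need2: fraction of queries that need the second tier
    tpr:     fraction of those queries the predictor flags correctly
    fpr:     fraction of first-tier-only queries flagged by mistake
    t1, t2:  processing time of the first and second tier

    Returns (baseline_time, predicted_time, baseline_load2, predicted_load2),
    where the times are expected answer times per query and the loads are
    the fraction of queries that reach the second tier.
    """
    # Baseline: tier 1 always runs; tier 2 runs afterwards when needed.
    baseline_time = t1 + p_need2 * t2
    baseline_load2 = p_need2

    # With prediction: a correctly flagged query runs both tiers in
    # parallel and waits for the slower one; a missed query falls back
    # to the sequential path; other queries are answered by tier 1.
    time_if_needed = tpr * max(t1, t2) + (1 - tpr) * (t1 + t2)
    predicted_time = p_need2 * time_if_needed + (1 - p_need2) * t1

    # Every query that needs tier 2 reaches it eventually; false
    # positives add load that the baseline never generates.
    predicted_load2 = p_need2 + (1 - p_need2) * fpr

    return baseline_time, predicted_time, baseline_load2, predicted_load2


if __name__ == "__main__":
    result = two_tier_costs(p_need2=0.2, tpr=0.8, fpr=0.1, t1=1.0, t2=3.0)
    print(result)
```

Under these assumed numbers, the expected answer time drops while the share of queries reaching the second tier rises, which is the shape of the trade-off the abstract describes: correct predictions save time, and false positives add load that must be provisioned for.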