Performance and cost tradeoffs in Web search

  • Authors:
  • Nick Craswell; Francis Crimmins; David Hawking; Alistair Moffat

  • Affiliations:
  • CSIRO -- ICT Centre, Canberra, ACT, Australia; CSIRO -- ICT Centre, Canberra, ACT, Australia; CSIRO -- ICT Centre, Canberra, ACT, Australia; The University of Melbourne, Victoria, Australia

  • Venue:
  • ADC '04 Proceedings of the 15th Australasian database conference - Volume 27
  • Year:
  • 2004

Abstract

Web search engines crawl the web to fetch the data that they index. In this paper we re-examine that need, and evaluate the network costs associated with data acquisition, and alternative ways in which a search service might be supported. As a concrete example, we make use of the Research Finder search service provided at http://rf.panopticsearch.com, and information derived from its crawl and query logs. Based upon an analysis of the Research Finder system we introduce a hybrid arrangement, in which queries are evaluated partially by reference to a centrally maintained index representing a subset of the collection, and partially by referring them on to the local search services maintained by the balance of the collection. We also examine various ways in which crawling costs can be reduced.
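
The hybrid arrangement described in the abstract can be illustrated with a minimal sketch. This is not the Research Finder implementation: the toy inverted index, the `LocalSearch` callable type, the function names, and the round-robin merging of result lists are all assumptions made for illustration, since the abstract does not say how results from the central index and the local search services are combined.

```python
from itertools import zip_longest
from typing import Callable, Dict, List

# Hypothetical type: a site's local search service, modelled as a function
# from a query string to a ranked list of URLs.
LocalSearch = Callable[[str], List[str]]


def search_central(index: Dict[str, List[str]], query: str) -> List[str]:
    """Return URLs that contain every query term, using a toy inverted index."""
    terms = query.lower().split()
    if not terms:
        return []
    postings = [index.get(term, []) for term in terms]
    common = set(postings[0]).intersection(*postings[1:])
    # Keep the order of the first posting list as a stand-in for ranking.
    return [url for url in postings[0] if url in common]


def hybrid_search(
    query: str,
    index: Dict[str, List[str]],
    local_services: Dict[str, LocalSearch],
    limit: int = 10,
) -> List[str]:
    """Answer a query partly from the central index, partly from local services."""
    ranked_lists = [search_central(index, query)]
    for service in local_services.values():
        try:
            ranked_lists.append(service(query))
        except Exception:
            # Assumption: an unreachable site contributes no results.
            ranked_lists.append([])

    # Round-robin merge (an assumed strategy): take the top result from each
    # list in turn, skipping URLs that have already been emitted.
    merged: List[str] = []
    seen = set()
    for tier in zip_longest(*ranked_lists):
        for url in tier:
            if url is not None and url not in seen:
                seen.add(url)
                merged.append(url)
    return merged[:limit]
```

In this sketch only the sites covered by `index` need to be crawled; the remaining sites are queried at search time, which trades crawl traffic for per-query forwarding cost and dependence on each site's own search service.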