Surviving a search engine overload

  • Authors:
  • Aaron Koehl; Haining Wang

  • Affiliations:
  • College of William and Mary, Williamsburg, VA, USA (both authors)

  • Venue:
  • Proceedings of the 21st International Conference on World Wide Web
  • Year:
  • 2012

Abstract

Search engines are an essential component of the web, but their web crawling agents can impose a significant burden on heavily loaded web servers. Unfortunately, blocking or deferring web crawler requests is not a viable solution due to its economic consequences. We conduct a quantitative measurement study of the impact and cost of web crawling agents, seeking optimization points for this class of requests. Based on our measurements, we present a practical caching approach for mitigating search engine overload and implement a two-level cache scheme on a very busy web server. Our experimental results show that the proposed caching framework effectively reduces the impact of search engine overload on service quality.
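
The two-level caching idea described above can be illustrated with a minimal sketch, assuming crawler requests are identified by User-Agent substrings and that slightly stale pages are acceptable for crawler traffic. The signature list, cache path, TTL, and helper names below are illustrative assumptions, not the authors' implementation.

```python
# Hypothetical sketch of a two-level cache for crawler traffic (not the paper's code).
# Level 1: in-memory dictionary. Level 2: on-disk files keyed by URL hash.
import hashlib
import os
import time

CRAWLER_SIGNATURES = ("googlebot", "bingbot", "slurp", "baiduspider")  # illustrative
CACHE_DIR = "/tmp/crawler_cache"   # assumed disk cache location
MEMORY_CACHE = {}                  # first-level cache: key -> (timestamp, body)
TTL_SECONDS = 3600                 # assumed freshness window for crawler responses


def is_crawler(user_agent: str) -> bool:
    """Classify a request as crawler traffic by User-Agent substring match."""
    ua = user_agent.lower()
    return any(sig in ua for sig in CRAWLER_SIGNATURES)


def cache_key(url: str) -> str:
    """Derive a filesystem-safe key from the requested URL."""
    return hashlib.sha1(url.encode()).hexdigest()


def lookup(url: str):
    """Check the in-memory level first, then fall back to the disk level."""
    key = cache_key(url)
    entry = MEMORY_CACHE.get(key)
    if entry and time.time() - entry[0] < TTL_SECONDS:
        return entry[1]
    path = os.path.join(CACHE_DIR, key)
    if os.path.exists(path) and time.time() - os.path.getmtime(path) < TTL_SECONDS:
        with open(path, "rb") as f:
            body = f.read()
        MEMORY_CACHE[key] = (time.time(), body)  # promote to first level
        return body
    return None


def store(url: str, body: bytes) -> None:
    """Write a freshly rendered page into both cache levels."""
    key = cache_key(url)
    MEMORY_CACHE[key] = (time.time(), body)
    os.makedirs(CACHE_DIR, exist_ok=True)
    with open(os.path.join(CACHE_DIR, key), "wb") as f:
        f.write(body)


def handle_request(url: str, user_agent: str, render_page) -> bytes:
    """Serve crawlers from the cache when possible, sparing the dynamic backend."""
    if is_crawler(user_agent):
        cached = lookup(url)
        if cached is not None:
            return cached
        body = render_page(url)   # expensive dynamic rendering
        store(url, body)
        return body
    return render_page(url)       # regular users always get freshly rendered content
```

A real deployment would also need cache invalidation on content updates and a more robust crawler classifier, but the split between a small in-memory level and a larger on-disk level mirrors the two-level structure the abstract describes.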