Selective recrawling for object-level vertical search

  • Authors:
  • Yaqian Zhou;Mengjing Jiang;Qi Zhang;Xuanjing Huang;Lide Wu

  • Affiliations:
  • Fudan University, Shanghai, China;Fudan University, Shanghai, China;Fudan University, Shanghai, China;Fudan University, Shanghai, China;Fudan University, Shanghai, China

  • Venue:
  • Proceedings of the 19th international conference on World wide web
  • Year:
  • 2010

Abstract

In this paper we propose Selective Recrawling, a novel recrawling method based on navigation patterns. Its goal is to automatically select page collections that cover a large share of the objects in a pre-defined vertical domain with little redundancy. It requires only a few seed objects and selects a set of URL patterns that covers most objects. The selected set can be used to recrawl web pages for an extended period and is renewed periodically. Experiments on local event data show that our method greatly reduces the number of web pages downloaded while keeping comparable object coverage.
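The abstract does not describe the selection procedure itself. As a rough illustration of what "large coverage and little redundancy" could mean in practice, the sketch below uses a generic greedy set-cover heuristic. This is an assumption made for illustration, not the authors' algorithm; the function name `select_patterns`, the `target_coverage` parameter, and the example URL patterns and object IDs are all hypothetical.

```python
from typing import Dict, List, Set


def select_patterns(pattern_objects: Dict[str, Set[str]],
                    target_coverage: float = 0.9) -> List[str]:
    """Greedily pick URL patterns until the target share of objects is covered.

    pattern_objects maps a URL pattern to the set of object IDs found on
    pages matching that pattern (hypothetical input format).
    """
    all_objects: Set[str] = set().union(*pattern_objects.values())
    covered: Set[str] = set()
    selected: List[str] = []

    while all_objects and len(covered) / len(all_objects) < target_coverage:
        # Pick the pattern that adds the most not-yet-covered objects.
        # Iterating over sorted keys makes tie-breaking deterministic.
        best = max(sorted(pattern_objects),
                   key=lambda p: len(pattern_objects[p] - covered))
        gain = pattern_objects[best] - covered
        if not gain:
            break  # no remaining pattern adds anything new
        selected.append(best)
        covered |= gain

    return selected


# Hypothetical example: four patterns over five event objects.
patterns = {
    "site-a.example/events/list?page=*": {"e1", "e2", "e3"},
    "site-a.example/events/category/*": {"e2", "e3"},
    "site-b.example/calendar/*": {"e4", "e5"},
    "site-c.example/archive/*": {"e1"},
}
chosen = select_patterns(patterns, target_coverage=1.0)
print(chosen)
```

In this toy input, two of the four patterns already cover every object, so the redundant category and archive patterns are never selected. That is the kind of download saving the abstract refers to, though the paper's actual selection criteria may differ.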