HAWK: A Focused Crawler with Content and Link Analysis

  • Authors:
  • Xiaoyun Chen; Xin Zhang

  • Venue:
  • ICEBE '08 Proceedings of the 2008 IEEE International Conference on e-Business Engineering
  • Year:
  • 2008

Abstract

Maintaining the currency of search engine indices by exhaustive crawling is rapidly becoming impossible due to the increasing size of the web. Focused crawlers aim to search only the subset of the web related to a specific topic, and offer a potential solution to this problem. However, focused crawling has problems of its own. The major one is how to retrieve the maximal set of relevant, high-quality pages. To address this problem, we design a focused crawler, called HAWK, that uses not only the content of web pages to improve page relevance but also the link structure to improve coverage of a specific topic.
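
The abstract does not give HAWK's scoring formulas, so the following is only a minimal sketch of the general idea it describes: a crawl frontier ordered by a mix of a content signal and a link signal. Everything specific here is an assumption made for illustration and is not taken from the paper: the cosine similarity over raw term counts, the `alpha` weight, the `threshold` value, the saturating in-link score, and the in-memory `TOY_WEB` dictionary that stands in for real page fetching.

```python
import math
from collections import Counter


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def focused_crawl(web, seeds, topic, alpha=0.5, threshold=0.1, limit=10):
    """Visit at most `limit` pages; return the relevant ones in visit order.

    `web` maps url -> (text, outlinks) and replaces real HTTP fetching.
    All scoring choices below are illustrative assumptions, not HAWK's.
    """
    topic_vec = Counter(topic.lower().split())
    frontier = {seed: 1.0 for seed in seeds}  # url -> priority
    relevant_inlinks = Counter()  # url -> count of relevant pages linking to it
    visited, relevant = set(), []

    while frontier and len(visited) < limit:
        # Pick the highest-priority unvisited URL.
        url = max(frontier, key=frontier.get)
        del frontier[url]
        visited.add(url)

        text, links = web.get(url, ("", []))
        # Content signal: similarity of the page text to the topic.
        score = cosine(Counter(text.lower().split()), topic_vec)
        is_relevant = score >= threshold
        if is_relevant:
            relevant.append(url)

        for link in links:
            if link in visited:
                continue
            if is_relevant:
                relevant_inlinks[link] += 1
            # Link signal: grows toward 1 as more relevant pages point here.
            link_score = 1.0 - 1.0 / (1 + relevant_inlinks[link])
            priority = alpha * score + (1 - alpha) * link_score
            frontier[link] = max(frontier.get(link, 0.0), priority)

    return relevant


# Hypothetical five-page web used only to exercise the sketch.
TOY_WEB = {
    "a": ("focused crawler topic search", ["b", "c"]),
    "b": ("cooking recipes pasta", ["d"]),
    "c": ("web crawler search engine", ["e"]),
    "d": ("more pasta", []),
    "e": ("crawler link analysis", []),
}

if __name__ == "__main__":
    print(focused_crawl(TOY_WEB, ["a"], "web crawler search", limit=4))
```

In this toy run the off-topic page "b" passes no priority on to "d", so a small crawl budget is spent on "c" and "e" instead. A real crawler would replace the dictionary lookup with page fetching and would likely use a stronger text model than raw term counts.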