Ontology-based Web Crawler

  • Authors:
  • S. Ganesh;M. Jayaraj;V. Kalyan;Srinivasa Murthy;G. Aghila

  • Venue:
  • ITCC '04 Proceedings of the International Conference on Information Technology: Coding and Computing (ITCC'04) Volume 2
  • Year:
  • 2004

Quantified Score

Hi-index 0.01

Abstract

Building a web crawler that downloads the most relevant pages remains a major challenge in the field of Information Retrieval Systems. Link analysis algorithms such as PageRank and other importance metrics have opened a new approach to prioritizing the URL queue so that more relevant pages are downloaded first. This paper proposes combining these metrics with a new metric called the association metric. The association metric estimates the semantic content of a URL based on a domain-dependent ontology, which in turn strengthens the metric used for prioritizing the URL queue. In addition, after a page is downloaded, the association metric plays an important role in estimating the relevancy of the links in that page. The proposed new metric will solve, to an optimal level, the major problem of finding the relevancy of pages before the crawling process.
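The abstract does not give the formula for the association metric or say how it is combined with the importance metrics, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a toy ontology of weighted terms, an association score computed as the share of ontology weight matched by tokens in the URL and anchor text, and a linear mix controlled by a hypothetical parameter `alpha`. The names `ONTOLOGY`, `association_score`, `priority`, and `Frontier` are all illustrative.

```python
import heapq
import re

# Hypothetical toy ontology: domain term -> weight (not taken from the paper).
ONTOLOGY = {"crawler": 1.0, "ontology": 1.0, "retrieval": 0.8, "search": 0.5}


def association_score(url, anchor_text=""):
    """Share of total ontology weight matched by tokens in the URL and anchor text."""
    tokens = set(re.findall(r"[a-z]+", (url + " " + anchor_text).lower()))
    matched = sum(weight for term, weight in ONTOLOGY.items() if term in tokens)
    return matched / sum(ONTOLOGY.values())


def priority(url, importance, anchor_text="", alpha=0.5):
    """Assumed linear mix of an importance score in [0, 1] and the association score."""
    return alpha * importance + (1 - alpha) * association_score(url, anchor_text)


class Frontier:
    """URL queue that returns the highest-priority unseen URL first."""

    def __init__(self):
        self._heap = []
        self._seen = set()

    def push(self, url, importance, anchor_text=""):
        if url in self._seen:
            return  # skip URLs that were already queued
        self._seen.add(url)
        # heapq is a min-heap, so the score is negated to pop the largest first.
        heapq.heappush(self._heap, (-priority(url, importance, anchor_text), url))

    def pop(self):
        neg_score, url = heapq.heappop(self._heap)
        return url, -neg_score


frontier = Frontier()
frontier.push("http://example.org/sports/news", importance=0.6)
frontier.push("http://example.org/ontology/crawler", importance=0.4)
print(frontier.pop()[0])  # the ontology-related URL is dequeued first
```

In this sketch the second URL has the lower importance score but is dequeued first, because its tokens match ontology terms. That mirrors the abstract's claim that the association metric strengthens the prioritization before a page is downloaded. The same `association_score` function could be applied to the links extracted from a downloaded page, with their anchor text as the second argument.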