Simulation Study of Language Specific Web Crawling

  • Authors: Kulwadee Somboonviwat; Masaru Kitsuregawa; Takayuki Tamura

  • Affiliations: Institute of Industrial Science, University of Tokyo; Institute of Industrial Science, University of Tokyo; Mitsubishi Electric Corporation

  • Venue: ICDEW '05 Proceedings of the 21st International Conference on Data Engineering Workshops

  • Year: 2005

Abstract

The Web has been recognized as an important part of our cultural heritage. Many nations have started archiving their national web spaces for future generations. A key data acquisition technology employed by these archiving projects is web crawling. Crawling culturally and/or linguistically specific resources from the borderless Web raises many challenging issues. In this paper, we propose language-specific web crawling and evaluate language-specific crawling strategies on a web crawling simulator.