Urban web crawling

  • Authors:
  • Dirk Ahlers; Susanne Boll

  • Affiliations:
  • OFFIS Institute for Information Technology, Oldenburg, Germany; University of Oldenburg, Germany

  • Venue:
  • Proceedings of the first international workshop on Location and the web
  • Year:
  • 2008

Abstract

Local search is increasingly a major focus of research interest. It is a widely recognized specialty search with a large application area, and its data is usually aggregated from a variety of sources. One as yet largely untapped source of location data is the WWW. Today, the Web does not explicitly reveal its relation to locations; rather, this information is hidden within the contents of pages. To exploit such location information, we need to find, extract, and geospatially index relevant Web pages. To retrieve such content effectively, this paper examines the application of focused Web crawling to the geospatial domain. We describe our approach to geo-aware focused crawling of urban areas and other regions with a high building density. We present experimental results that give insight into spatial Web information such as location density and link distance between topical pages. Our crawls and evaluations support our hypothesis that geospatially focused crawling is suitable for the urban geospatial topic.
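
The general idea of focused crawling summarized above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy link graph `TOY_WEB`, the `fetch` helper, the postal-code pattern `ADDRESS_RE`, and the scoring rule are all assumptions made for illustration. The sketch keeps a priority queue as the crawl frontier and visits links found on pages that contain an address-like string before links found on pages that do not.

```python
import heapq
import re

# Hypothetical toy web: URL -> (page text, outgoing links). Illustrative data only.
TOY_WEB = {
    "a": ("Bakery, Hauptstr. 1, 26121 Oldenburg", ["b", "c"]),
    "b": ("General news, no address here", ["d"]),
    "c": ("Cafe, Lange Str. 5, 26122 Oldenburg", ["d", "e"]),
    "d": ("Imprint: Markt 3, 26123 Oldenburg", []),
    "e": ("Blog about travel", []),
}

# Assumption: a five-digit postal code followed by a capitalized word
# is treated as evidence that a page mentions a street address.
ADDRESS_RE = re.compile(r"\b\d{5}\s+[A-Z][a-z]+")


def fetch(url):
    """Stand-in for an HTTP fetch: look the page up in the toy web."""
    return TOY_WEB[url]


def location_score(text):
    """Return 1.0 if the text contains an address-like string, else 0.0."""
    return 1.0 if ADDRESS_RE.search(text) else 0.0


def focused_crawl(seeds, fetch, max_pages=10):
    """Crawl best-first: links from location-bearing pages are visited first."""
    # heapq is a min-heap, so priorities are negated; the counter breaks ties
    # in insertion order.
    frontier = [(-1.0, i, url) for i, url in enumerate(seeds)]
    heapq.heapify(frontier)
    counter = len(seeds)
    seen = set(seeds)
    visited, relevant = [], []
    while frontier and len(visited) < max_pages:
        _, _, url = heapq.heappop(frontier)
        text, links = fetch(url)
        score = location_score(text)
        visited.append(url)
        if score > 0:
            relevant.append(url)
        for link in links:
            if link not in seen:
                seen.add(link)
                # A link inherits the score of the page it was found on.
                heapq.heappush(frontier, (-score, counter, link))
                counter += 1
    return visited, relevant


visited, relevant = focused_crawl(["a"], fetch)
```

In this toy graph, the link found on the address-free page `b` is deferred until the links found on the address-bearing pages have been visited. A real crawler would replace `fetch` with network access and the regular expression with a proper address extractor.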