Mapping geographic coverage of the web

  • Authors:
  • Robert Pasley;Paul Clough;Ross S. Purves;Florian A. Twaroch

  • Affiliations:
  • University of Sheffield, Sheffield, UK;University of Sheffield, Sheffield, UK;University of Zurich, Switzerland;Cardiff University, Cardiff, UK

  • Venue:
  • Proceedings of the 16th ACM SIGSPATIAL international conference on Advances in geographic information systems
  • Year:
  • 2008

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we describe a methodology to estimate the geographic coverage of the web without the need for secondary knowledge or complex geo-tagging. This is achieved by randomly selecting toponyms from the Ordnance Survey 50K gazetteer to create search queries and thus gather document counts from various web sources for Great Britain. The same gazetteer is then used to geo-code the results and enable mapping. To validate our approach, and demonstrate the effects of geo/non-geo and geo/geo ambiguity, we mapped the selected toponyms to Geograph, a community project that contains user generated geo-tagged photographs of the UK. Although success varies with resolution, the proposed approach is likely sufficient to be reliably used by applications exploring the geographic coverage of the web for cases where references to settlements are likely to be common. In our case, we applied the method to produce maps of web coverage for a range of sources at a resolution of 30km.