Automated conflation of digital gazetteer data

  • Authors: J. T. Hastings
  • Affiliations: Department of Geography, University of California, Santa Barbara, CA, USA
  • Venue: International Journal of Geographical Information Science - Digital Gazetteer Research
  • Year: 2008

Abstract

A digital gazetteer (DG) is a spatial dictionary of named and typed places in some environment, typically the near-surface of the Earth. DGs are proliferating in number and sophistication with the popularity of location-based services such as Google Earth, MapQuest, and OnStar. The essential utility of a DG is to translate between formal and informal systems of place referencing, i.e. between the ad hoc names and qualitative type classifications assigned to places, on the one hand, and quantitative locations for them, on the other. Frequently, it is necessary to consult and combine results from multiple sources of gazetteer data, which is tedious for humans and currently not done by machines. Thus, a fundamental challenge with DGs is conflation: merging gazetteer data so that place identity is preserved. The challenge can be met using a computational approach modelled on human behaviour, focusing first on places' geometries (since disjoint places cannot be the same), second on their type categories, and finally on their names. This article details a troika of metrics that mimic the human cognitive process, together with operational procedures for automated conflation of DG data using them. By way of demonstration, both abstract and practical results of conflation for the Lake Tahoe Basin of California and Nevada are presented.
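
The geometry-first, type-second, name-last ordering described in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the article's three metrics, so everything here is an assumption made for illustration: the `Place` record, the bounding-box intersection test standing in for the geometric metric, exact (case-insensitive) type equality standing in for the type metric, and a `difflib` string ratio with a hypothetical threshold of 0.8 standing in for the name metric.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass(frozen=True)
class Place:
    """Hypothetical gazetteer entry: a name, a type, and a footprint."""
    name: str
    place_type: str
    bbox: tuple  # (min_lon, min_lat, max_lon, max_lat), an assumed footprint form


def boxes_intersect(a, b):
    """True if two bounding boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def name_similarity(a, b):
    """Case-insensitive string similarity in [0, 1] (stand-in name metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def same_place(p, q, name_threshold=0.8):
    """Decide whether two entries plausibly denote the same place.

    The checks run in the order given in the abstract: geometry,
    then type, then name. The threshold is an illustrative assumption.
    """
    # 1. Geometry: disjoint places cannot be the same place.
    if not boxes_intersect(p.bbox, q.bbox):
        return False
    # 2. Type: here a plain equality test; a real system would likely
    #    compare positions in a type hierarchy instead.
    if p.place_type.lower() != q.place_type.lower():
        return False
    # 3. Name: accept only sufficiently similar names.
    return name_similarity(p.name, q.name) >= name_threshold
```

Running the cheap geometric rejection first means most candidate pairs from two gazetteers are discarded before any type or name comparison is attempted, which is one plausible reason for the ordering; the article itself should be consulted for the actual metrics and procedures.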