Assessing the Certainty of Locations Produced by an Address Geocoding System

  • Authors:
  • Clodoveu A. Davis, Jr.; Frederico T. Fonseca

  • Affiliations:
  • Instituto de Informática, Pontifícia Universidade Católica de Minas Gerais, Belo Horizonte, Brazil; College of Information Sciences and Technology, The Pennsylvania State University, University Park, USA

  • Venue:
  • Geoinformatica
  • Year:
  • 2007

Abstract

Addresses are the most common georeferencing resource people use to communicate a location within a city to others. Urban GIS applications that receive data directly from citizens, or from legacy information systems, need to be able to obtain a spatial location from addresses quickly and efficiently. In this paper we understand addresses from a broader perspective, which considers not only the conventional elements of postal addresses but also other kinds of direct or indirect references to places, such as building names, postal codes, or telephone area codes, which are also valuable as locators of urban places. This broader view of addresses allows us to work from two perspectives. The first is the ontological definition, modeling, and implementation of an addressing database that is flexible enough to accommodate the variety of concepts and address formats used worldwide, along with direct and indirect references to places. The second is the definition of an indicator that quantifies the degree of certainty that can be reached when a user-given, semi-structured address is geocoded into a spatial position, as a function of the type and completeness of the available addressing data and of the geocoding method employed. This indicator, which we call the Geocoding Certainty Indicator (GCI), can be used as a threshold, beyond which the geocoded event should be left out of any statistical analysis, or as a weight that allows spatial analysis methods to reduce the influence of events that have been less reliably located. To support geocoding activities and the determination of the GCI, we propose a conceptual schema for addressing databases. The schema is flexible enough to accommodate a variety of addressing systems, at various levels of detail, and in different countries. Our intention is to depart from the geocoding strategy employed in commercial GIS products, which is usually limited to the average American or British address format.
The schema also extends the notion of postal address to something broader, including popular names for places, building names, reference places, and other concepts. This approach extends the work of Simpson and Yu (Comput. Environ. Urban Syst., 27: 283–307, 2003) on postal codes to records of any kind, including place names and loosely formatted addresses.
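
The two uses of the GCI mentioned in the abstract, as a cut-off and as a weight, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `GeocodedEvent` type, the function names, the sample values, and the assumption that the GCI is a score in [0, 1] where higher means more reliably located are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class GeocodedEvent:
    """Hypothetical record: a geocoded position plus its certainty score."""
    x: float
    y: float
    gci: float  # ASSUMPTION: score in [0, 1], higher = more reliably located


def filter_by_gci(events, min_gci):
    """Threshold use: keep only events whose GCI reaches min_gci."""
    return [e for e in events if e.gci >= min_gci]


def weighted_mean_center(events):
    """Weight use: mean center in which each event counts in proportion to its GCI."""
    total = sum(e.gci for e in events)
    if total == 0:
        raise ValueError("all events have zero certainty")
    cx = sum(e.x * e.gci for e in events) / total
    cy = sum(e.y * e.gci for e in events) / total
    return cx, cy


# Illustrative data only; these values do not come from the paper.
events = [
    GeocodedEvent(0.0, 0.0, 1.0),
    GeocodedEvent(10.0, 0.0, 0.5),
    GeocodedEvent(0.0, 10.0, 0.5),
]

reliable = filter_by_gci(events, 0.6)
center = weighted_mean_center(events)
```

With a threshold, poorly located events are discarded outright. With weights, they still contribute to the analysis, but less than well-located ones. Which option suits a given study depends on how much data can be spared.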