A probabilistic geocoding system utilising a parcel based address file

  • Authors:
  • Peter Christen; Alan Willmore; Tim Churches

  • Affiliations:
  • Department of Computer Science, Australian National University, Canberra, ACT, Australia; New South Wales Department of Health, Centre for Epidemiology and Research, North Sydney, NSW, Australia; New South Wales Department of Health, Centre for Epidemiology and Research, North Sydney, NSW, Australia

  • Venue:
  • Data Mining
  • Year:
  • 2006

Abstract

It is estimated that between 80% and 90% of governmental data collections contain address information. Geocoding – the process of assigning geographic coordinates to addresses – is becoming increasingly important in application areas that involve the analysis and mining of such data. In many cases, address records are captured or stored in a free-form or inconsistent manner, which complicates the task of accurately matching them to spatially annotated reference data. In this paper we describe a geocoding system based on a comprehensive, high-quality geocoded national address database. It uses a learning address parser based on hidden Markov models to segment free-form addresses into components, and a rule-based matching engine to determine the best matches in the reference database.
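
To illustrate the segmentation step mentioned in the abstract, the following is a minimal sketch of how a hidden Markov model can label the tokens of a free-form address, using Viterbi decoding. It is not the authors' implementation: the state names, the token tags (`NU`, `N4`, `ST`, `W`), and all probabilities are hypothetical and hand-set here, whereas the system described in the paper learns its parser from data.

```python
import math

# Hypothetical address components used as hidden states (an assumption).
STATES = ["number", "street", "type", "locality", "postcode"]

# Hypothetical, hand-set probabilities; a real parser would learn these.
START = {"number": 0.8, "street": 0.2}
TRANS = {
    "number": {"street": 0.9, "number": 0.1},
    "street": {"street": 0.2, "type": 0.8},
    "type": {"locality": 1.0},
    "locality": {"locality": 0.3, "postcode": 0.7},
    "postcode": {},
}
EMIT = {
    "number": {"NU": 0.9, "N4": 0.1},
    "street": {"W": 0.9, "ST": 0.1},
    "type": {"ST": 1.0},
    "locality": {"W": 0.95, "ST": 0.05},
    "postcode": {"N4": 1.0},
}

STREET_TYPES = {"street", "st", "road", "rd", "avenue", "ave"}


def tag(token):
    """Map a raw token to a coarse observation tag."""
    if token.isdigit():
        return "N4" if len(token) == 4 else "NU"
    if token.lower() in STREET_TYPES:
        return "ST"
    return "W"


def logp(table, key):
    """Log-probability lookup; missing entries count as impossible."""
    p = table.get(key, 0.0)
    return math.log(p) if p > 0 else float("-inf")


def viterbi(tags):
    """Return the most likely state sequence for a list of tags."""
    if not tags:
        return []
    # best[s] holds (log-probability, path) of the best path ending in s.
    best = {s: (logp(START, s) + logp(EMIT[s], tags[0]), [s]) for s in STATES}
    for obs in tags[1:]:
        new = {}
        for s in STATES:
            score, path = max(
                ((best[p][0] + logp(TRANS[p], s), best[p][1]) for p in STATES),
                key=lambda item: item[0],
            )
            new[s] = (score + logp(EMIT[s], obs), path + [s])
        best = new
    return max(best.values(), key=lambda item: item[0])[1]


def segment(address):
    """Split an address into tokens and pair each with its decoded label."""
    tokens = address.replace(",", " ").split()
    return list(zip(tokens, viterbi([tag(t) for t in tokens])))


if __name__ == "__main__":
    for token, label in segment("42 Miller Street, North Sydney 2060"):
        print(f"{token:8s} {label}")
```

Tagging tokens before decoding keeps the observation alphabet small, so the model needs far fewer emission parameters than one defined over raw words. The labelled components could then be passed to a matching step against the reference database, which this sketch does not attempt.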