Bootstrapping toponym classifiers

  • Authors:
  • David A. Smith; Gideon S. Mann

  • Affiliations:
  • Johns Hopkins University, Baltimore, MD; Johns Hopkins University, Baltimore, MD

  • Venue:
  • HLT-NAACL-GEOREF '03 Proceedings of the HLT-NAACL 2003 workshop on Analysis of geographic references - Volume 1
  • Year:
  • 2003

Abstract

We present minimally supervised methods for training and testing geographic name disambiguation (GND) systems. We train data-driven place name classifiers on toponyms that are already disambiguated in the training text by existing cues such as "Nashville, Tenn." or "Springfield, MA". We then test the system both on texts from which these cues have been stripped and on hand-tagged historical texts. We experiment on three English-language corpora of varying provenance and complexity: newsfeed from the 1990s, personal narratives from the 19th-century American West, and memoirs and records of the U.S. Civil War. Disambiguation accuracy ranges from 87% for news to 69% for some historical collections.
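
The cue-harvesting idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the regular expression `CUE` and the function `harvest` are hypothetical names introduced here, and the pattern only covers simple "Place, Abbrev." and "Place, XX" forms. It shows how a cue such as "Nashville, Tenn." could yield a labeled toponym while the cue itself is stripped from the text to produce a test instance.

```python
import re

# Hypothetical cue pattern (an assumption, not taken from the paper):
# a capitalized place name, a comma, then either an abbreviation ending in
# a period ("Tenn.") or a two-letter uppercase code ("MA").
CUE = re.compile(
    r"\b([A-Z][a-z]+(?: [A-Z][a-z]+)*), "
    r"((?:[A-Z][a-z]+\.)|(?:[A-Z]{2}))(?![A-Za-z])"
)

def harvest(text):
    """Return (stripped_text, labels).

    labels is a list of (toponym, cue) pairs found in the text;
    stripped_text is the same text with each cue removed, leaving
    only the bare, now ambiguous, place name.
    """
    labels = []

    def strip_cue(match):
        # Record the disambiguating cue, then keep only the place name.
        labels.append((match.group(1), match.group(2)))
        return match.group(1)

    stripped = CUE.sub(strip_cue, text)
    return stripped, labels

text = "He left Nashville, Tenn. for Springfield, MA last week."
stripped, labels = harvest(text)
print(labels)
print(stripped)
```

In a setup like this, the harvested pairs would serve as training labels for a classifier over the surrounding context, and the stripped text would serve as held-out input whose correct answer is already known. The classifier itself is outside the scope of this sketch.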