A bootstrapping approach for geographic named entity annotation

  • Authors:
  • Seungwoo Lee;Gary Geunbae Lee

  • Affiliations:
  • Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang, Korea;Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang, Korea

  • Venue:
  • AIRS'04 Proceedings of the 2004 international conference on Asian Information Retrieval Technology
  • Year:
  • 2004

Abstract

Geographic named entities can be classified into many sub-types that are useful for applications such as information extraction and question answering. In this paper, we present a bootstrapping algorithm for geographic named entity annotation. In the initial stage, we annotate a raw corpus using seeds. Boundary patterns are learned from this initial annotation and applied to the corpus again to annotate new candidates. Type verification is adopted to reduce over-generation. The one-sense-per-discourse principle increases the number of positive instances and also corrects mistaken annotations. As the bootstrapping loop proceeds, the number of annotated instances grows and the learned boundary patterns become richer.
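
The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and every name in it (`bootstrap`, `looks_like_name`, the toy corpus, the `CITY` and `RIVER` labels) is hypothetical. It assumes single-token entities, a boundary pattern made of just the left and right neighbouring tokens, a capitalization check as a stand-in for type verification, and a per-document majority vote as a stand-in for the one-sense-per-discourse principle.

```python
from collections import Counter, defaultdict


def looks_like_name(token):
    # Hypothetical stand-in for type verification: accept capitalized tokens only.
    return token[:1].isupper()


def bootstrap(docs, seeds, max_rounds=5):
    """Grow a name -> type lexicon from seeds (illustrative sketch only).

    docs: list of documents, each a list of tokens.
    seeds: dict mapping a single-token entity name to its sub-type.
    """
    lexicon = dict(seeds)
    patterns = {}
    # Pad each document so entities at its edges still have two neighbours.
    padded = [["<s>"] + doc + ["</s>"] for doc in docs]

    for _ in range(max_rounds):
        # Annotate with the current lexicon and collect boundary patterns,
        # here simply the (left token, right token) pair around each entity.
        votes = defaultdict(Counter)
        for tokens in padded:
            for i in range(1, len(tokens) - 1):
                etype = lexicon.get(tokens[i])
                if etype is not None:
                    votes[(tokens[i - 1], tokens[i + 1])][etype] += 1
        # Keep only patterns that were seen with exactly one type.
        patterns = {ctx: next(iter(c)) for ctx, c in votes.items() if len(c) == 1}

        # Apply the patterns to the corpus again to propose new candidates.
        new_entries = {}
        for tokens in padded:
            doc_votes = defaultdict(Counter)
            for i in range(1, len(tokens) - 1):
                name = tokens[i]
                etype = patterns.get((tokens[i - 1], tokens[i + 1]))
                if etype and name not in lexicon and looks_like_name(name):
                    doc_votes[name][etype] += 1
            # Assumed one-sense-per-discourse step: one type per name per document.
            for name, counts in doc_votes.items():
                new_entries[name] = counts.most_common(1)[0][0]

        if not new_entries:
            break  # nothing new was annotated, so the loop has converged
        lexicon.update(new_entries)

    return lexicon, patterns


docs = [
    "He lives in Seoul , Korea .".split(),
    "She lives in Busan , Korea .".split(),
    "The river Han flows slowly .".split(),
    "The river Nakdong flows slowly .".split(),
    "Busan is large .".split(),
    "Daegu is large .".split(),
]
seeds = {"Seoul": "CITY", "Han": "RIVER"}
lexicon, patterns = bootstrap(docs, seeds)
```

In this toy corpus, `Busan` and `Nakdong` are picked up in the first round from patterns learned around the seeds, while `Daegu` is only reached in the second round through a pattern learned from the newly annotated `Busan`. This mirrors the gradual growth of both instances and patterns that the abstract describes.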