Heuristic methods for reducing errors of geographic named entities learned by bootstrapping

  • Authors:
  • Seungwoo Lee; Gary Geunbae Lee

  • Affiliations:
  • Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang, Republic of Korea;Department of Computer Science and Engineering, Pohang University of Science and Technology, Pohang, Republic of Korea

  • Venue:
  • IJCNLP'05: Proceedings of the Second International Joint Conference on Natural Language Processing
  • Year:
  • 2005

Abstract

One of the issues in bootstrapping for named entity recognition is how to control the annotation errors introduced at every iteration. In this paper, we present several heuristics for reducing such errors using external resources such as WordNet, an encyclopedia, and Web documents. The bootstrapping is applied to identify and classify fine-grained geographic named entities, which are useful for applications such as information extraction and question answering, as well as standard named entities such as PERSON and ORGANIZATION. The experiments show the usefulness of the suggested heuristics and the learning curve evaluated at each bootstrapping loop. When our approach was applied to a newspaper corpus, it achieved an F1 value of 87, which is quite promising for the fine-grained named entity recognition task.