Using corpus-derived name lists for named entity recognition

  • Authors: Mark Stevenson; Robert Gaizauskas
  • Affiliations: University of Sheffield, Sheffield, United Kingdom (both authors)
  • Venue: ANLC '00 Proceedings of the sixth conference on Applied natural language processing
  • Year: 2000

Abstract

This paper describes experiments that measure the performance of a named entity (NE) recognition system which builds categorized lists of names from manually annotated training data and then identifies names in text using only these lists. On its own, this approach does not perform as well as state-of-the-art NE recognition systems. However, we show that applying simple filtering techniques to the automatically acquired lists brings substantial performance benefits, with resulting F-measure scores of 87% on a standard test set. These results provide a baseline against which the contribution of more sophisticated supervised learning techniques for NE recognition should be measured.
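
The list-based approach summarized in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not say which filters were used, so `filter_name_lists` below is a hypothetical example of one simple filter (drop a name that appears in the training text mostly outside any annotation). The data format, the function names, the `min_ratio` threshold, and the toy `TRAIN` sentences are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy training data (illustrative only): (tokens, [(start, end, category)]),
# with `end` exclusive.
TRAIN = [
    ("Ford hired Mark Smith in Sheffield".split(),
     [(0, 1, "ORG"), (2, 4, "PER"), (5, 6, "LOC")]),
    ("She lives in Reading".split(), [(3, 4, "LOC")]),
    ("Reading books is fun".split(), []),
    ("Reading takes time".split(), []),
]


def build_name_lists(annotated_sentences):
    """Collect one set of names (as token tuples) per category."""
    lists = defaultdict(set)
    for tokens, spans in annotated_sentences:
        for start, end, category in spans:
            lists[category].add(tuple(tokens[start:end]))
    return lists


def count_occurrences(tokens, name):
    """Count how often the token sequence `name` occurs in `tokens`."""
    n = len(name)
    return sum(1 for i in range(len(tokens) - n + 1)
               if tuple(tokens[i:i + n]) == name)


def filter_name_lists(lists, annotated_sentences, min_ratio=0.5):
    """Hypothetical filter (an assumption, not taken from the paper):
    keep a name only if at least `min_ratio` of its occurrences in the
    training text were annotated with that category."""
    annotated = defaultdict(int)
    for tokens, spans in annotated_sentences:
        for start, end, category in spans:
            annotated[(category, tuple(tokens[start:end]))] += 1
    filtered = defaultdict(set)
    for category, names in lists.items():
        for name in names:
            total = sum(count_occurrences(tokens, name)
                        for tokens, _ in annotated_sentences)
            if total and annotated[(category, name)] / total >= min_ratio:
                filtered[category].add(name)
    return filtered


def tag(tokens, lists):
    """Mark names by greedy longest-match lookup, scanning left to right.
    If a name sits in several lists, an arbitrary one of them wins."""
    lookup = {name: cat for cat, names in lists.items() for name in names}
    max_len = max((len(name) for name in lookup), default=0)
    spans, i = [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            candidate = tuple(tokens[i:i + n])
            if candidate in lookup:
                spans.append((i, i + n, lookup[candidate]))
                i += n
                break
        else:  # no list entry starts at position i
            i += 1
    return spans


def f_measure(predicted, gold):
    """Balanced F-measure over exact (start, end, category) matches."""
    correct = len(set(predicted) & set(gold))
    if correct == 0:
        return 0.0
    precision = correct / len(predicted)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)
```

In this toy setting, calling `tag` with the unfiltered lists marks a sentence-initial "Reading" as a location because it was annotated once in training, whereas the filtered lists drop that entry. The example shows only the mechanism; it says nothing about the scores reported in the paper.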