E-dictionaries and finite-state automata for the recognition of named entities

  • Authors:
  • Cvetana Krstev;Duško Vitas;Ivan Obradović;Miloš Utvić

  • Affiliations:
  • University of Belgrade;University of Belgrade;University of Belgrade;University of Belgrade

  • Venue:
  • FSMNLP '11 Proceedings of the 9th International Workshop on Finite State Methods and Natural Language Processing
  • Year:
  • 2011

Abstract

In this paper we present a system for named entity recognition and tagging in Serbian that relies on large-scale lexical resources and finite-state transducers. Our system recognizes several types of names, as well as temporal and numerical expressions. Finite-state automata are used to describe the context of named entities, thus improving the precision of recognition. The widest context was used for personal names, and it included the recognition of nominal phrases describing a person's position. For the evaluation of the named entity recognition system we used a corpus of 2,300 short agency news items. Through manual evaluation we precisely identified all omissions and incorrect recognitions, which enabled the computation of recall and precision. The overall recall (R = 0.84 for types and R = 0.93 for tokens) and the overall precision (P = 0.95 for types and P = 0.98 for tokens) show that our system gives priority to precision.
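
The distinction between type-level and token-level scores can be illustrated with a small sketch. The abstract does not define the two levels, so the code below assumes the common reading: a token is one occurrence of an entity in the text, and a type is a distinct (text, category) pair counted once. The function name `precision_recall`, the category labels, and the toy data are all hypothetical and are not taken from the paper or its corpus.

```python
from collections import Counter


def precision_recall(gold, predicted):
    """Compute precision and recall at token and type level.

    gold, predicted: lists of (entity_text, category) pairs, one pair per
    occurrence. Assumption: "tokens" are occurrences, "types" are distinct pairs.
    """
    gold_counts, pred_counts = Counter(gold), Counter(predicted)

    # Token level: every occurrence counts; the multiset intersection
    # gives the number of correctly recognized occurrences.
    tp_tokens = sum((gold_counts & pred_counts).values())
    token_p = tp_tokens / len(predicted) if predicted else 0.0
    token_r = tp_tokens / len(gold) if gold else 0.0

    # Type level: every distinct (text, category) pair counts once.
    gold_types, pred_types = set(gold_counts), set(pred_counts)
    tp_types = len(gold_types & pred_types)
    type_p = tp_types / len(pred_types) if pred_types else 0.0
    type_r = tp_types / len(gold_types) if gold_types else 0.0

    return {
        "token_precision": token_p,
        "token_recall": token_r,
        "type_precision": type_p,
        "type_recall": type_r,
    }


# Toy data: one frequent entity is always found, one rare entity is missed.
gold = [("Beograd", "LOC")] * 3 + [("Tanjug", "ORG"), ("Novi Sad", "LOC")]
predicted = [("Beograd", "LOC")] * 3 + [("Tanjug", "ORG")]
scores = precision_recall(gold, predicted)
```

In this toy case a missed rare entity lowers type recall more than token recall, which is one way token-level scores can end up higher than type-level scores, as in the figures reported above.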