Language independent first and last name identification in person names

  • Authors: Octavian Popescu; Bernardo Magnini
  • Affiliations: FBK, Trento, Italy; FBK, Trento, Italy
  • Venue: CICLing'08: Proceedings of the 9th International Conference on Computational Linguistics and Intelligent Text Processing
  • Year: 2008

Abstract

In this paper we address the problem of identifying first names and last names in a news collection. The approach is based on corpus investigation and is language independent. At the core of the system is a name classifier that relies on the values of several parameters. In its most general form, name category identification is not an easy task. The hardest problems are raised by ambiguous tokens, those that can be either a first or a last name, and by tokens with just one occurrence. Nevertheless, the system is able to predict the name category with high accuracy. The experiments were run on an Italian newspaper, and the evaluation was carried out on I-CAB.