Lemmatization of Polish person names

  • Authors:
  • Jakub Piskorski; Marcin Sydow; Anna Kupść

  • Affiliations:
  • Joint Research Centre, Ispra, Italy; Polish-Japanese Institute of Information Technology, Warsaw, Poland; Université Paris, Paris Cedex

  • Venue:
  • ACL '07 Proceedings of the Workshop on Balto-Slavonic Natural Language Processing: Information Extraction and Enabling Technologies
  • Year:
  • 2007

Abstract

The paper presents two techniques for lemmatization of Polish person names. First, we apply a rule-based approach which relies on linguistic information and heuristics. Then, we investigate an alternative knowledge-poor method which employs string distance measures. We provide an evaluation of the adopted techniques using a set of newspaper texts.
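
The knowledge-poor idea mentioned in the abstract can be illustrated with a minimal sketch: given an inflected name form, pick the closest entry from a set of candidate base forms under a string distance. The sketch below uses plain Levenshtein distance, which is an assumption made for illustration; the abstract does not say which distance measures the paper employs. The function names and the candidate list are likewise hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute ca -> cb
        prev = curr
    return prev[-1]


def lemmatize_name(inflected: str, candidates: list[str]) -> str:
    """Return the candidate base form with the smallest distance to `inflected`."""
    target = inflected.lower()
    return min(candidates, key=lambda c: levenshtein(target, c.lower()))


# Hypothetical nominative base forms, chosen only for illustration.
CANDIDATES = ["Jan Kowalski", "Anna Nowak", "Adam Małysz"]

if __name__ == "__main__":
    for form in ["Jana Kowalskiego", "Annie Nowak", "Adama Małysza"]:
        print(form, "->", lemmatize_name(form, CANDIDATES))
```

With the candidate list above, the genitive form "Jana Kowalskiego" maps to "Jan Kowalski" and the dative form "Annie Nowak" maps to "Anna Nowak". Such a matcher needs no morphological knowledge, but it only works when the correct base form is already among the candidates, and ties between equally distant candidates are broken by list order.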