Lemmatisation as a tagging task

  • Authors:
  • Andrea Gesmundo; Tanja Samardžić

  • Affiliations:
  • University of Geneva; University of Geneva

  • Venue:
  • ACL '12 Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Short Papers - Volume 2
  • Year:
  • 2012

Abstract

We present a novel approach to the task of word lemmatisation. We formalise lemmatisation as a category tagging task by describing how a word-to-lemma transformation rule can be encoded in a single label and how a set of such labels can be inferred for a specific language. In this way, a lemmatisation system can be trained and tested using any supervised tagging model. In contrast to previous approaches, the proposed technique allows us to easily integrate relevant contextual information. We test our approach on eight languages, reaching a new state-of-the-art level for the lemmatisation task.
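
To make the idea of encoding a word-to-lemma transformation in a single label concrete, the following Python sketch shows one simple way such a label could be built. It is an illustrative assumption, not the encoding defined in the paper: the label here covers suffix changes only and is a pair (number of characters to strip from the end of the word, string to append). The names `encode_label` and `apply_label` are hypothetical.

```python
from typing import Tuple

# Hypothetical label type: (characters to strip from the end, suffix to append).
# This suffix-only scheme is an assumption made for illustration.
Label = Tuple[int, str]


def encode_label(word: str, lemma: str) -> Label:
    """Derive a transformation label from a (word, lemma) training pair."""
    # Find the length of the longest common prefix of word and lemma.
    i = 0
    while i < min(len(word), len(lemma)) and word[i] == lemma[i]:
        i += 1
    # Everything after the shared prefix is stripped from the word
    # and replaced by the remainder of the lemma.
    return (len(word) - i, lemma[i:])


def apply_label(word: str, label: Label) -> str:
    """Apply a label to a word to produce its predicted lemma."""
    strip, suffix = label
    stem = word[: max(0, len(word) - strip)]
    return stem + suffix


# Different words can share one label, which is what allows the label
# set to be treated as the tag set of an ordinary tagger.
label_walking = encode_label("walking", "walk")
label_playing = encode_label("playing", "play")
label_studies = encode_label("studies", "study")
lemma_carries = apply_label("carries", label_studies)
```

Under this sketch, the set of labels for a language would be collected by running `encode_label` over an annotated corpus, and any supervised tagger could then be trained to predict a label for each token in context; `apply_label` turns the predicted label back into a lemma.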