A morphosyntactic Brill Tagger for inflectional languages

  • Authors: Szymon Acedański

  • Affiliations: Institute of Informatics, University of Warsaw, Warszawa, Poland and Institute of Computer Science, Polish Academy of Sciences, Warszawa, Poland

  • Venue: IceTAL'10 Proceedings of the 7th international conference on Advances in natural language processing
  • Year: 2010

Abstract

In this paper we present and evaluate a Brill transformation-based morphosyntactic tagger adapted to the specifics of highly inflectional languages. Multi-phase tagging, together with transformations that match grammatical categories and with lexical transformations, brings significant accuracy improvements compared to previous work. Evaluation shows an accuracy of 92.44% for Polish, which is higher than the same metric for the other known taggers of Polish: a stochastic trigram tagger (90.59%) and the hybrid tagger TaKIPI, which employs a decision tree classifier and an automatically extracted rule-based tagger and was used for tagging the IPI PAN Corpus of Polish (91.06%).
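
For readers unfamiliar with transformation-based tagging, the general idea can be sketched as follows. This is a minimal illustration of a generic Brill-style tagger, not the tagger described in the paper: the names `Rule`, `apply_rule`, `tag`, `LEXICON` and `RULES`, the toy English tagset, and the single rule template (retag a token based on the previous token's tag) are all assumptions made for the example. The paper's multi-phase scheme, grammatical category matching and lexical transformations are not reproduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule template: retag from_tag as to_tag after prev_tag."""
    from_tag: str
    to_tag: str
    prev_tag: str


def apply_rule(tags, rule):
    """Return a copy of tags with the rule applied from left to right."""
    out = list(tags)
    for i in range(1, len(out)):
        if out[i] == rule.from_tag and out[i - 1] == rule.prev_tag:
            out[i] = rule.to_tag
    return out


def tag(tokens, lexicon, rules, default="noun"):
    """Assign initial tags from a lexicon, then apply the rules in order."""
    # Initial phase: look up each word's tag, falling back to a default.
    tags = [lexicon.get(token.lower(), default) for token in tokens]
    # Transformation phase: each rule rewrites the current tagging.
    for rule in rules:
        tags = apply_rule(tags, rule)
    return tags


# Toy data, invented for illustration only.
LEXICON = {"the": "det", "can": "modal", "rusted": "verb"}
RULES = [Rule(from_tag="modal", to_tag="noun", prev_tag="det")]

print(tag(["The", "can", "rusted"], LEXICON, RULES))
```

In a real transformation-based tagger the ordered rule list is learned from an annotated corpus by repeatedly selecting the rule that removes the most tagging errors; here the single rule is written by hand to keep the sketch short.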