Accuracy of Baseline and Complex Methods Applied to Morphosyntactic Tagging of Polish

  • Authors:
  • Marcin Kuta;Michał Wrzeszcz;Paweł Chrząszcz;Jacek Kitowski

  • Affiliations:
  • Institute of Computer Science, AGH-UST, Kraków, Poland;Institute of Computer Science, AGH-UST, Kraków, Poland;Institute of Computer Science, AGH-UST, Kraków, Poland;Institute of Computer Science, AGH-UST, Kraków, Poland

  • Venue:
  • ICCS '08 Proceedings of the 8th international conference on Computational Science, Part I
  • Year:
  • 2008

Abstract

The paper presents baseline and complex part-of-speech taggers applied to the modified corpus of the Frequency Dictionary of Contemporary Polish. The accuracy of five baseline part-of-speech taggers is reported. On the basis of these results, complex methods are developed: thematic split and attribute split methods are proposed and evaluated. Finally, the tagging accuracy of voting methods is evaluated. The most accurate baseline taggers are SVMTool (for a simple tagset) and fnTBL (for a complex tagset). The voting method called Total Precision achieves the highest accuracy among all the methods examined.
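
The abstract does not define Total Precision voting. As a rough illustration only, the sketch below assumes the name is used as it commonly is in work on combining taggers: each baseline tagger votes for its proposed tag with a weight equal to its overall accuracy, and the tag with the largest summed weight wins. The function name, the tagger names, the tags, and the accuracy figures are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict


def total_precision_vote(predictions, accuracy):
    """Pick a tag for one token by accuracy-weighted voting.

    predictions: dict mapping tagger name -> tag proposed for the token
    accuracy:    dict mapping tagger name -> overall accuracy of that tagger
                 (assumed to be measured beforehand on held-out data)
    """
    scores = defaultdict(float)
    for tagger, tag in predictions.items():
        # Each tagger adds its overall accuracy to the tag it proposes.
        scores[tag] += accuracy[tagger]
    # The tag with the highest accumulated weight is selected.
    return max(scores, key=scores.get)


# Hypothetical accuracies for three unnamed taggers (illustrative values only).
accuracy = {"tagger_a": 0.9, "tagger_b": 0.6, "tagger_c": 0.5}

# Two weaker taggers agree and together outweigh the strongest one.
agree = {"tagger_a": "subst", "tagger_b": "adj", "tagger_c": "adj"}
chosen_when_two_agree = total_precision_vote(agree, accuracy)

# All taggers disagree, so the most accurate tagger decides.
disagree = {"tagger_a": "subst", "tagger_b": "adj", "tagger_c": "verb"}
chosen_when_all_disagree = total_precision_vote(disagree, accuracy)
```

Under this assumption, agreement between less accurate taggers can override a single more accurate one, which is the main difference from simply trusting the best baseline tagger.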