Hand-written and automatically extracted rules for Polish tagger

  • Authors:
  • Maciej Piasecki

  • Affiliations:
  • Institute of Applied Informatics, Wrocław University of Technology, Wrocław, Poland

  • Venue:
  • TSD'06 Proceedings of the 9th international conference on Text, Speech and Dialogue
  • Year:
  • 2006

Abstract

Stochastic approaches to the tagging of Polish have produced results that are far from satisfactory. However, the successful combination of hand-written rules and a stochastic approach for Czech, as well as some initial experiments in the acquisition of tagging rules for Polish, revealed the potential of a rule-based approach. The goals are: to define a language of tagging constraints, to construct a set of reduction rules for Polish, and to apply Machine Learning to the extraction of tagging rules. A language of functional tagging constraints called JOSKIPI is proposed. An extension to the C4.5 algorithm based on introducing complex JOSKIPI operators into decision trees is presented. The construction of a preliminary set of hand-written tagging rules for Polish is discussed. Finally, the results of the comparison of different versions of the tagger are given.