Unsupervised learning of Bulgarian POS tags

  • Authors: Derrick Higgins
  • Affiliations: Educational Testing Service
  • Venue: MorphSlav '03: Proceedings of the 2003 EACL Workshop on Morphological Processing of Slavic Languages
  • Year: 2003

Abstract

This paper presents an approach to the unsupervised learning of parts of speech that uses both morphological and syntactic information. Although the model is more complex than those used for unsupervised learning of POS tags in English, which rely on syntactic information alone, the diversity of the world's languages requires that morphology be considered as well. In many languages, morphology provides better clues to a word's category than word order does. We describe the computational model for POS learning and report results from applying it to Bulgarian, a Slavic language with relatively free word order and rich morphology.