Semi-supervised learning of concatenative morphology

  • Authors: Oskar Kohonen; Sami Virpioja; Krista Lagus

  • Affiliations: Aalto University, AALTO, Finland (all three authors)

  • Venue: SIGMORPHON '10: Proceedings of the 11th Meeting of the ACL Special Interest Group on Computational Morphology and Phonology
  • Year: 2010

Abstract

We consider morphology learning in a semi-supervised setting, where a small set of linguistic gold-standard analyses is available. We extend Morfessor Baseline, a method for unsupervised morphological segmentation, to this task. We show that known linguistic segmentations can be exploited by adding them to the data likelihood function and optimizing separate weights for unlabeled and labeled data. Experiments on English and Finnish are presented with varying amounts of labeled data. Results in the linguistic evaluation of Morpho Challenge improve rapidly even with small amounts of labeled data, surpassing the state-of-the-art unsupervised methods at 1000 labeled words for English and at 100 labeled words for Finnish.
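
The idea of giving unlabeled and labeled data separate likelihood weights can be illustrated with a minimal sketch. It is not the authors' implementation: the unigram morph model, the function names, and the weights `alpha` and `beta` are assumptions made for illustration, and the model prior is passed in as a plain number rather than computed.

```python
import math
from collections import Counter

def morph_log_probs(segmentations):
    """Maximum-likelihood unigram log-probabilities of morphs (illustrative model)."""
    counts = Counter(m for seg in segmentations for m in seg)
    total = sum(counts.values())
    return {m: math.log(c / total) for m, c in counts.items()}

def neg_log_likelihood(segmentations, log_probs):
    """Negative log-likelihood of a set of segmented words under the unigram model."""
    return -sum(log_probs[m] for seg in segmentations for m in seg)

def weighted_cost(prior_cost, unlabeled, labeled, alpha, beta):
    """Cost = prior + alpha * NLL(unlabeled) + beta * NLL(labeled).

    `unlabeled` holds the model's current segmentations of raw words,
    `labeled` holds gold-standard segmentations. Both are lists of morph tuples.
    """
    log_probs = morph_log_probs(unlabeled + labeled)
    return (prior_cost
            + alpha * neg_log_likelihood(unlabeled, log_probs)
            + beta * neg_log_likelihood(labeled, log_probs))

# Hypothetical toy data: two automatically segmented words, one gold analysis.
unlabeled = [("walk", "ed"), ("talk", "ed")]
labeled = [("walk", "s")]
cost = weighted_cost(0.0, unlabeled, labeled, alpha=1.0, beta=3.0)
```

Raising `beta` relative to `alpha` makes the gold-standard segmentations count more in the cost, so a small labeled set can still steer the model against a much larger unlabeled corpus. The abstract does not say how the weights are optimized; tuning them on held-out labeled data is one possibility.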