Unsupervised morpheme analysis with Allomorfessor

  • Authors:
  • Sami Virpioja; Oskar Kohonen; Krista Lagus

  • Affiliations:
  • Adaptive Informatics Research Centre, Aalto University School of Science and Technology (all authors)

  • Venue:
  • CLEF'09: Proceedings of the 10th Cross-Language Evaluation Forum Conference on Multilingual Information Access Evaluation: Text Retrieval Experiments
  • Year:
  • 2009

Abstract

Allomorfessor extends Morfessor, an unsupervised morpheme segmentation method, to account for allomorphy, the linguistic phenomenon in which one morpheme has several different surface forms. The method discovers common base forms for allomorphs from an unannotated corpus by finding small modifications, called mutations, that map a base form to its surface variants. Using maximum a posteriori (MAP) estimation, the model decides the number and types of mutations needed for a particular language. In the Morpho Challenge 2009 evaluations, the effect of the mutations was found to be rather small. However, Allomorfessor performed well overall, achieving the best results for English in the linguistic evaluation and placing in the top three in the application evaluations for all languages.
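
The idea of a mutation can be illustrated with a minimal sketch. The abstract does not specify how Allomorfessor represents mutations, so everything below is an assumption made for illustration: the `Mutation` type, the `apply_mutation` and `surface_form` helpers, and the choice to model a mutation as "strip some characters from the end of the base form and append others" are hypothetical and are not taken from the paper or from any Morfessor code.

```python
from typing import NamedTuple


class Mutation(NamedTuple):
    """Hypothetical toy mutation: edit the end of a base form."""
    strip: str  # characters removed from the end of the base form
    add: str    # characters appended in their place


def apply_mutation(base: str, mutation: Mutation) -> str:
    """Return the mutated stem, or raise if the mutation does not apply."""
    if not base.endswith(mutation.strip):
        raise ValueError(f"{mutation!r} does not apply to {base!r}")
    # Slice by explicit length so that an empty 'strip' keeps the whole base.
    stem = base[: len(base) - len(mutation.strip)]
    return stem + mutation.add


def surface_form(base: str, mutation: Mutation, suffix: str) -> str:
    """Build a surface word from a base form, a mutation, and a suffix."""
    return apply_mutation(base, mutation) + suffix


# Illustrative mutations (assumed, not learned from data).
EMPTY = Mutation("", "")     # no change to the base form
Y_TO_I = Mutation("y", "i")  # pretty -> pretti
DROP_E = Mutation("e", "")   # bake -> bak

examples = [
    surface_form("pretty", Y_TO_I, "er"),
    surface_form("bake", DROP_E, "ing"),
    surface_form("walk", EMPTY, "ed"),
]
print(examples)
```

In this toy view, "pretty" and "pretti" are treated as allomorphs that share the base form "pretty". The sketch only applies hand-written mutations; it does not attempt the MAP estimation that, according to the abstract, lets the model choose which mutations a language needs.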