Unsupervised induction of natural language morphology inflection classes

  • Authors:
  • Christian Monson;Alon Lavie;Jaime Carbonell;Lori Levin

  • Affiliations:
  • Carnegie Mellon University, Pittsburgh;Carnegie Mellon University, Pittsburgh;Carnegie Mellon University, Pittsburgh;Carnegie Mellon University, Pittsburgh

  • Venue:
  • SIGMorPhon '04 Proceedings of the 7th Meeting of the ACL Special Interest Group in Computational Phonology: Current Themes in Computational Phonology and Morphology
  • Year:
  • 2004

Abstract

We propose a novel language-independent framework for inducing a collection of morphological inflection classes from a monolingual corpus of full-form words. Our approach involves two main stages. In the first stage, we generate a large data structure of candidate inflection classes and their interrelationships. In the second stage, search and filtering techniques are applied to this data structure to identify a select collection of "true" inflection classes of the language. We describe the basic methodology involved in both stages of our approach and present an evaluation of our baseline techniques applied to the induction of major inflection classes of Spanish. The preliminary results on an initial training corpus already surpass an F1 of 0.5 against ideal Spanish inflectional morphology classes.
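
The two-stage idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify the data structure or the filtering criteria, so everything here is an assumption made for illustration. In particular, the sketch assumes a candidate inflection class is a set of suffixes together with the stems that occur with every one of them, and that filtering simply keeps classes supported by enough stems. The names `candidate_classes`, `filter_classes`, `max_suffix_len`, and `min_stems` are hypothetical.

```python
from collections import defaultdict


def candidate_classes(words, max_suffix_len=4):
    """Stage 1 (illustrative assumption): split every word at each position
    within max_suffix_len of its end, then group stems by the exact set of
    suffixes they were observed with."""
    stem_suffixes = defaultdict(set)
    for word in set(words):
        first_split = max(1, len(word) - max_suffix_len)
        for i in range(first_split, len(word) + 1):
            stem, suffix = word[:i], word[i:]  # suffix may be empty
            stem_suffixes[stem].add(suffix)

    classes = defaultdict(set)
    for stem, suffixes in stem_suffixes.items():
        if len(suffixes) > 1:  # a single suffix says nothing about inflection
            classes[frozenset(suffixes)].add(stem)
    return classes


def filter_classes(classes, min_stems=2):
    """Stage 2 (illustrative assumption): keep only suffix sets that are
    shared by at least min_stems distinct stems."""
    return {
        suffixes: stems
        for suffixes, stems in classes.items()
        if len(stems) >= min_stems
    }


if __name__ == "__main__":
    corpus = ["hablo", "hablas", "habla", "canto", "cantas", "canta"]
    kept = filter_classes(candidate_classes(corpus))
    for suffixes, stems in kept.items():
        print(sorted(suffixes), sorted(stems))
```

On this toy Spanish corpus the suffix set {a, as, o} survives filtering because both "habl" and "cant" occur with it, while accidental splits such as {la, las, lo} are supported by a single stem and are dropped. A realistic system would need a much richer search over the candidate structure than this stem-count threshold.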