Knowledge-free induction of morphology using latent semantic analysis

  • Authors:
  • Patrick Schone; Daniel Jurafsky

  • Affiliations:
  • University of Colorado, Boulder, Colorado; University of Colorado, Boulder, Colorado

  • Venue:
  • CoNLL '00 Proceedings of the 2nd Workshop on Learning Language in Logic and the 4th Conference on Computational Natural Language Learning - Volume 7
  • Year:
  • 2000

Quantified Score

Hi-index 0.03

Abstract

Morphology induction is a subproblem of important tasks like automatic learning of machine-readable dictionaries and grammar induction. Previous morphology induction approaches have relied solely on statistics of hypothesized stems and affixes to choose which affixes to consider legitimate. Relying on stem-and-affix statistics rather than semantic knowledge leads to a number of problems, such as the inappropriate use of valid affixes ("ally" stemming to "all"). We introduce a semantic-based algorithm for learning morphology which only proposes affixes when the stem and stem-plus-affix are sufficiently similar semantically. We implement our approach using Latent Semantic Analysis and show that our semantics-only approach provides morphology induction results that rival a current state-of-the-art system.
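
The core idea in the abstract, proposing an affix only when the stem and the stem-plus-affix are semantically similar, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy contexts, the choice of k = 3 latent dimensions, the plain cosine score, the 0.5 threshold, and the function names (build_lsa_vectors, cosine, propose_affix) are all assumptions made for illustration. The sketch builds a word-by-context count matrix, reduces it with a truncated SVD (the basic step of Latent Semantic Analysis), and compares word vectors in the reduced space.

```python
import numpy as np

# Toy contexts (an assumption for illustration, not data from the paper).
CONTEXTS = [
    "car road engine",
    "cars road engine",
    "car cars engine wheel",
    "all people everyone",
    "all everyone together",
    "ally friend war",
    "ally war treaty",
    "ally war",
]


def build_lsa_vectors(contexts, k):
    """Map each word to a k-dimensional vector via truncated SVD of word-by-context counts."""
    vocab = sorted({w for c in contexts for w in c.split()})
    index = {w: i for i, w in enumerate(vocab)}
    counts = np.zeros((len(vocab), len(contexts)))
    for j, context in enumerate(contexts):
        for w in context.split():
            counts[index[w], j] += 1.0
    # Singular values come back in descending order; keep the top k dimensions.
    u, s, _vt = np.linalg.svd(counts, full_matrices=False)
    reduced = u[:, :k] * s[:k]
    return {w: reduced[index[w]] for w in vocab}


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / norm) if norm > 0 else 0.0


def propose_affix(stem, word, vectors, threshold=0.5):
    """Accept word = stem + suffix only if the two words are semantically close.

    The threshold is a hypothetical value chosen for this toy corpus.
    """
    if not word.startswith(stem) or word == stem:
        return False
    return bool(cosine(vectors[stem], vectors[word]) >= threshold)


VECTORS = build_lsa_vectors(CONTEXTS, k=3)

for stem, word in [("car", "cars"), ("all", "ally")]:
    score = cosine(VECTORS[stem], VECTORS[word])
    print(stem, word, round(score, 3), propose_affix(stem, word, VECTORS))
```

On this toy corpus, "car" and "cars" share contexts and the pair is accepted, while "all" and "ally" occur in unrelated contexts and the pair is rejected, even though "ally" looks like "all" plus a plausible suffix. A purely stem-and-affix frequency count would have no way to tell the two cases apart.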