Minimally supervised morphological analysis by multimodal alignment

  • Authors:
  • David Yarowsky;Richard Wicentowski

  • Affiliations:
  • Johns Hopkins University, Baltimore, MD;Johns Hopkins University, Baltimore, MD

  • Venue:
  • ACL '00 Proceedings of the 38th Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2000

Quantified Score

Hi-index 0.00

Visualization

Abstract

This paper presents a corpus-based algorithm capable of inducing inflectional morphological analyses of both regular and highly irregular forms (such as brought→bring) from distributional patterns in large monolingual text with no direct supervision. The algorithm combines four original alignment models based on relative corpus frequency, contextual similarity, weighted string similarity and incrementally retrained inflectional transduction probabilities. Starting with no paired examples for training and no prior seeding of legal morphological transformations, accuracy of the induced analyses of 3888 past-tense test cases in English exceeds 99.2% for the set, with currently over 80% accuracy on the most highly irregular forms and 99.7% accuracy on forms exhibiting non-concatenative suffixation.