Cross-lingual propagation for morphological analysis

  • Authors:
  • Benjamin Snyder; Regina Barzilay

  • Affiliations:
  • Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology; Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology

  • Venue:
  • AAAI'08 Proceedings of the 23rd national conference on Artificial intelligence - Volume 2
  • Year:
  • 2008

Abstract

Multilingual parallel text corpora provide a powerful means for propagating linguistic knowledge across languages. We present a model that jointly learns linguistic structure for each language while inducing links between them. Our model supports fully symmetrical knowledge transfer, utilizing any combination of supervised and unsupervised data across language barriers. The proposed non-parametric Bayesian model effectively combines cross-lingual alignment with target language predictions. This architecture is a potent alternative to projection methods, which decompose these decisions into two separate stages. We apply this approach to the task of morphological segmentation, where the goal is to separate a word into its individual morphemes. When tested on a parallel corpus of Hebrew and Arabic, our joint bilingual model effectively incorporates all available evidence from both languages, yielding significant performance gains.
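
To make the segmentation task concrete, the sketch below enumerates every way of splitting a word into contiguous pieces. It is only an illustration of the hypothesis space a segmenter chooses from, not the model described in the abstract; the function name `segmentations` and the English example word are assumptions introduced here, whereas the paper's experiments use Hebrew and Arabic.

```python
from itertools import product


def segmentations(word):
    """Yield every split of `word` into contiguous, non-empty pieces.

    Hypothetical helper for illustration only; it is not taken from the paper.
    """
    n = len(word)
    if n == 0:
        return
    # Each of the n - 1 gaps between adjacent characters either is or is not
    # a morpheme boundary, so a word of length n has 2 ** (n - 1) candidates.
    for mask in product([False, True], repeat=n - 1):
        pieces, start = [], 0
        for i, is_boundary in enumerate(mask, start=1):
            if is_boundary:
                pieces.append(word[start:i])
                start = i
        pieces.append(word[start:])
        yield pieces


candidates = list(segmentations("walked"))
print(len(candidates))               # number of candidate segmentations
print(["walk", "ed"] in candidates)  # the intended analysis is among them
```

The number of candidates doubles with each added character, so evidence beyond the word itself is needed to pick the right split. The abstract proposes drawing that evidence from the aligned word in the other language.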