Nonparametric word segmentation for machine translation

  • Authors:
  • ThuyLinh Nguyen;Stephan Vogel;Noah A. Smith

  • Affiliations:
  • Carnegie Mellon University;Carnegie Mellon University;Carnegie Mellon University

  • Venue:
  • COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
  • Year:
  • 2010

Quantified Score

Hi-index 0.00

Visualization

Abstract

We present an unsupervised word segmentation model for machine translation. The model uses existing monolingual segmentation techniques and models the joint distribution over source sentence segmentations and alignments to the target sentence. During inference, the monolingual segmentation model and the bilingual word alignment model are coupled so that the alignments to the target sentence guide the segmentation of the source sentence. The experiments show improvements on Arabic-English and Chinese-English translation tasks.