Distortion models for statistical machine translation

  • Authors:
  • Yaser Al-Onaizan;Kishore Papineni

  • Affiliations:
  • IBM T.J. Watson Research Center, Yorktown Heights, NY;IBM T.J. Watson Research Center, Yorktown Heights, NY

  • Venue:
  • ACL-44 Proceedings of the 21st International Conference on Computational Linguistics and the 44th annual meeting of the Association for Computational Linguistics
  • Year:
  • 2006

Quantified Score

Hi-index 0.00

Visualization

Abstract

In this paper, we argue that n-gram language models are not sufficient to address word reordering required for Machine Translation. We propose a new distortion model that can be used with existing phrase-based SMT decoders to address those n-gram language model limitations. We present empirical results in Arabic to English Machine Translation that show statistically significant improvements when our proposed model is used. We also propose a novel metric to measure word order similarity (or difference) between any pair of languages based on word alignments.