Syntax based reordering with automatically derived rules for improved statistical machine translation

  • Authors:
  • Karthik Visweswariah;Jiri Navratil;Jeffrey Sorensen;Vijil Chenthamarakshan;Nanda Kambhatla

  • Affiliations:
  • IBM Research;IBM Research;Google, Inc.;IBM Research;IBM Research

  • Venue:
  • COLING '10 Proceedings of the 23rd International Conference on Computational Linguistics
  • Year:
  • 2010

Abstract

Syntax-based reordering has been shown to be an effective way of handling word order differences between source and target languages in Statistical Machine Translation (SMT) systems. We present a simple, automatic method to learn rules that reorder source sentences to more closely match the target language word order, using only a source-side parse tree and automatically generated alignments. The resulting rules are applied to source language inputs as a pre-processing step. They demonstrate significant improvements in SMT systems across a variety of language pairs, including English to Hindi, English to Spanish, and English to French, as measured on a variety of internal test sets as well as a public test set.
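
To make the general idea concrete, here is a minimal sketch of learning reordering rules from a source-side parse tree plus word alignments and then applying them as a pre-processing step. It is not the authors' algorithm: the abstract does not say how rules are extracted or scored. The `Node` class, the function names, the toy alignment, and the heuristic used here (order each node's children by the mean target position of the words they cover, then keep the most frequent ordering per production) are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical parse-tree node; `word` is set only on leaves."""
    label: str
    children: List["Node"] = field(default_factory=list)
    word: Optional[str] = None


def leaves(node: Node) -> List[Node]:
    """Return the leaf nodes under `node` in left-to-right order."""
    if node.word is not None:
        return [node]
    return [leaf for child in node.children for leaf in leaves(child)]


def learn_rules(examples):
    """Learn one child permutation per production (assumed heuristic).

    `examples` is a list of (tree, alignment) pairs, where `alignment`
    maps a source word index to a list of target word indices.
    """
    counts = defaultdict(Counter)
    for tree, alignment in examples:
        index = {id(leaf): i for i, leaf in enumerate(leaves(tree))}

        def visit(node: Node) -> None:
            if node.word is not None:
                return
            keys = []
            for child in node.children:
                tgt = [t for leaf in leaves(child)
                       for t in alignment.get(index[id(leaf)], [])]
                keys.append(sum(tgt) / len(tgt) if tgt else None)
            # Only count nodes whose children are all aligned.
            if len(keys) > 1 and all(k is not None for k in keys):
                order = tuple(sorted(range(len(keys)), key=lambda i: keys[i]))
                lhs = (node.label, tuple(c.label for c in node.children))
                counts[lhs][order] += 1
            for child in node.children:
                visit(child)

        visit(tree)
    # Keep the most frequently observed permutation for each production.
    return {lhs: c.most_common(1)[0][0] for lhs, c in counts.items()}


def reorder(node: Node, rules) -> Node:
    """Apply learned permutations top-down; unseen productions keep source order."""
    if node.word is not None:
        return node
    kids = [reorder(c, rules) for c in node.children]
    lhs = (node.label, tuple(c.label for c in node.children))
    order = rules.get(lhs, tuple(range(len(kids))))
    return Node(node.label, [kids[i] for i in order])


def sentence(subj: str, verb: str, obj: str) -> Node:
    """Build a toy S -> NP VP tree for a three-word English sentence."""
    return Node("S", [
        Node("NP", [Node("NNP", word=subj)]),
        Node("VP", [Node("VBZ", word=verb),
                    Node("NP", [Node("NNS", word=obj)])]),
    ])


# Toy training pair: "John eats apples" aligned to a verb-final target
# with word order "John apples eats AUX" (target indices 0..3).
train = [(sentence("John", "eats", "apples"), {0: [0], 1: [2, 3], 2: [1]})]
rules = learn_rules(train)

reordered = reorder(sentence("Mary", "reads", "books"), rules)
reordered_words = [leaf.word for leaf in leaves(reordered)]
print(" ".join(reordered_words))  # prints "Mary books reads"
```

In this toy setup the only non-identity rule learned is for the `VP -> VBZ NP` production, which moves the verb after its object. A real system would need handling for unaligned words, sparse productions, and conflicting evidence, none of which this sketch attempts.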