Dependency treelet translation: syntactically informed phrasal SMT

  • Authors:
  • Chris Quirk;Arul Menezes;Colin Cherry

  • Affiliations:
  • Microsoft Research, Redmond, WA;Microsoft Research, Redmond, WA;University of Alberta, Edmonton, Alberta, Canada

  • Venue:
  • ACL '05 Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
  • Year:
  • 2005

Abstract

We describe a novel approach to statistical machine translation that combines syntactic information in the source language with recent advances in phrasal translation. This method requires a source-language dependency parser, target-language word segmentation, and an unsupervised word alignment component. We align a parallel corpus, project the source dependency parse onto the target sentence, extract dependency treelet translation pairs, and train a tree-based ordering model. We describe an efficient decoder and show that using these tree-based models in combination with conventional SMT models provides a promising approach that combines the power of phrasal SMT with the linguistic generality available in a parser.
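
To illustrate the projection step named in the abstract, here is a minimal sketch in Python. It is not the authors' algorithm: it assumes a strictly one-to-one word alignment, leaves unaligned target words unattached, and skips over unaligned source ancestors. The function name `project_dependencies` and its data layout (head-index lists, an alignment dictionary) are hypothetical choices made for this example only.

```python
from typing import Dict, List, Optional

ROOT = -1  # head index used for the root word (an assumption of this sketch)


def project_dependencies(
    src_heads: List[int],
    align: Dict[int, int],
    tgt_len: int,
) -> List[Optional[int]]:
    """Project source dependency heads onto target words.

    src_heads[i] is the head index of source word i, or ROOT.
    align maps a source index to its single aligned target index
    (one-to-one alignment is assumed here for simplicity).
    Returns one head per target word; None marks an unaligned target word.
    """
    tgt_heads: List[Optional[int]] = [None] * tgt_len
    for s, t in align.items():
        h = src_heads[s]
        # Climb past source ancestors that have no aligned target word.
        while h != ROOT and h not in align:
            h = src_heads[h]
        tgt_heads[t] = ROOT if h == ROOT else align[h]
    return tgt_heads


# English "the red car" -> French "la voiture rouge"
# Source heads: the -> car, red -> car, car -> ROOT
src_heads = [2, 2, ROOT]
align = {0: 0, 1: 2, 2: 1}  # the-la, red-rouge, car-voiture
print(project_dependencies(src_heads, align, 3))
```

In this toy case both "la" and "rouge" end up attached to "voiture", mirroring the source tree despite the reordering. One-to-many and many-to-one alignments, as well as attachment of unaligned target words, would need additional rules that this sketch does not attempt.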